CN112818840A - Unmanned aerial vehicle online detection system and method - Google Patents
- Publication number
- CN112818840A (application number CN202110127611.2A)
- Authority
- CN
- China
- Prior art keywords: module, convolution, feature, network, aerial vehicle
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/13: Scenes; terrestrial scenes; satellite images
- G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/04: Neural networks; architecture, e.g. interconnection topology
- G06N3/084: Learning methods; backpropagation, e.g. using gradient descent
- G06V2201/07: Indexing scheme relating to image or video recognition or understanding; target detection
Abstract
The invention provides an unmanned aerial vehicle online detection system and method, which comprise the following steps: preprocessing an input image through data random rotation enhancement and construction of a sub-convolution network, replacing a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of extracted feature information while reducing the number of parameters; improving the main network of the Fast R-CNN module by constructing a transversely connected dual-channel network with an added attention mechanism as that main network, combining the first feature with the second feature, and improving the detection and identification performance on remote sensing images without increasing the calculation amount of the original model; the first feature is a high-level feature with a first resolution and a first amount of feature semantic information; the second feature is a bottom-level feature with a second resolution and a second amount of feature semantic information; detecting the rotated target image by using a region converter; screening candidate frames by adopting a suppression operator; and preventing the region mismatching caused by the RoI Pooling quantization operation by using a bilinear interpolation method through a RoI Align module.
Description
Technical Field
The invention relates to the technical field of remote sensing information, in particular to an unmanned aerial vehicle online detection system and method.
Background
Online target detection and identification in remote sensing images is one of the difficult problems in the field of computer vision, and the characteristics of remote sensing targets make their detection and identification difficult, for the following reasons: first, because the data volume of remote sensing images is huge, detecting remote sensing targets quickly and accurately is a difficult problem; second, remote sensing targets are often shielded by cloud layers; third, the images have low resolution and are blurred. A remote sensing target reflects little information in the image and has weak feature expression capability, so fewer features can be extracted during feature extraction, resulting in poor detection and identification and low accuracy.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle online detection system and method, and aims to solve the problems that the existing remote sensing image online target detection and identification is low in accuracy and poor in effect.
In order to solve the technical problem, the invention provides an online detection method for an unmanned aerial vehicle, which comprises the following steps:
preprocessing an input image by data random rotation enhancement and construction of a sub-convolution network, and replacing a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of extracted feature information while reducing the number of parameters;
the method comprises the following steps of improving a main network of a Fast R-CNN module, and constructing a transverse connection dual-channel network added with an attention mechanism as the main network, wherein the method comprises the following steps:
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
detecting the rotated target image by using a region converter;
screening candidate frames by adopting a suppression operator;
preventing the region mismatching caused by the RoI Pooling quantization operation by using a bilinear interpolation method through a RoI Align module;
and finishing the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to detect and identify the target of the acquired image.
Optionally, in the online detection method of the unmanned aerial vehicle, further comprising,
acquiring a remote sensing image, and performing data enhancement on the remote sensing image;
inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram;
inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the feature map obtained by convolution into an RPN module and a Fast R-CNN module;
generating frames in the RPN module, converting the generated horizontal frames into directional rotating frames through the region converter, and further screening the generated frames through the suppression operator;
the selected region after screening is subjected to RoI Align processing in a Fast R-CNN module, and then classification and frame regression are carried out through the output of a full connection layer;
respectively carrying out reverse gradient propagation on the RPN module and the Fast R-CNN module according to the loss function result in the RPN module and the Fast R-CNN module, and adjusting the weight of network parameters in the training process;
and repeating the steps, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
Optionally, in the online detection method of an unmanned aerial vehicle, the data random rotation enhancement includes:
performing random rotation enhancement on the categories of which the number of samples is less than the sample number threshold, wherein the rotation angles comprise 90 degrees, 180 degrees and 270 degrees;
a series of small blocks of 1024 x 1024 pixel size are cropped from the input image for training.
Optionally, in the online detection method of the unmanned aerial vehicle, constructing a sub-convolution network to pre-process the input image includes:
extracting image characteristic information in a convolution mode to replace a large convolution kernel with a plurality of small convolution kernels;
the sub-convolution network comprises three convolutional layers with a convolution kernel size of 3 x 3, where the step size of the first two convolutional layers is 2 and the step size of the last convolutional layer is 1.
Optionally, in the online detection method of an unmanned aerial vehicle, the improving the backbone network includes:
constructing a dual-channel network:
the first path passes through an average pooling layer with convolution kernel size of 2 x 2 and step size of 2 and 1 convolution layer with convolution kernel size of 1 x 1, so as to avoid information loss;
the second path passes through a convolution layer with convolution kernel size of 1 x 1, then passes through 1 convolution layer with convolution kernel size of 3 x 3 and step size of 2, and finally passes through 1 convolution layer with convolution kernel size of 1 x 1;
and adding the results of the first path and the second path to obtain a convolution characteristic diagram F, and sending the convolution characteristic diagram F to the attention mechanism module.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: the attention mechanism module calculates the attention map from two dimensions, space and channel;
the channel attention module performs pooling operations on the input convolution feature map F through a maximum pooling layer and an average pooling layer respectively, inputs the results into the multi-layer perceptron MLP, sums the two features output by the multi-layer perceptron MLP element-wise, and then generates a channel attention feature map C through sigmoid activation;
multiplying the channel attention feature map C and the input convolution feature map F element by element to obtain a product S1, and inputting the product S1 into a space attention module;
performing pooling operation on the maximum pooling layer and the average pooling layer, splicing the two results based on channels, performing dimensionality reduction processing through convolution operation, activating through a sigmoid activation function to generate an activation feature map S2, and multiplying the product S1 and the activation feature map S2 by elements to generate a spatial attention feature map S.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: constructing a transverse connection network, inputting the obtained attention feature map S into the dual-channel network to obtain a feature map F', and obtaining a feature map F* by convolving the feature map F' with a convolution kernel of size 1 x 1; performing top-down up-sampling on the feature map F* and transversely connecting it with the feature map obtained by the previous dual-channel network convolution, so as to combine high-level features with low resolution and rich feature semantic information with bottom-level features with high resolution and less feature semantic information, and improve the performance of detecting and identifying the remote sensing image.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes:
introducing a regional converter module into the RPN module to extract a target in a remote sensing image with any direction;
the area converter module comprises a learning module and a deformation module;
the learning module comprises a PS RoI Align layer, a full connection layer and a decoder;
the full connection layer outputs the offset (t*) of the rotated ground truth relative to the horizontal candidate identification region; the decoder takes the horizontal candidate identification region and the offset (t*) as input, matches the input horizontal candidate identification region with the rotated ground truth (x*, y*, w*, h*, θ*), and outputs the decoded rotated candidate identification region (x, y, w, h, θ) through the following formula; then the convolution feature map F and the rotated candidate identification region are transmitted as input to the deformation module for feature extraction;
optionally, in the online detection method for the unmanned aerial vehicle, a suppression operator is added to the RPN module, and the design flow of the suppression operator is as follows:
the set B is all candidate frames, S is the score values of all candidate frames, N is a set threshold value, and D is an empty set for storing the screened candidate frames;
step 1, judging whether the set B is empty: if so, executing step 5; if not, executing step 2;
step 2, sorting according to the scores of the candidate frames, adding the candidate frame with the highest score into the set D, recording it as M, and removing M from the set B;
step 3, traversing all the candidate frames Bi in the set B, calculating the attenuation values of Bi and M through the attenuation function F, and taking the obtained attenuation values as the new score values Si of the candidate frames Bi;
step 4, repeating step 1;
step 5, returning the candidate frame set D and the candidate frame scores S;
step 6, removing the candidate frames with scores smaller than N from the sets D and S;
the attenuation function F is formulated as:
wherein Ac is the minimum area of the two frames Ka and Kb, and U is the union of the two frames Ka and Kb.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: modifying the RoI Pooling of the Fast R-CNN module into RoI Align, and solving the problem of region mismatching caused by the quantization operation of RoI Pooling in the original Fast R-CNN network by using a bilinear interpolation method.
Optionally, in the online detection method for the unmanned aerial vehicle, the method further includes applying an improved neural network to the unmanned aerial vehicle;
the unmanned aerial vehicle comprises an unmanned aerial vehicle body, a processing device, a storage device and a small-pixel camera, wherein the pixel size of the small-pixel camera is 1 um to 2 um;
shooting in real time by using a small pixel camera to obtain a high-resolution remote sensing image;
and the processing device calls a neural network which is stored in the storage device and trained by adopting the improved method to process the high-resolution remote sensing image shot by the small-pixel camera so as to realize online detection.
The invention also provides an online detection system for the unmanned aerial vehicle, which comprises:
the preprocessing module is configured to preprocess the input image through data random rotation enhancement and construction of a sub-convolution network, and to replace a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of extracted feature information while reducing the number of parameters;
the main network improvement module is configured to improve the main network of the Fast R-CNN module and construct a transverse connection dual-channel network added with an attention mechanism as the main network;
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
an RPN module configured to detect a rotated target image using a region converter;
screening candidate frames by adopting a suppression operator;
the classification and frame regression improvement module is configured to improve the classification and frame regression of the Fast R-CNN module, preventing the region mismatching caused by the RoI Pooling quantization operation by using a bilinear interpolation method through the RoI Align module;
and the target detection and identification module is configured to complete the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to perform target detection and identification on the acquired image.
The inventor of the present invention has found through research that early conventional methods for target detection in remote sensing images follow a two-stage detection paradigm: 1) candidate extraction; 2) target verification. In the candidate extraction stage, common methods include gray-value-filtering-based methods, wavelet-transform-based methods, anomaly-detection-based methods, visual-saliency-based methods and the like. In the target verification stage, common methods include HOG, LBP, SIFT and the like. With the development of computer vision technology, deep neural network technology has also been applied to remote sensing target detection. In 2016, Ren et al. proposed the Faster R-CNN target detection algorithm (Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.), which uses a convolutional neural network for candidate region generation and greatly improves computational efficiency. The Faster R-CNN network can be regarded as an RPN module and a Fast R-CNN module, wherein the RPN module is responsible for generating target candidate regions, and the Fast R-CNN module is responsible for learning the features of the candidate regions, classifying them and regressing their frames. However, the Faster R-CNN network has some problems: it mainly targets conventional objects, and when detecting remote sensing targets, the pooling operations in the feature extraction network compress the feature map to a certain extent, so that some information in the feature map is filtered out. Also, unlike conventional images taken from a horizontal angle, remote sensing images are typically taken from a bird's-eye perspective, so targets in a remote sensing image may have arbitrary orientations.
In addition, the complex background and the changed appearance of the target further increase the difficulty of target detection in the remote sensing image, so that the Faster R-CNN network has poor effect in detecting and identifying the remote sensing target.
Based on the above insights, the invention provides an unmanned aerial vehicle online detection system and method, which combine unmanned aerial vehicle remote sensing and deep neural network technologies to perform target detection and identification on the acquired image. The method comprises the following steps: randomly rotating and enhancing the data, constructing a sub-convolution network to preprocess the input image, and using a plurality of small convolution kernels to replace a large convolution kernel, so that the extracted feature information is not lost while the number of parameters is reduced; improving the backbone network by constructing a transversely connected dual-channel network with an added attention mechanism, combining high-level features with low resolution and rich feature semantic information with bottom-level features with high resolution and less feature semantic information, and improving the performance of detecting and identifying the remote sensing image essentially without increasing the calculation amount of the original model; introducing a region converter, so that a rotated target image can be better detected; proposing a suppression operator to better screen the candidate frames; and introducing RoI Align, solving the problem of region mismatching caused by the quantization operation of RoI Pooling by using a bilinear interpolation method.
Drawings
Fig. 1 is a schematic diagram of an online detection method for an unmanned aerial vehicle according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of computing the attention map from the spatial and channel dimensions according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a transverse-connection dual-path network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of the region converter module according to an embodiment of the invention.
Detailed Description
The unmanned aerial vehicle online detection system and method provided by the invention are further described in detail below with reference to the accompanying drawings and specific embodiments. Advantages and features of the present invention will become apparent from the following description and claims. It should be noted that the drawings are in a very simplified form and not to precise scale, and are merely intended to facilitate a convenient and clear description of the embodiments of the present invention.
Furthermore, features from different embodiments of the invention may be combined with each other, unless otherwise indicated. For example, a feature of the second embodiment may be substituted for a corresponding or functionally equivalent or similar feature of the first embodiment, and the resulting embodiments are likewise within the scope of the disclosure or recitation of the present application.
The core idea of the invention is to provide an unmanned aerial vehicle online detection system and method, so as to solve the problems of low accuracy and poor effect of the existing remote sensing image online target detection and identification.
In order to realize the above idea, the present invention provides an online detection system and method for an unmanned aerial vehicle, as shown in fig. 1, including: preprocessing an input image through data random rotation enhancement and construction of a sub-convolution network, replacing a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of extracted feature information while reducing the number of parameters; improving the main network of the Fast R-CNN module, constructing a transversely connected dual-channel network with an added attention mechanism as that main network so as to combine the first feature with the second feature, and improving the detection and identification performance on remote sensing images without increasing the calculation amount of the original model; the first feature is a high-level feature with a first resolution and a first amount of feature semantic information; the second feature is a bottom-level feature with a second resolution and a second amount of feature semantic information; the first resolution is lower than the second resolution, and the first amount of feature semantic information is larger than the second; detecting the rotated target image by using a region converter; screening candidate frames by adopting a suppression operator; preventing the region mismatching caused by the RoI Pooling quantization operation by using a bilinear interpolation method through a RoI Align module; and completing the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to detect and identify targets in the acquired image.
And data enhancement, namely performing random rotation enhancement on the categories with small number of samples, and rotating the categories by 90 degrees, 180 degrees and 270 degrees respectively. A series of small blocks of 1024 x 1024 pixel size are then cropped from the image for training.
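The rotation and cropping steps above can be sketched in plain Python; the helper names are hypothetical, and images are modeled as nested lists rather than the tensors a real implementation would use.

```python
def rotate90(image):
    """Rotate a 2-D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def random_rotations(image):
    """Return the 90/180/270-degree rotated copies used to enhance
    under-represented categories."""
    r90 = rotate90(image)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    return [r90, r180, r270]

def crop_patches(image, size=1024, stride=1024):
    """Crop a series of size x size patches from the image for training."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            patches.append([row[left:left + size]
                            for row in image[top:top + size]])
    return patches
```

A non-overlapping stride equal to the patch size is assumed here; the text does not state whether the cropped blocks overlap.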
A sub-convolution network is constructed to preprocess an input image, image characteristic information is extracted in a convolution mode, a plurality of small convolution kernels are used for replacing large convolution kernels, and the extracted characteristic information is guaranteed not to be lost while parameters are reduced. The sub-convolutional network consists of three convolutional layers with convolutional kernel size of 3 x 3, the step size of the first two convolutional layers is 2, and the step size of the last convolutional layer is 1.
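As a quick check of the parameter-saving claim, the arithmetic below compares a stack of three 3 x 3 kernels against a single 7 x 7 kernel (the large-kernel size is an assumption, not stated in the text; at stride 1 the stack covers the same 7 x 7 receptive field) and traces the spatial size through the sub-convolution network's strides 2, 2, 1 with padding 1 assumed.

```python
def conv_out(size, kernel, stride, padding=1):
    # Spatial output size of a convolution layer (floor division).
    return (size + 2 * padding - kernel) // stride + 1

def weights_per_channel_pair(kernels):
    # Weights per (input-channel, output-channel) pair for stacked square kernels.
    return sum(k * k for k in kernels)

# Trace a 1024 x 1024 patch through the three 3 x 3 layers (strides 2, 2, 1).
size = 1024
for stride in (2, 2, 1):
    size = conv_out(size, kernel=3, stride=stride)

small_stack = weights_per_channel_pair([3, 3, 3])  # three small kernels: 27
single_large = weights_per_channel_pair([7])       # one large kernel: 49
```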
Then, the main network of the Fast R-CNN module is improved, and a transversely connected dual-channel network with an added attention mechanism (fig. 3) is constructed as the main network. First, skip connections through the dual paths deepen the network while alleviating the vanishing-gradient problem. The feature map is convolved through two paths, and the result values of the two convolution paths are added. Path A (the first path) passes through an average pooling layer with kernel size 2 x 2 and step size 2 and one convolution layer with kernel size 1 x 1; in this way information loss is avoided. Path B (the second path) passes through one convolution layer with kernel size 1 x 1, then one convolution layer with kernel size 3 x 3 and step size 2, and finally one convolution layer with kernel size 1 x 1. The two results are added, and the obtained feature map is sent to the attention mechanism module.
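The shape bookkeeping of the two paths can be checked with simple arithmetic; this sketch (hypothetical side length n = 256) verifies that Path A and Path B produce feature maps of the same spatial size, so the element-wise addition is valid.

```python
def conv_out(size, kernel, stride, padding):
    # Spatial output size of a convolution layer (floor division).
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # Spatial output size of a pooling layer (no padding).
    return (size - kernel) // stride + 1

n = 256  # hypothetical input feature-map side length

# Path A: 2 x 2 average pooling with step 2, then a 1 x 1 convolution.
a = conv_out(pool_out(n), kernel=1, stride=1, padding=0)

# Path B: 1 x 1 conv, then 3 x 3 conv with step 2 (padding 1), then 1 x 1 conv.
b = conv_out(n, kernel=1, stride=1, padding=0)
b = conv_out(b, kernel=3, stride=2, padding=1)
b = conv_out(b, kernel=1, stride=1, padding=0)

# Both paths halve the spatial size, so their outputs can be added element-wise.
```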
The attention mechanism module calculates the attention map from two dimensions, space and channel (fig. 2).
The channel attention module performs pooling operations on the input feature map F through a maximum pooling layer and an average pooling layer respectively, inputs the results into the multi-layer perceptron MLP, sums the two features output by the MLP element-wise, and generates the channel attention feature map C through sigmoid activation.
Multiplying C and the input feature map F element by element to obtain S1, and inputting S1 into a spatial attention module. Performing pooling operation on the maximum pooling layer and the average pooling layer, splicing the two results based on channels, performing dimensionality reduction processing through convolution operation, activating through a sigmoid activation function to generate a feature map S2, and multiplying S1 and S2 by elements to generate a spatial attention feature map S.
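The channel and spatial attention steps above can be sketched in plain Python on small nested-list tensors; the MLP weights w1 and w2 and the equal-weight spatial reduction stand in for the learned layers and are assumptions of this sketch.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(F, w1, w2):
    # F is a C x H x W nested list; w1, w2 are the (hypothetical) MLP weights.
    C = len(F)
    max_desc = [max(max(row) for row in ch) for ch in F]            # max pooling
    avg_desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in F]                                        # avg pooling
    def mlp(v):
        hidden = [sum(v[i] * w1[i][j] for i in range(C))
                  for j in range(len(w1[0]))]
        return [sum(hidden[j] * w2[j][k] for j in range(len(hidden)))
                for k in range(C)]
    m, a = mlp(max_desc), mlp(avg_desc)
    # element-wise sum of the two MLP outputs, then sigmoid
    return [sigmoid(m[k] + a[k]) for k in range(C)]

def spatial_attention(S1):
    # Channel-wise max and mean at each position; the dimension-reducing
    # convolution is replaced by an equal-weight sum (an assumption).
    H, W = len(S1[0]), len(S1[0][0])
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            col = [ch[i][j] for ch in S1]
            out[i][j] = sigmoid(0.5 * max(col) + 0.5 * sum(col) / len(col))
    return out
```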
And constructing a transverse connection network, and combining high-level features with low resolution and rich feature semantic information with low-level features with high resolution and less feature semantic information by performing top-down sampling on the feature graph generated by convolution and performing transverse connection on the feature graph generated by convolution, so that the performance of detecting and identifying the remote sensing image is improved.
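The top-down and transverse combination can be sketched as follows; nearest-neighbour 2x up-sampling is an illustrative assumption, since the text does not specify the interpolation used in the top-down path.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x up-sampling of a 2-D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def lateral_merge(top, lateral):
    """Element-wise sum of the up-sampled top-level map and the laterally
    connected bottom-level map (assumed already passed through its 1x1 conv)."""
    up = upsample2x(top)
    return [[up[i][j] + lateral[i][j] for j in range(len(lateral[0]))]
            for i in range(len(lateral))]
```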
A region converter module (fig. 4) is introduced into the RPN module to better extract targets in remote sensing images with arbitrary orientations; the region converter module is divided into a learning module and a deformation module. The learning module is composed of a PS RoI Align layer, a fully-connected layer and a decoder. The fully-connected layer outputs the offset (t*) of the rotated ground truth relative to the horizontal candidate identification region; the decoder takes the horizontal candidate identification region and the offset (t*) as input, matches the input horizontal candidate identification region with the rotated ground truth (x*, y*, w*, h*, θ*), and outputs the decoded rotated candidate identification region (x, y, w, h, θ) through the following formula; the feature map and the rotated candidate identification region are then transmitted as input to the deformation module for feature extraction.
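The decoding step can be sketched as follows. The patent's decoding formula survives only as an image in the source, so this sketch assumes the usual convention for decoding a rotated box from a horizontal region (anchor angle zero); it is an illustration, not the patent's exact formula.

```python
import math

def decode_rotated_box(hroi, offsets):
    """Decode a rotated box (x, y, w, h, theta) from a horizontal candidate
    identification region (xa, ya, wa, ha) and offsets t* = (tx, ty, tw, th, tt)."""
    xa, ya, wa, ha = hroi
    tx, ty, tw, th, tt = offsets
    x = xa + tx * wa            # shift the centre by size-normalised offsets
    y = ya + ty * ha
    w = wa * math.exp(tw)       # scale width and height in log space
    h = ha * math.exp(th)
    theta = tt                  # predicted rotation angle
    return (x, y, w, h, theta)
```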
Adding a suppression operator in the RPN module, wherein the operator design flow is as follows:
the set B is all candidate frames, S is the score value of all candidate frames, and N is a set threshold. D is an empty set for holding the filtered candidate boxes.
And step 1, judging whether the set B is empty, if so, executing step 5, and if not, executing step 2.
And 2, sorting according to the scores of the candidate frames, adding the candidate frame with the highest score into the set D, recording the candidate frame as M, and removing M from the set B.
And 3, traversing all the candidate frames Bi in the set B, calculating the attenuation values of Bi and M through an attenuation function F, and taking the obtained attenuation values as new score values Si of the candidate frames Bi.
And 4, repeating the step 1.
And 5, returning the candidate frame set D and the candidate frame score S.
And 6, removing the candidate frames with the scores smaller than N from the sets D and S.
The attenuation function F is formulated as:
where Ac is the minimum area of the two frames Ka and Kb, and U is the union of the two frames Ka and Kb.
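The screening flow of steps 1 through 6 can be sketched as follows. Since the exact attenuation function F is given as a formula image in the original publication, the sketch substitutes a Gaussian Soft-NMS-style decay based on IoU as an assumption; the box coordinates, scores, threshold and sigma are all illustrative.

```python
import math

def iou(a, b):
    # Axis-aligned boxes as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda k: (k[2] - k[0]) * (k[3] - k[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def soft_suppress(boxes, scores, thresh=0.35, sigma=0.5):
    B = list(range(len(boxes)))        # set B: indices of all candidates
    S = list(scores)                   # score values of all candidates
    D = []                             # set D: kept candidates
    while B:                           # step 1: loop until B is empty
        m = max(B, key=lambda i: S[i])  # step 2: highest-score box M
        D.append(m)
        B.remove(m)
        for i in B:                    # step 3: decay overlapping scores
            S[i] *= math.exp(-iou(boxes[m], boxes[i]) ** 2 / sigma)
    # step 6: drop kept boxes whose decayed score fell below threshold N
    return [(i, S[i]) for i in D if S[i] >= thresh]

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_suppress(boxes, scores)
print(kept)  # the heavily overlapping second box is suppressed
```

Unlike hard NMS, the overlapping box is not discarded immediately; its score is decayed and only removed if it ends below the threshold N.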
The RoI Pooling of the Fast R-CNN module is modified into RoI Align, which uses a bilinear interpolation method to solve the problem of region mismatching caused by the quantization operation of RoI Pooling in the original Fast R-CNN network.
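The bilinear interpolation at the heart of RoI Align can be sketched as follows; the feature map and sampling coordinates are illustrative. The point is that the continuous coordinates are never rounded, which is what removes the region mismatch introduced by RoI Pooling's quantization.

```python
import numpy as np

def bilinear_sample(fm, y, x):
    # Sample feature map fm (H, W) at continuous coordinates (y, x)
    # via bilinear interpolation -- no quantization of the coordinates.
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, fm.shape[0] - 1)
    x1 = min(x0 + 1, fm.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (fm[y0, x0] * (1 - dy) * (1 - dx)
            + fm[y0, x1] * (1 - dy) * dx
            + fm[y1, x0] * dy * (1 - dx)
            + fm[y1, x1] * dy * dx)

fm = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear_sample(fm, 1.5, 1.5))  # 7.5, the mean of 5, 6, 9, 10
```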
The improved neural network is applied to an unmanned aerial vehicle. The unmanned aerial vehicle includes an unmanned aerial vehicle body, a processing device, a storage device and a small-pixel camera, where "small pixel" means a pixel size of 1 um to 2 um. The small-pixel camera shoots in real time to obtain high-resolution remote sensing images and sends them to the processing device; the processing device calls the neural network trained by the improved method and stored in the storage device to process the high-resolution remote sensing images shot by the small-pixel camera, realizing online detection.
The whole process is as follows:
the method comprises the following steps: and acquiring a remote sensing image, and performing data enhancement on the remote sensing image.
Step two: and inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram.
Step three: inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the obtained feature map into an RPN module and a Fast R-CNN module.
Step four: a frame is generated in the RPN module, the generated horizontal frame is converted into a directional rotating frame through a regional converter, and the generated frame is further screened through a suppression operator.
Step five: the selected region after screening is subjected to RoI Align processing in a Fast R-CNN module, and then classification and frame regression are carried out through the output of a full connection layer;
step six: and respectively carrying out reverse gradient propagation on the RPN module and the Fast R-CNN module according to the loss function result in the RPN module and the Fast R-CNN module, and adjusting the network parameter weight in the training process. And repeating the first step to the sixth step, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
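The reverse gradient propagation and weight adjustment of step six can be illustrated in miniature with a single linear unit and a squared-error loss, a toy stand-in for the actual RPN and Fast R-CNN loss functions (learning rate, data and iteration count are illustrative assumptions):

```python
import numpy as np

# Toy illustration: one linear unit, squared-error loss, repeated
# gradient steps until the parameter weight converges.
rng = np.random.default_rng(2)
w = rng.standard_normal(3)                 # network parameter weights
x = np.array([1.0, 2.0, 3.0])
target = 1.0
lr = 0.01                                  # learning rate
for _ in range(200):                       # iterate until convergence
    pred = w @ x
    grad = 2.0 * (pred - target) * x       # reverse gradient propagation
    w -= lr * grad                         # adjust the parameter weights
print(round(float(w @ x), 6))  # converges to the target 1.0
```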
In summary, the above embodiments describe in detail different configurations of the unmanned aerial vehicle online detection system and method; of course, the present invention includes, but is not limited to, the configurations listed in the above embodiments, and any content transformed based on the configurations provided in the above embodiments falls within the scope of the present invention. One skilled in the art can derive corresponding variations from the contents of the above embodiments.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art based on the above disclosure are within the scope of the appended claims.
Claims (12)
1. An unmanned aerial vehicle online detection method is characterized by comprising the following steps:
preprocessing an input image by data random rotation enhancement and construction of a sub-convolution network, and replacing a large convolution kernel by a plurality of small convolution kernels so as to prevent the loss of extracted characteristic information when the parameter quantity is reduced;
the method comprises the following steps of improving a main network of a Fast R-CNN module, and constructing a transverse connection dual-channel network added with an attention mechanism as the main network, wherein the method comprises the following steps:
combining a first feature and a second feature, improving remote sensing image detection and identification performance without increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first feature semantic information amount is larger than the second feature semantic information amount;
detecting the rotated target image by using an area converter;
screening candidate frames by adopting an inhibition operator;
a bilinear interpolation method is used by the RoI Align module to prevent region mismatching caused by the RoI Pooling quantization operation;
and finishing the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to detect and identify the target of the acquired image.
2. The unmanned aerial vehicle online detection method of claim 1, further comprising,
acquiring a remote sensing image, and performing data enhancement on the remote sensing image;
inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram;
inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the feature map obtained by convolution into an RPN module and a Fast R-CNN module;
generating a frame in an RPN module, converting the generated horizontal frame into a directional rotating frame through a regional converter, and further screening the generated frame through a suppression operator;
the selected region after screening is subjected to RoI Align processing in a Fast R-CNN module, and then classification and frame regression are carried out through the output of a full connection layer;
respectively carrying out reverse gradient propagation on the RPN module and the Fast R-CNN module according to the loss function result in the RPN module and the Fast R-CNN module, and adjusting the weight of network parameters in the training process;
and repeating the steps, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
3. The unmanned aerial vehicle online detection method of claim 1, wherein the data random rotation enhancement comprises:
performing random rotation enhancement on the categories of which the number of samples is less than the sample number threshold, wherein the rotation angles comprise 90 degrees, 180 degrees and 270 degrees;
a series of small blocks of 1024 x 1024 pixel size are cropped from the input image for training.
4. The unmanned aerial vehicle online detection method of claim 1, wherein constructing a sub-convolutional network to preprocess the input image comprises:
extracting image characteristic information in a convolution mode to replace a large convolution kernel with a plurality of small convolution kernels;
the sub-convolutional network comprises three convolutional layers with convolutional kernel size of 3 x 3, where the step size of two convolutional layers is 2 and the step size of the other convolutional layer is 1.
5. The unmanned aerial vehicle online detection method of claim 4, wherein improving the backbone network comprises:
constructing a dual-channel network:
the first path passes through an average pooling layer with convolution kernel size of 2 x 2 and step size of 2 and 1 convolution layer with convolution kernel size of 1 x 1, so as to avoid information loss;
the second path passes through a convolution layer with convolution kernel size of 1 x 1, then passes through 1 convolution layer with convolution kernel size of 3 x 3 and step size of 2, and finally passes through 1 convolution layer with convolution kernel size of 1 x 1;
and adding the results of the first path and the second path to obtain a convolution characteristic diagram F, and sending the convolution characteristic diagram F to the attention mechanism module.
6. The unmanned aerial vehicle online detection method of claim 5, further comprising: the attention mechanism module calculates an attention map from two dimensions of space and a channel;
the channel attention module performs pooling operations on the input convolution feature map F through a maximum pooling layer and an average pooling layer respectively, inputs both results into a multi-layer perceptron (MLP), sums the two features output by the MLP element by element, and then generates a channel attention feature map C through sigmoid activation;
multiplying the channel attention feature map C and the input convolution feature map F element by element to obtain a product S1, and inputting the product S1 into a space attention module;
performing pooling operations on the product S1 through the maximum pooling layer and the average pooling layer, concatenating the two results along the channel axis, performing dimensionality reduction through a convolution operation, activating through a sigmoid activation function to generate an activation feature map S2, and multiplying the product S1 and the activation feature map S2 element by element to generate a spatial attention feature map S.
7. The online unmanned aerial vehicle detection method of claim 6, further comprising:
constructing a transverse connection network, inputting the obtained attention feature map S into a dual-channel network to obtain a feature map F', convolving the feature map F' with a convolution kernel of size 1 × 1 to obtain a feature map F*, performing top-down up-sampling on the feature map F* and transversely connecting it with the feature map F obtained by the previous dual-channel network convolution, so as to combine high-level features with low resolution and rich feature semantic information with low-level features with high resolution and less feature semantic information and improve the performance of detecting and identifying the remote sensing image.
8. The online unmanned aerial vehicle detection method of claim 1, further comprising:
introducing a regional converter module into the RPN module to extract a target in a remote sensing image with any direction;
the area converter module comprises a learning module and a deformation module;
the learning module comprises a PS RoIAlign layer, a full connection layer and a decoder;
the fully-connected layer outputs the offset of the rotated ground truth relative to the horizontal candidate identification region; the decoder takes the horizontal candidate identification region and the offset (t*) as input, matches the input horizontal candidate identification region with the rotated ground truth (x*, y*, w*, h*, θ*), outputs the decoded rotated candidate identification region (x, y, w, h, θ) through the following formula, and then passes the convolution feature map F and the rotated candidate identification region as input to the deformation module for feature extraction;
9. the online detection method for the unmanned aerial vehicle according to claim 1, wherein a suppression operator is added to the RPN module, and the design flow of the suppression operator is as follows:
the set B is all candidate frames, S is the score values of all candidate frames, N is a set threshold value, and D is an empty set for storing the screened candidate frames;
step 1, judging whether the set B is empty, if so, executing step 5, and if not, executing step 2;
step 2, sorting according to the scores of the candidate frames, adding the candidate frame with the highest score into the set D, recording the candidate frame as M, and removing M from the set B;
step 3, traversing all candidate frames Bi in the set B, calculating the attenuation values of Bi and M through an attenuation function F, and taking the obtained attenuation values as new score values Si of the candidate frames Bi;
step 4, repeating the step 1;
step 5, returning the candidate frame set D and the candidate frame score S;
step 6, removing the candidate frames with the scores smaller than N from the sets D and S;
the attenuation function F is formulated as:
where Ac is the minimum area of the two frames Ka and Kb, and U is the union of the two frames Ka and Kb.
10. The online unmanned aerial vehicle detection method of claim 1, further comprising modifying the RoI Pooling of the Fast R-CNN module into RoI Align, and solving the problem of region mismatching caused by the quantization operation of RoI Pooling in the original Fast R-CNN network by using bilinear interpolation.
11. The unmanned aerial vehicle on-line detection method of claim 1, further comprising applying a modified neural network to the unmanned aerial vehicle;
the unmanned aerial vehicle comprises an unmanned aerial vehicle body, a processing device, a storage device and a small pixel camera, wherein the pixel size of the small pixel camera is 1-2 um;
shooting in real time by using a small pixel camera to obtain a high-resolution remote sensing image;
and the processing device calls a neural network which is stored in the storage device and trained by adopting the improved method to process the high-resolution remote sensing image shot by the small-pixel camera so as to realize online detection.
12. An unmanned aerial vehicle online detection system, characterized by comprising:
the preprocessing module is configured to preprocess the input image through data random rotation enhancement and construction of a sub-convolution network, and replace a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of the extracted feature information when the parameter quantity is reduced;
the main network improvement module is configured to improve the main network of the Fast R-CNN module and construct a transverse connection dual-channel network added with an attention mechanism as the main network;
combining a first feature and a second feature, improving remote sensing image detection and identification performance without increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first feature semantic information amount is larger than the second feature semantic information amount;
an RPN module configured to detect a rotated target image using a region converter;
screening candidate frames by adopting an inhibition operator;
the classification and frame regression improvement module is configured to improve the classification and frame regression of the Fast R-CNN module, where the RoI Align module uses a bilinear interpolation method to prevent region mismatching caused by the RoI Pooling quantization operation;
and the target detection and identification module is configured to complete the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to perform target detection and identification on the acquired image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110127611.2A CN112818840B (en) | 2021-01-29 | 2021-01-29 | Unmanned aerial vehicle online detection system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112818840A true CN112818840A (en) | 2021-05-18 |
CN112818840B CN112818840B (en) | 2024-08-02 |
Family
ID=75860297
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110127611.2A Active CN112818840B (en) | 2021-01-29 | 2021-01-29 | Unmanned aerial vehicle online detection system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112818840B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610038A (en) * | 2021-08-17 | 2021-11-05 | 北京计算机技术及应用研究所 | Vehicle-mounted pedestrian detection method integrating horizontal road surface area semantic information |
CN115018788A (en) * | 2022-06-02 | 2022-09-06 | 常州晋陵电力实业有限公司 | Overhead line abnormity detection method and system based on intelligent robot |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180107182A1 (en) * | 2016-10-13 | 2018-04-19 | Farrokh Mohamadi | Detection of drones |
CN108681692A (en) * | 2018-04-10 | 2018-10-19 | 华南理工大学 | Increase Building recognition method in a kind of remote sensing images based on deep learning newly |
US20180336431A1 (en) * | 2017-05-16 | 2018-11-22 | Nec Laboratories America, Inc. | Pruning filters for efficient convolutional neural networks for image recognition of environmental hazards |
CN110008953A (en) * | 2019-03-29 | 2019-07-12 | 华南理工大学 | Potential target Area generation method based on the fusion of convolutional neural networks multilayer feature |
CN110084195A (en) * | 2019-04-26 | 2019-08-02 | 西安电子科技大学 | Remote Sensing Target detection method based on convolutional neural networks |
US20190311203A1 (en) * | 2018-04-09 | 2019-10-10 | Accenture Global Solutions Limited | Aerial monitoring system and method for identifying and locating object features |
CN111091105A (en) * | 2019-12-23 | 2020-05-01 | 郑州轻工业大学 | Remote sensing image target detection method based on new frame regression loss function |
CN111191566A (en) * | 2019-12-26 | 2020-05-22 | 西北工业大学 | Optical remote sensing image multi-target detection method based on pixel classification |
CN111626993A (en) * | 2020-05-07 | 2020-09-04 | 武汉科技大学 | Image automatic detection counting method and system based on embedded FEFnet network |
CN111640125A (en) * | 2020-05-29 | 2020-09-08 | 广西大学 | Mask R-CNN-based aerial photograph building detection and segmentation method and device |
CN111666836A (en) * | 2020-05-22 | 2020-09-15 | 北京工业大学 | High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network |
US20200320273A1 (en) * | 2017-12-26 | 2020-10-08 | Beijing Sensetime Technology Development Co., Ltd. | Remote sensing image recognition method and apparatus, storage medium and electronic device |
CN112069868A (en) * | 2020-06-28 | 2020-12-11 | 南京信息工程大学 | Unmanned aerial vehicle real-time vehicle detection method based on convolutional neural network |
Non-Patent Citations (3)
Title |
---|
张瑞倩;邵振峰;ALEKSEI PORTNOV;汪家明;: "多尺度空洞卷积的无人机影像目标检测方法", 武汉大学学报(信息科学版), no. 06 * |
李希;徐翔;李军;: "面向航空飞行安全的遥感图像小目标检测", 航空兵器, no. 03 * |
李希;徐翔;李军;: "面向航空飞行安全的遥感图像小目标检测", 航空兵器, no. 03, 15 June 2020 (2020-06-15) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610038A (en) * | 2021-08-17 | 2021-11-05 | 北京计算机技术及应用研究所 | Vehicle-mounted pedestrian detection method integrating horizontal road surface area semantic information |
CN115018788A (en) * | 2022-06-02 | 2022-09-06 | 常州晋陵电力实业有限公司 | Overhead line abnormity detection method and system based on intelligent robot |
CN115018788B (en) * | 2022-06-02 | 2023-11-14 | 常州晋陵电力实业有限公司 | Overhead line abnormality detection method and system based on intelligent robot |
Also Published As
Publication number | Publication date |
---|---|
CN112818840B (en) | 2024-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113065558B (en) | Lightweight small target detection method combined with attention mechanism | |
Yi et al. | An end‐to‐end steel strip surface defects recognition system based on convolutional neural networks | |
JP7490141B2 (en) | IMAGE DETECTION METHOD, MODEL TRAINING METHOD, IMAGE DETECTION APPARATUS, TRAINING APPARATUS, DEVICE, AND PROGRAM | |
Gerg et al. | Structural prior driven regularized deep learning for sonar image classification | |
CN111709313B (en) | Pedestrian re-identification method based on local and channel combination characteristics | |
CN112580458A (en) | Facial expression recognition method, device, equipment and storage medium | |
CN110826462A (en) | Human body behavior identification method of non-local double-current convolutional neural network model | |
CN113344110B (en) | Fuzzy image classification method based on super-resolution reconstruction | |
CN112528961A (en) | Video analysis method based on Jetson Nano | |
CN112818840A (en) | Unmanned aerial vehicle online detection system and method | |
CN116883933A (en) | Security inspection contraband detection method based on multi-scale attention and data enhancement | |
CN115880495A (en) | Ship image target detection method and system under complex environment | |
CN117351550A (en) | Grid self-attention facial expression recognition method based on supervised contrast learning | |
CN115861226A (en) | Method for intelligently identifying surface defects by using deep neural network based on characteristic value gradient change | |
CN110503157B (en) | Image steganalysis method of multitask convolution neural network based on fine-grained image | |
Patel et al. | A novel approach for semantic segmentation of automatic road network extractions from remote sensing images by modified UNet | |
Boby et al. | Improving licence plate detection using generative adversarial networks | |
CN112784836A (en) | Text and graphic offset angle prediction and correction method thereof | |
CN117710295A (en) | Image processing method, device, apparatus, medium, and program product | |
CN117314751A (en) | Remote sensing image super-resolution reconstruction method based on generation type countermeasure network | |
CN112465847A (en) | Edge detection method, device and equipment based on clear boundary prediction | |
CN112132207A (en) | Target detection neural network construction method based on multi-branch feature mapping | |
CN115546598A (en) | Depth forged image detection method and system based on frequency domain transformation | |
Wyzykowski et al. | A Universal Latent Fingerprint Enhancer Using Transformers | |
Jain et al. | Natural scene statistics and CNN based parallel network for image quality assessment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||