CN112818840A - Unmanned aerial vehicle online detection system and method


Info

Publication number
CN112818840A
Authority
CN
China
Prior art keywords
module
convolution
feature
network
aerial vehicle
Prior art date
Legal status
Granted
Application number
CN202110127611.2A
Other languages
Chinese (zh)
Other versions
CN112818840B (en)
Inventor
王子健
尹增山
谭政
高爽
Current Assignee
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Original Assignee
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Priority date
Filing date
Publication date
Application filed by Shanghai Engineering Center for Microsatellites and Innovation Academy for Microsatellites of CAS
Priority to CN202110127611.2A
Publication of CN112818840A
Application granted; publication of CN112818840B
Legal status: Active

Classifications

    • G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING; G06V 20/00 Scenes; 20/10 Terrestrial scenes; 20/13 Satellite images
    • G06F — ELECTRIC DIGITAL DATA PROCESSING; G06F 18/00 Pattern recognition; 18/24 Classification techniques; 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N 3/00 Computing arrangements based on biological models; 3/02 Neural networks; 3/04 Architecture, e.g. interconnection topology; 3/084 Backpropagation, e.g. using gradient descent
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding; 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unmanned aerial vehicle online detection system and method, comprising the following steps: preprocessing the input image by random-rotation data enhancement and by constructing a sub-convolution network, replacing a large convolution kernel with several small convolution kernels so that the extracted feature information is not lost while the parameter count is reduced; improving the backbone network of the Fast R-CNN module by constructing a transverse connection dual-channel network with an attention mechanism as the backbone network, combining the first feature with the second feature and improving remote-sensing image detection and recognition performance without increasing the computation of the original model; the first feature is a high-level feature with a first resolution and a first amount of feature semantic information; the second feature is a bottom-level feature with a second resolution and a second amount of feature semantic information; detecting rotated target images with a region converter; screening candidate frames with a suppression operator; and preventing, through a RoI Align module using bilinear interpolation, the region mismatch caused by the RoI Pooling quantization operation.

Description

Unmanned aerial vehicle online detection system and method
Technical Field
The invention relates to the technical field of remote sensing information, in particular to an unmanned aerial vehicle online detection system and method.
Background
The online target detection and recognition of remote sensing images is one of the difficult problems in the field of computer vision. The characteristics of remote sensing targets make their detection and recognition difficult for the following reasons: first, the data volume of remote sensing images is huge, so detecting remote sensing targets quickly and accurately is a challenge; second, remote sensing targets are often occluded by cloud cover; third, the images have low resolution and are blurred. A remote sensing target thus conveys little information in the image and has weak feature expression capability, so few features can be extracted during feature extraction, resulting in poor detection and recognition performance and low accuracy.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle online detection system and method, and aims to solve the problems that the existing remote sensing image online target detection and identification is low in accuracy and poor in effect.
In order to solve the technical problem, the invention provides an online detection method for an unmanned aerial vehicle, which comprises the following steps:
preprocessing an input image by data random rotation enhancement and construction of a sub-convolution network, and replacing a large convolution kernel by a plurality of small convolution kernels so as to prevent the loss of extracted characteristic information when the parameter quantity is reduced;
improving the backbone network of the Fast R-CNN module by constructing a transverse connection dual-channel network with an attention mechanism as the backbone network, which comprises:
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
detecting the rotated target image by using an area converter;
screening candidate frames by adopting a suppression operator;
preventing, through a RoI Align module using bilinear interpolation, the region mismatch caused by the RoI Pooling quantization operation;
and finishing the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to detect and identify the target of the acquired image.
Optionally, the online detection method of the unmanned aerial vehicle further comprises:
acquiring a remote sensing image, and performing data enhancement on the remote sensing image;
inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram;
inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the feature map obtained by convolution into an RPN module and a Fast R-CNN module;
generating a frame in an RPN module, converting the generated horizontal frame into a directional rotating frame through a regional converter, and further screening the generated frame through a suppression operator;
the screened candidate regions are subjected to RoI Align processing in the Fast R-CNN module, and classification and frame regression are then carried out through the output of a fully connected layer;
gradients are backpropagated through the RPN module and the Fast R-CNN module respectively according to their loss function results, adjusting the network parameter weights during training;
and repeating the steps, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
Optionally, in the online detection method of an unmanned aerial vehicle, the data random rotation enhancement includes:
performing random rotation enhancement on the categories of which the number of samples is less than the sample number threshold, wherein the rotation angles comprise 90 degrees, 180 degrees and 270 degrees;
a series of small blocks of 1024 x 1024 pixel size are cropped from the input image for training.
Optionally, in the online detection method of the unmanned aerial vehicle, constructing a sub-convolution network to pre-process the input image includes:
extracting image characteristic information in a convolution mode to replace a large convolution kernel with a plurality of small convolution kernels;
the sub-convolutional network comprises three convolutional layers with convolutional kernel size of 3 x 3, where the step size of two convolutional layers is 2 and the step size of the other convolutional layer is 1.
Optionally, in the online detection method of an unmanned aerial vehicle, the improving the backbone network includes:
constructing a dual-channel network:
the first path passes through an average pooling layer with convolution kernel size of 2 x 2 and step size of 2 and 1 convolution layer with convolution kernel size of 1 x 1, so as to avoid information loss;
the second path passes through a convolution layer with convolution kernel size of 1 x 1, then passes through 1 convolution layer with convolution kernel size of 3 x 3 and step size of 2, and finally passes through 1 convolution layer with convolution kernel size of 1 x 1;
and adding the results of the first path and the second path to obtain a convolution characteristic diagram F, and sending the convolution characteristic diagram F to the attention mechanism module.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: the attention mechanism module calculates an attention map from two dimensions of space and a channel;
the channel attention module performs pooling operation on the input convolution characteristic diagram F through a maximum pooling layer and an average pooling layer respectively, then inputs the convolution characteristic diagram F into the multi-layer sensor MLP, sums two characteristics output by the multi-layer sensor MLP according to elements, and then generates a channel attention characteristic diagram C through sigmoid activation;
multiplying the channel attention feature map C and the input convolution feature map F element by element to obtain a product S1, and inputting the product S1 into a space attention module;
performing pooling operation on the maximum pooling layer and the average pooling layer, splicing the two results based on channels, performing dimensionality reduction processing through convolution operation, activating through a sigmoid activation function to generate an activation feature map S2, and multiplying the product S1 and the activation feature map S2 by elements to generate a spatial attention feature map S.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: constructing a transverse connection network, inputting the obtained attention feature map S into the dual-channel network to obtain a feature map F′, and convolving F′ with a 1 × 1 convolution kernel to obtain a feature map F*; the feature map F* is up-sampled top-down and transversely connected with the feature map F obtained by the previous dual-channel network convolution, so as to combine high-level features, which have low resolution and rich feature semantic information, with low-level features, which have high resolution and less feature semantic information, and to improve remote-sensing image detection and recognition performance.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes:
introducing a regional converter module into the RPN module to extract a target in a remote sensing image with any direction;
the area converter module comprises a learning module and a deformation module;
the learning module comprises a PS RoI Align layer, a full connection layer and a decoder;
the full-link layer outputs the rotated true value relative to the horizontal candidate identification region and the offset, and the decoder outputs the horizontal candidate identification region and the offset (t)*) As input, the input horizontal candidate recognition area and the true value (x) of the rotation are inputted*,y*,w*,h**) Matching is performed, and the decoded rotated recognition candidate region (x, y, w, h, θ) is output by the following formula, howeverThen, the convolution feature map F and the rotated candidate identification area are used as input and transmitted to a deformation module for feature extraction;
Figure BDA0002923989490000041
Figure BDA0002923989490000042
optionally, in the online detection method for the unmanned aerial vehicle, a suppression operator is added to the RPN module, and the design flow of the suppression operator is as follows:
the set B is all candidate frames, S is the score values of all candidate frames, N is a set threshold value, and D is an empty set for storing the screened candidate frames;
step 1, judging whether the set B is empty, if so, executing step 5, and if not, executing step 2;
step 2, sorting according to the scores of the candidate frames, adding the candidate frame with the highest score into the set D, recording the candidate frame as M, and removing M from the set B;
step 3, traversing all candidate frames Bi in the set B, calculating the attenuation values of Bi and M through an attenuation function F, and taking the obtained attenuation values as new score values Si of the candidate frames Bi;
step 4, repeating the step 1;
step 5, returning the candidate frame set D and the candidate frame score S;
step 6, removing the candidate frames with the scores smaller than N from the sets D and S;
the attenuation function F is given by a formula rendered as an image in the original document (not reproduced here), where Ac is the smaller of the areas of the two boxes Ka and Kb, and U is the area of the union of Ka and Kb.
Optionally, in the online detection method of an unmanned aerial vehicle, the method further includes: and modifying the RoI Polling of the Fast R-CNN module into RoI Align, and solving the problem of region mismatching caused by the quantification operation of the RoI Polling in the original Fast R-CNN network by using a bilinear interpolation method.
Optionally, in the online detection method for the unmanned aerial vehicle, the method further includes applying an improved neural network to the unmanned aerial vehicle;
the unmanned aerial vehicle comprises an unmanned aerial vehicle body, a processing device, a storage device and a small pixel camera, wherein the pixel size of the small pixel camera is 1-2 um;
shooting in real time by using a small pixel camera to obtain a high-resolution remote sensing image;
and the processing device calls a neural network which is stored in the storage device and trained by adopting the improved method to process the high-resolution remote sensing image shot by the small-pixel camera so as to realize online detection.
The invention also provides an online detection system for the unmanned aerial vehicle, which comprises:
the preprocessing module is configured to preprocess the input image through data random rotation enhancement and construction of a sub-convolution network, and replace a large convolution kernel with a plurality of small convolution kernels so as to prevent the loss of the extracted feature information when the parameter quantity is reduced;
the main network improvement module is configured to improve the main network of the Fast R-CNN module and construct a transverse connection dual-channel network added with an attention mechanism as the main network;
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
an RPN module configured to detect rotated target images using a region converter and to screen candidate frames with a suppression operator;
a classification and frame-regression improvement module configured to improve the classification and frame regression of the Fast R-CNN module, preventing, through a RoI Align module using bilinear interpolation, the region mismatch caused by the RoI Pooling quantization operation;
and the target detection and identification module is configured to complete the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to perform target detection and identification on the acquired image.
The inventors have found through research that early conventional methods for target detection in remote sensing images follow a two-stage detection paradigm: 1) candidate extraction and 2) target verification. In the candidate extraction stage, common methods include gray-value filtering, wavelet transforms, anomaly detection, and visual saliency. In the target verification stage, common methods include HOG, LBP, and SIFT. With the development of computer vision technology, deep neural networks have also been applied to remote sensing target detection. In 2016, Ren et al. proposed the Faster R-CNN target detection algorithm (Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.), which uses a convolutional neural network for candidate region generation and greatly improves computational efficiency. The Faster R-CNN network can be regarded as an RPN module and a Fast R-CNN module: the RPN module is responsible for generating target candidate regions, and the Fast R-CNN module is responsible for learning candidate-region features, classifying the candidate regions, and regressing the frames. However, the Faster R-CNN network has some problems. It is aimed mainly at conventional targets; when detecting remote sensing targets, the pooling operations in the feature extraction network compress the feature map to some extent, filtering out some of its information. Also, unlike conventional images taken from a horizontal angle, remote sensing images are typically taken from a bird's-eye perspective, so the targets in a remote sensing image may have arbitrary orientations.
In addition, the complex background and the changed appearance of the target further increase the difficulty of target detection in the remote sensing image, so that the Faster R-CNN network has poor effect in detecting and identifying the remote sensing target.
Based on the above insights, the invention provides an unmanned aerial vehicle online detection system and method, which combine unmanned aerial vehicle remote sensing with deep neural network technology to perform target detection and recognition on the acquired images. The method comprises the following steps: the data is randomly rotated for enhancement and a sub-convolution network is constructed to preprocess the input image, replacing a large convolution kernel with several small convolution kernels so that the extracted feature information is not lost while the parameter count is reduced; the backbone network is improved by constructing a transverse connection dual-channel network with an attention mechanism, combining high-level features, which have low resolution and rich feature semantic information, with low-level features, which have high resolution and less feature semantic information, improving remote-sensing image detection and recognition performance with essentially no increase in the computation of the original model; a region converter is introduced so that rotated target images can be better detected; a suppression operator is proposed to better screen the candidate frames; and RoI Align is introduced, using bilinear interpolation to solve the region mismatch caused by the RoI Pooling quantization operation.
Drawings
Fig. 1 is a schematic diagram of an online detection method for an unmanned aerial vehicle according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of computing the attention map from the two dimensions of space and channel according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a transverse-connection dual-path network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a region converter module according to an embodiment of the invention.
Detailed Description
The unmanned aerial vehicle online detection system and method provided by the invention are further described in detail below with reference to the accompanying drawings and specific embodiments. Advantages and features of the present invention will become apparent from the following description and from the claims. It is to be noted that the drawings are in a very simplified form and not to precise scale; they are merely intended to facilitate a convenient and clear description of the embodiments of the present invention.
Furthermore, features from different embodiments of the invention may be combined with each other, unless otherwise indicated. For example, a feature of the second embodiment may be substituted for a corresponding or functionally equivalent or similar feature of the first embodiment, and the resulting embodiments are likewise within the scope of the disclosure or recitation of the present application.
The core idea of the invention is to provide an unmanned aerial vehicle online detection system and method, so as to solve the problems of low accuracy and poor effect of the existing remote sensing image online target detection and identification.
In order to realize the above idea, the present invention provides an online detection system and method for an unmanned aerial vehicle, as shown in fig. 1, including: preprocessing the input image by random-rotation data enhancement and by constructing a sub-convolution network, replacing a large convolution kernel with several small convolution kernels so that the extracted feature information is not lost while the parameter count is reduced; improving the backbone network of the Fast R-CNN module by constructing a transverse connection dual-channel network with an attention mechanism as the backbone network, so as to combine the first feature with the second feature and improve remote-sensing image detection and recognition performance without increasing the computation of the original model; the first feature is a high-level feature with a first resolution and a first amount of feature semantic information; the second feature is a bottom-level feature with a second resolution and a second amount of feature semantic information; the first resolution is lower than the second resolution, and the first amount of feature semantic information is larger than the second; detecting rotated target images using a region converter; screening candidate frames with a suppression operator; preventing, through a RoI Align module using bilinear interpolation, the region mismatch caused by the RoI Pooling quantization operation; and completing the combination of unmanned aerial vehicle remote sensing with a deep neural network so as to detect and recognize targets in the acquired images.
Data enhancement: random rotation enhancement is performed on the categories with few samples, rotating them by 90, 180 and 270 degrees. A series of 1024 × 1024 pixel patches are then cropped from the image for training.
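As a sketch of this augmentation step (function and parameter names are illustrative, not from the patent), the rotation and cropping can be written in NumPy as follows:

```python
import numpy as np

def augment_minority_class(image):
    """Return the image plus copies rotated by 90, 180 and 270 degrees.

    Per the patent, this is applied only to categories whose sample
    count falls below a threshold.
    """
    return [np.rot90(image, k) for k in range(4)]  # k=0 keeps the original

def crop_patches(image, patch=1024, stride=1024):
    """Crop a series of patch-by-patch tiles from a large remote sensing image."""
    h, w = image.shape[:2]
    return [image[y:y + patch, x:x + patch]
            for y in range(0, h - patch + 1, stride)
            for x in range(0, w - patch + 1, stride)]
```

For example, a 2048 × 3072 image yields six 1024 × 1024 training tiles; a non-overlapping stride is assumed here since the patent does not specify one.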
A sub-convolution network is constructed to preprocess an input image, image characteristic information is extracted in a convolution mode, a plurality of small convolution kernels are used for replacing large convolution kernels, and the extracted characteristic information is guaranteed not to be lost while parameters are reduced. The sub-convolutional network consists of three convolutional layers with convolutional kernel size of 3 x 3, the step size of the first two convolutional layers is 2, and the step size of the last convolutional layer is 1.
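The parameter saving from replacing one large kernel with stacked small kernels can be checked with a quick back-of-envelope calculation (the 7 × 7 baseline and the channel width below are illustrative assumptions, not values from the patent):

```python
def conv_weights(k, c_in, c_out):
    """Weight count of a single k x k convolution layer (bias ignored)."""
    return k * k * c_in * c_out

c = 64                                   # illustrative channel width
one_large = conv_weights(7, c, c)        # a single 7x7 layer
three_small = 3 * conv_weights(3, c, c)  # three stacked 3x3 layers

# At stride 1, three stacked 3x3 layers cover the same 7x7 receptive
# field with 27/49 (about 55%) of the weights, while the interleaved
# nonlinearities help preserve the extracted feature information.
```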
Then, the backbone network of the Fast R-CNN module is improved: a transverse connection dual-channel network with an attention mechanism (figure 3) is constructed as the backbone network. First, skip connections are made through the dual paths, which deepens the network while alleviating the vanishing-gradient problem. The feature map is convolved through two paths, and the result values of the two paths are added. Path A (the first path) passes through an average pooling layer with kernel size 2 × 2 and stride 2 and one convolution layer with kernel size 1 × 1; in this way information loss is avoided. Path B (the second path) passes through one convolution layer with kernel size 1 × 1, then one convolution layer with kernel size 3 × 3 and stride 2, and finally one convolution layer with kernel size 1 × 1. The two results are added, and the resulting feature map is sent to the attention mechanism module.
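Both paths reduce the spatial size by the same factor of two, which is what makes the element-wise addition of their outputs valid. A shape-arithmetic sketch (the input size and padding conventions are assumptions, as the patent does not state them):

```python
def out_size(n, k, s, p=0):
    """Spatial output size of a convolution/pooling layer (floor convention)."""
    return (n + 2 * p - k) // s + 1

n = 224  # illustrative input spatial size

# Path A: 2x2 average pooling with stride 2, then a 1x1 convolution.
a = out_size(out_size(n, k=2, s=2), k=1, s=1)

# Path B: 1x1 conv, then 3x3 conv with stride 2 (padding 1 assumed),
# then another 1x1 conv.
b = out_size(out_size(out_size(n, 1, 1), 3, 2, p=1), 1, 1)

assert a == b  # the outputs align, so the two paths can be summed
```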
The attention mechanism module calculates an attention map from two dimensions, spatial, channel (fig. 2).
And the channel attention module performs pooling operation on the input feature map F through a maximum pooling layer and an average pooling layer respectively, inputs the feature map F into the multi-layer sensor MLP, sums the two features output by the MLP according to elements, and generates a channel attention feature map C through sigmoid activation.
Multiplying C and the input feature map F element by element to obtain S1, and inputting S1 into a spatial attention module. Performing pooling operation on the maximum pooling layer and the average pooling layer, splicing the two results based on channels, performing dimensionality reduction processing through convolution operation, activating through a sigmoid activation function to generate a feature map S2, and multiplying S1 and S2 by elements to generate a spatial attention feature map S.
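The channel and spatial attention computations above can be sketched in NumPy. This is a minimal illustration, not the patent's exact network: the MLP weights are random, the reduction ratio is arbitrary, and the spatial branch mixes the two pooled maps with a simple weighted sum instead of the convolutional dimensionality reduction the patent describes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W1, W2):
    """Channel attention map C: a shared MLP applied to max- and
    avg-pooled channel descriptors, summed element-wise, then sigmoid."""
    mlp = lambda v: np.maximum(v @ W1, 0.0) @ W2   # one ReLU hidden layer
    c = sigmoid(mlp(F.max(axis=(1, 2))) + mlp(F.mean(axis=(1, 2))))
    return c[:, None, None]                        # broadcast over H, W

def spatial_attention(S1, w):
    """Spatial attention map S2: channel-wise max and mean maps,
    mixed by weights w and sigmoid-activated (a simplification of the
    convolutional reduction in the patent)."""
    s2 = sigmoid(w[0] * S1.max(axis=0) + w[1] * S1.mean(axis=0))
    return s2[None, :, :]                          # broadcast over C

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 4, 4))                 # (C, H, W) feature map
W1, W2 = rng.standard_normal((8, 4)), rng.standard_normal((4, 8))
S1 = F * channel_attention(F, W1, W2)              # the product S1
S = S1 * spatial_attention(S1, np.array([0.5, 0.5]))  # attention map S
```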
And constructing a transverse connection network, and combining high-level features with low resolution and rich feature semantic information with low-level features with high resolution and less feature semantic information by performing top-down sampling on the feature graph generated by convolution and performing transverse connection on the feature graph generated by convolution, so that the performance of detecting and identifying the remote sensing image is improved.
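The top-down up-sampling and transverse addition can be sketched as follows (nearest-neighbour up-sampling is assumed, since the patent does not specify the interpolation used in this path):

```python
import numpy as np

def upsample2x(F):
    """Nearest-neighbour 2x spatial up-sampling of a (C, H, W) feature map."""
    return F.repeat(2, axis=1).repeat(2, axis=2)

def lateral_merge(top_down, lateral):
    """Add the up-sampled coarse (semantically rich) map to the
    higher-resolution lateral map from the previous stage."""
    return upsample2x(top_down) + lateral

coarse = np.ones((8, 2, 2))    # low resolution, rich semantics
fine = np.zeros((8, 4, 4))     # high resolution, weak semantics
merged = lateral_merge(coarse, fine)
```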
A region converter module (figure 4) is introduced into the RPN module to better extract targets of arbitrary orientation in remote sensing images; the region converter module is divided into a learning module and a deformation module. The learning module consists of a PS RoI Align layer, a fully connected layer, and a decoder. The fully connected layer outputs the offset of the rotated ground truth relative to the horizontal candidate identification region; the decoder takes the horizontal candidate identification region and the offset (t*) as input, matches the input horizontal candidate identification region with the rotated ground truth (x*, y*, w*, h*, θ*), and outputs the decoded rotated candidate identification region (x, y, w, h, θ) through decoding formulas (rendered as images in the original document and not reproduced here); the feature map and the rotated candidate identification region are then passed as input to the deformation module for feature extraction.
Adding a suppression operator in the RPN module, wherein the operator design flow is as follows:
the set B is all candidate frames, S is the score value of all candidate frames, and N is a set threshold. D is an empty set for holding the filtered candidate boxes.
Step 1: judge whether the set B is empty; if so, execute step 5, otherwise execute step 2.
Step 2: sort by the candidate frame scores, add the candidate frame with the highest score into the set D, record it as M, and remove M from the set B.
Step 3: traverse all candidate frames Bi in the set B, calculate the attenuation value of Bi with respect to M through the attenuation function F, and take the obtained attenuation value as the new score Si of the candidate frame Bi.
Step 4: repeat from step 1.
Step 5: return the candidate frame set D and the candidate frame scores S.
Step 6: remove from the sets D and S the candidate frames whose scores are smaller than N.
The attenuation function F is formulated as:
$$S_i = S_i\,e^{-\frac{f(M,\,B_i)^{2}}{\sigma}}$$
$$f(K_a, K_b) = \frac{|K_a| + |K_b| - U}{A_c}$$
where A_c is the minimum of the areas of the two frames K_a and K_b, and U is the area of the union of the two frames K_a and K_b, so that the numerator is the area of their intersection.
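Steps 1-6 above correspond to a Soft-NMS-style procedure: instead of deleting overlapping candidate frames outright, their scores are decayed by F and thresholded at N afterwards. A sketch, assuming axis-aligned (x1, y1, x2, y2) frames and a Gaussian decay with the overlap measured as intersection over the smaller frame area; names and defaults are illustrative:

```python
import numpy as np

def area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def overlap(ka, kb):
    """Intersection area divided by the smaller frame area (A_c)."""
    ix1, iy1 = max(ka[0], kb[0]), max(ka[1], kb[1])
    ix2, iy2 = min(ka[2], kb[2]), min(ka[3], kb[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    ac = min(area(ka), area(kb))
    return inter / ac if ac > 0 else 0.0

def soft_nms(boxes, scores, n_thresh=0.1, sigma=0.5):
    """Score decay instead of hard suppression (steps 1-6)."""
    B = list(range(len(boxes)))
    S = list(scores)
    D = []
    while B:                                   # step 1: until B is empty
        m = max(B, key=lambda i: S[i])         # step 2: pick highest score M
        D.append(m)
        B.remove(m)
        for i in B:                            # step 3: decay remaining scores
            S[i] *= np.exp(-overlap(boxes[m], boxes[i]) ** 2 / sigma)
    # steps 5-6: return kept frames, dropping decayed scores below N
    return [i for i in D if S[i] >= n_thresh], S
```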
The RoI Pooling of the Fast R-CNN module is modified into RoI Align, and the region mismatching caused by the quantization operation of RoI Pooling in the original Fast R-CNN network is solved by a bilinear interpolation method.
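The difference from RoI Pooling is that RoI Align samples the feature map at continuous coordinates via bilinear interpolation instead of rounding them to integer bins. A minimal single-channel sketch of the interpolation step (the function name is illustrative):

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Sample feature map feat (H, W) at continuous coords (x, y),
    avoiding the integer rounding that RoI Pooling applies."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, feat.shape[1] - 1)
    y1 = min(y0 + 1, feat.shape[0] - 1)
    dx, dy = x - x0, y - y0
    # weighted average of the four surrounding grid values
    return (feat[y0, x0] * (1 - dx) * (1 - dy) +
            feat[y0, x1] * dx * (1 - dy) +
            feat[y1, x0] * (1 - dx) * dy +
            feat[y1, x1] * dx * dy)
```

RoI Align averages several such samples per output bin, so region boundaries need never be snapped to the feature grid.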
The improved neural network is applied to an unmanned aerial vehicle. The unmanned aerial vehicle comprises an unmanned aerial vehicle body, a processing device, a storage device and a small-pixel camera, where "small pixel" means a pixel size of 1 um-2 um. The small-pixel camera shoots in real time to obtain high-resolution remote sensing images and sends them to the processing device; the processing device calls the neural network stored in the storage device and trained by the improved method to process the high-resolution remote sensing images shot by the small-pixel camera, so as to realize online detection.
The whole process is as follows:
Step one: acquiring a remote sensing image, and performing data enhancement on the remote sensing image.
Step two: and inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram.
Step three: inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the obtained feature map into an RPN module and a Fast R-CNN module.
Step four: generating frames in the RPN module, converting the generated horizontal frames into directional rotating frames through the regional converter, and further screening the generated frames through the suppression operator.
Step five: the selected region after screening is subjected to RoI Align processing in a Fast R-CNN module, and then classification and frame regression are carried out through the output of a full connection layer;
step six: and respectively carrying out reverse gradient propagation on the RPN module and the Fast R-CNN module according to the loss function result in the RPN module and the Fast R-CNN module, and adjusting the network parameter weight in the training process. And repeating the first step to the sixth step, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
In summary, the above embodiments describe in detail different configurations of the unmanned aerial vehicle online detection system and method. Of course, the present invention includes, but is not limited to, the configurations listed in the above embodiments, and any content transformed on the basis of the configurations provided by the above embodiments falls within the protection scope of the present invention. Those skilled in the art can draw inferences from the contents of the above embodiments.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art based on the above disclosure are within the scope of the appended claims.

Claims (12)

1. An unmanned aerial vehicle online detection method is characterized by comprising the following steps:
preprocessing an input image by data random rotation enhancement and construction of a sub-convolution network, and replacing a large convolution kernel with a plurality of small convolution kernels so as to prevent loss of extracted feature information while reducing the parameter quantity;
the method comprises the following steps of improving a main network of a Fast R-CNN module, and constructing a transverse connection dual-channel network added with an attention mechanism as the main network, wherein the method comprises the following steps:
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
detecting the rotated target image by using an area converter;
screening candidate frames by adopting an inhibition operator;
using a bilinear interpolation method through a RoI Align module to prevent region mismatching caused by the RoI Pooling quantization operation;
and finishing the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to detect and identify the target of the acquired image.
2. The unmanned aerial vehicle online detection method of claim 1, further comprising,
acquiring a remote sensing image, and performing data enhancement on the remote sensing image;
inputting the enhanced remote sensing image into a sub-convolution network for preprocessing to obtain a preprocessed characteristic diagram;
inputting the preprocessed feature map into a transverse connection dual-channel network added with an attention mechanism for feature extraction, and inputting the feature map obtained by convolution into an RPN module and a Fast R-CNN module;
generating a frame in an RPN module, converting the generated horizontal frame into a directional rotating frame through a regional converter, and further screening the generated frame through a suppression operator;
the selected region after screening is subjected to RoI Align processing in a Fast R-CNN module, and then classification and frame regression are carried out through the output of a full connection layer;
respectively carrying out reverse gradient propagation on the RPN module and the Fast R-CNN module according to the loss function result in the RPN module and the Fast R-CNN module, and adjusting the weight of network parameters in the training process;
and repeating the steps, continuously iterating the training process until the parameters of the network are converged, and embedding the trained neural network into a storage device of the unmanned aerial vehicle, so that the unmanned aerial vehicle can carry out online remote sensing target detection.
3. The unmanned aerial vehicle online detection method of claim 1, wherein the data random rotation enhancement comprises:
performing random rotation enhancement on the categories of which the number of samples is less than the sample number threshold, wherein the rotation angles comprise 90 degrees, 180 degrees and 270 degrees;
a series of small blocks of 1024 x 1024 pixel size are cropped from the input image for training.
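The enhancement in claim 3 can be sketched as follows (parameter names are illustrative, and a real pipeline would also rotate the annotations together with the image):

```python
import numpy as np

def augment(image, n_samples, thresh, patch=1024, rng=None):
    """Randomly rotate under-represented classes by 90/180/270 degrees,
    then crop one patch-sized tile for training."""
    rng = rng or np.random.default_rng()
    if n_samples < thresh:                 # only sparse categories are rotated
        k = rng.integers(1, 4)             # 1..3 quarter turns (90/180/270)
        image = np.rot90(image, k=k, axes=(0, 1))
    h, w = image.shape[:2]
    top = rng.integers(0, h - patch + 1)   # random crop origin
    left = rng.integers(0, w - patch + 1)
    return image[top:top + patch, left:left + patch]
```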
4. The unmanned aerial vehicle online detection method of claim 1, wherein constructing a sub-convolutional network to preprocess the input image comprises:
extracting image characteristic information in a convolution mode to replace a large convolution kernel with a plurality of small convolution kernels;
the sub-convolutional network comprises three convolutional layers with convolutional kernel size of 3 x 3, where the step size of two convolutional layers is 2 and the step size of the other convolutional layer is 1.
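The stem in claim 4 stacks three 3 × 3 convolutions (strides 2, 2, 1); stacked small kernels cover roughly the receptive field of one large kernel with fewer parameters. Assuming "same" padding of 1 (an assumption, not stated in the claim), the output size can be checked with simple shape arithmetic: a 1024 × 1024 training crop comes out at 256 × 256, an overall stride of 4.

```python
def conv_out(size, kernel=3, stride=1, pad=1):
    """Spatial size after one padded convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def stem_output_size(size):
    """Sub-convolution stem: three 3x3 conv layers with strides 2, 2, 1,
    replacing one large-kernel layer."""
    for s in (2, 2, 1):
        size = conv_out(size, stride=s)
    return size
```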
5. The unmanned aerial vehicle online detection method of claim 4, wherein improving the backbone network comprises:
constructing a dual-channel network:
the first path passes through an average pooling layer with convolution kernel size of 2 x 2 and step size of 2 and 1 convolution layer with convolution kernel size of 1 x 1, so as to avoid information loss;
the second path passes through a convolution layer with convolution kernel size of 1 x 1, then passes through 1 convolution layer with convolution kernel size of 3 x 3 and step size of 2, and finally passes through 1 convolution layer with convolution kernel size of 1 x 1;
and adding the results of the first path and the second path to obtain a convolution characteristic diagram F, and sending the convolution characteristic diagram F to the attention mechanism module.
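A single-channel numeric sketch of the two paths in claim 5 (in one channel the 1 × 1 convolutions degenerate to scalar multipliers; names and the toy kernels are illustrative):

```python
import numpy as np

def conv2d(x, w, stride=1, pad=0):
    """Minimal single-channel 2D convolution; x: (H, W), w: (k, k)."""
    if pad:
        x = np.pad(x, pad)
    k = w.shape[0]
    h = (x.shape[0] - k) // stride + 1
    wd = (x.shape[1] - k) // stride + 1
    out = np.empty((h, wd))
    for i in range(h):
        for j in range(wd):
            out[i, j] = np.sum(x[i*stride:i*stride+k, j*stride:j*stride+k] * w)
    return out

def avg_pool2(x):
    """2x2 average pooling with stride 2."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2*h, :2*w].reshape(h, 2, w, 2).mean(axis=(1, 3))

def dual_channel(x, s1, s2a, w3, s2c):
    """Path 1: avg-pool -> 1x1 conv (scalar s1, keeps information via pooling).
    Path 2: 1x1 conv (s2a) -> 3x3 stride-2 conv (w3) -> 1x1 conv (s2c).
    Both halve the resolution; their sum is the convolution feature map F."""
    p1 = avg_pool2(x) * s1
    p2 = conv2d(x * s2a, w3, stride=2, pad=1) * s2c
    return p1 + p2
```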
6. The unmanned aerial vehicle online detection method of claim 5, further comprising: the attention mechanism module calculates an attention map from two dimensions of space and a channel;
the channel attention module performs pooling operation on the input convolution characteristic diagram F through a maximum pooling layer and an average pooling layer respectively, then inputs the convolution characteristic diagram F into the multi-layer sensor MLP, sums two characteristics output by the multi-layer sensor MLP according to elements, and then generates a channel attention characteristic diagram C through sigmoid activation;
multiplying the channel attention feature map C and the input convolution feature map F element by element to obtain a product S1, and inputting the product S1 into a space attention module;
the spatial attention module performs pooling operations on the product S1 through a maximum pooling layer and an average pooling layer, splices the two results along the channel dimension, performs dimension reduction through a convolution operation, generates an activation feature map S2 through a sigmoid activation function, and multiplies the product S1 by the activation feature map S2 element by element to generate the spatial attention feature map S.
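Claim 6's channel-then-spatial attention (CBAM-style) can be sketched with NumPy. The shared MLP and the dimension-reducing convolution of the spatial branch are stood in for by a callable and a two-weight reduction, so this shows the data flow only, not the learned layers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F, mlp):
    """Max-pool and avg-pool over space, share an MLP, sum, sigmoid.
    F: (C, H, W); mlp: callable mapping (C,) -> (C,)."""
    mx = F.max(axis=(1, 2))
    av = F.mean(axis=(1, 2))
    return sigmoid(mlp(mx) + mlp(av))          # channel attention map C: (C,)

def spatial_attention(S1, w):
    """Channel-wise max and mean, concatenated then reduced (here by a
    two-weight sum standing in for the conv), then sigmoid."""
    mx = S1.max(axis=0)
    av = S1.mean(axis=0)
    return sigmoid(w[0] * mx + w[1] * av)      # activation map S2: (H, W)

def cbam(F, mlp, w_sp):
    C = channel_attention(F, mlp)
    S1 = F * C[:, None, None]                  # element-wise product S1
    S2 = spatial_attention(S1, w_sp)
    return S1 * S2[None, :, :]                 # spatial attention feature map S
```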
7. The online unmanned aerial vehicle detection method of claim 6, further comprising:
constructing a transverse connection network: inputting the obtained attention feature map S into a dual-channel network to obtain a feature map F'; convolving the feature map F' with a convolution kernel of size 1 × 1 to obtain a feature map F*; performing top-down up-sampling on the feature map F* and transversely connecting it with the feature map F obtained by the previous dual-channel network convolution, so as to combine high-level features with low resolution and rich feature semantic information with low-level features with high resolution and less feature semantic information, and improve the performance of remote sensing image detection and recognition.
8. The online unmanned aerial vehicle detection method of claim 1, further comprising:
introducing a regional converter module into the RPN module to extract a target in a remote sensing image with any direction;
the regional converter module comprises a learning module and a deformation module;
the learning module comprises a PS RoIAlign layer, a full connection layer and a decoder;
the fully-connected layer outputs the offset (t*) of the rotated true value relative to the horizontal candidate identification region; the decoder takes the horizontal candidate identification region and the offset (t*) as input, matches the input horizontal candidate identification region with the rotated true value (x*, y*, w*, h*, θ*), outputs the decoded rotated candidate identification region (x, y, w, h, θ) through the following formulas, and then transmits the convolution feature map F and the rotated candidate identification region as input to the deformation module for feature extraction;
$$x = x_h + t_x^{*}\,w_h,\qquad y = y_h + t_y^{*}\,h_h$$
$$w = w_h\,e^{t_w^{*}},\qquad h = h_h\,e^{t_h^{*}},\qquad \theta = 2\pi\,t_\theta^{*}$$

where (x_h, y_h, w_h, h_h) denotes the input horizontal candidate identification region (anchor angle 0).
9. the online detection method for the unmanned aerial vehicle according to claim 1, wherein a suppression operator is added to the RPN module, and the design flow of the suppression operator is as follows:
the set B is all candidate frames, S is the score values of all candidate frames, N is a set threshold value, and D is an empty set for storing the screened candidate frames;
step 1, judging whether the set B is empty, if so, executing step 5, and if not, executing step 2;
step 2, sorting according to the scores of the candidate frames, adding the candidate frame with the highest score into the set D, recording the candidate frame as M, and removing M from the set B;
step 3, traversing all candidate frames Bi in the set B, calculating the attenuation values of Bi and M through an attenuation function F, and taking the obtained attenuation values as new score values Si of the candidate frames Bi;
step 4, repeating the step 1;
step 5, returning the candidate frame set D and the candidate frame score S;
step 6, removing the candidate frames with the scores smaller than N from the sets D and S;
the attenuation function F is formulated as:
$$S_i = S_i\,e^{-\frac{f(M,\,B_i)^{2}}{\sigma}}$$
$$f(K_a, K_b) = \frac{|K_a| + |K_b| - U}{A_c}$$
where A_c is the minimum of the areas of the two frames K_a and K_b, and U is the area of the union of the two frames K_a and K_b, so that the numerator is the area of their intersection.
10. The unmanned aerial vehicle online detection method of claim 1, further comprising modifying the RoI Pooling of the Fast R-CNN module into RoI Align, and solving the problem of region mismatching caused by the quantization operation of RoI Pooling in the original Fast R-CNN network by using bilinear interpolation.
11. The unmanned aerial vehicle on-line detection method of claim 1, further comprising applying a modified neural network to the unmanned aerial vehicle;
the unmanned aerial vehicle comprises an unmanned aerial vehicle body, a processing device, a storage device and a small pixel camera, wherein the pixel size of the small pixel camera is 1-2 um;
shooting in real time by using a small pixel camera to obtain a high-resolution remote sensing image;
and the processing device calls a neural network which is stored in the storage device and trained by adopting the improved method to process the high-resolution remote sensing image shot by the small-pixel camera so as to realize online detection.
12. An unmanned aerial vehicle on-line measuring system which characterized in that includes:
the preprocessing module is configured to preprocess the input image through data random rotation enhancement and construction of a sub-convolution network, and to replace a large convolution kernel with a plurality of small convolution kernels so as to prevent loss of extracted feature information while reducing the parameter quantity;
the main network improvement module is configured to improve the main network of the Fast R-CNN module and construct a transverse connection dual-channel network added with an attention mechanism as the main network;
the first characteristic and the second characteristic are combined, and the remote sensing image detection and identification performance is improved on the basis of not increasing the calculation amount of the original model;
the first feature is a high-level feature with a first resolution and a first feature semantic information amount;
the second feature is a bottom-layer feature with a second resolution and a second feature semantic information amount;
the first resolution is lower than the second resolution, and the first characteristic semantic information quantity is larger than the second characteristic semantic information quantity;
an RPN module configured to detect a rotated target image using a region converter;
screening candidate frames by adopting an inhibition operator;
the classification and frame regression improvement module is configured to improve the classification and frame regression of the Fast R-CNN module, using a bilinear interpolation method through the RoI Align module to prevent region mismatching caused by the RoI Pooling quantization operation;
and the target detection and identification module is configured to complete the combination of unmanned aerial vehicle remote sensing and a deep neural network so as to perform target detection and identification on the acquired image.
CN202110127611.2A 2021-01-29 2021-01-29 Unmanned aerial vehicle online detection system and method Active CN112818840B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110127611.2A CN112818840B (en) 2021-01-29 2021-01-29 Unmanned aerial vehicle online detection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110127611.2A CN112818840B (en) 2021-01-29 2021-01-29 Unmanned aerial vehicle online detection system and method

Publications (2)

Publication Number Publication Date
CN112818840A true CN112818840A (en) 2021-05-18
CN112818840B CN112818840B (en) 2024-08-02

Family

ID=75860297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110127611.2A Active CN112818840B (en) 2021-01-29 2021-01-29 Unmanned aerial vehicle online detection system and method

Country Status (1)

Country Link
CN (1) CN112818840B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610038A (en) * 2021-08-17 2021-11-05 北京计算机技术及应用研究所 Vehicle-mounted pedestrian detection method integrating horizontal road surface area semantic information
CN115018788A (en) * 2022-06-02 2022-09-06 常州晋陵电力实业有限公司 Overhead line abnormity detection method and system based on intelligent robot

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180107182A1 (en) * 2016-10-13 2018-04-19 Farrokh Mohamadi Detection of drones
CN108681692A (en) * 2018-04-10 2018-10-19 华南理工大学 Increase Building recognition method in a kind of remote sensing images based on deep learning newly
US20180336431A1 (en) * 2017-05-16 2018-11-22 Nec Laboratories America, Inc. Pruning filters for efficient convolutional neural networks for image recognition of environmental hazards
CN110008953A (en) * 2019-03-29 2019-07-12 华南理工大学 Potential target Area generation method based on the fusion of convolutional neural networks multilayer feature
CN110084195A (en) * 2019-04-26 2019-08-02 西安电子科技大学 Remote Sensing Target detection method based on convolutional neural networks
US20190311203A1 (en) * 2018-04-09 2019-10-10 Accenture Global Solutions Limited Aerial monitoring system and method for identifying and locating object features
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function
CN111191566A (en) * 2019-12-26 2020-05-22 西北工业大学 Optical remote sensing image multi-target detection method based on pixel classification
CN111626993A (en) * 2020-05-07 2020-09-04 武汉科技大学 Image automatic detection counting method and system based on embedded FEFnet network
CN111640125A (en) * 2020-05-29 2020-09-08 广西大学 Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN111666836A (en) * 2020-05-22 2020-09-15 北京工业大学 High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network
US20200320273A1 (en) * 2017-12-26 2020-10-08 Beijing Sensetime Technology Development Co., Ltd. Remote sensing image recognition method and apparatus, storage medium and electronic device
CN112069868A (en) * 2020-06-28 2020-12-11 南京信息工程大学 Unmanned aerial vehicle real-time vehicle detection method based on convolutional neural network

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180107182A1 (en) * 2016-10-13 2018-04-19 Farrokh Mohamadi Detection of drones
US20180336431A1 (en) * 2017-05-16 2018-11-22 Nec Laboratories America, Inc. Pruning filters for efficient convolutional neural networks for image recognition of environmental hazards
US20200320273A1 (en) * 2017-12-26 2020-10-08 Beijing Sensetime Technology Development Co., Ltd. Remote sensing image recognition method and apparatus, storage medium and electronic device
US20190311203A1 (en) * 2018-04-09 2019-10-10 Accenture Global Solutions Limited Aerial monitoring system and method for identifying and locating object features
CN108681692A (en) * 2018-04-10 2018-10-19 华南理工大学 Increase Building recognition method in a kind of remote sensing images based on deep learning newly
CN110008953A (en) * 2019-03-29 2019-07-12 华南理工大学 Potential target Area generation method based on the fusion of convolutional neural networks multilayer feature
CN110084195A (en) * 2019-04-26 2019-08-02 西安电子科技大学 Remote Sensing Target detection method based on convolutional neural networks
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function
CN111191566A (en) * 2019-12-26 2020-05-22 西北工业大学 Optical remote sensing image multi-target detection method based on pixel classification
CN111626993A (en) * 2020-05-07 2020-09-04 武汉科技大学 Image automatic detection counting method and system based on embedded FEFnet network
CN111666836A (en) * 2020-05-22 2020-09-15 北京工业大学 High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network
CN111640125A (en) * 2020-05-29 2020-09-08 广西大学 Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN112069868A (en) * 2020-06-28 2020-12-11 南京信息工程大学 Unmanned aerial vehicle real-time vehicle detection method based on convolutional neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zhang Ruiqian; Shao Zhenfeng; Aleksei Portnov; Wang Jiaming: "Multi-scale dilated convolution method for target detection in UAV images", Geomatics and Information Science of Wuhan University, no. 06 *
Li Xi; Xu Xiang; Li Jun: "Small target detection in remote sensing images for aviation flight safety", Aero Weaponry, no. 03, 15 June 2020 (2020-06-15) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610038A (en) * 2021-08-17 2021-11-05 北京计算机技术及应用研究所 Vehicle-mounted pedestrian detection method integrating horizontal road surface area semantic information
CN115018788A (en) * 2022-06-02 2022-09-06 常州晋陵电力实业有限公司 Overhead line abnormity detection method and system based on intelligent robot
CN115018788B (en) * 2022-06-02 2023-11-14 常州晋陵电力实业有限公司 Overhead line abnormality detection method and system based on intelligent robot

Also Published As

Publication number Publication date
CN112818840B (en) 2024-08-02

Similar Documents

Publication Publication Date Title
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
Yi et al. An end‐to‐end steel strip surface defects recognition system based on convolutional neural networks
JP7490141B2 (en) IMAGE DETECTION METHOD, MODEL TRAINING METHOD, IMAGE DETECTION APPARATUS, TRAINING APPARATUS, DEVICE, AND PROGRAM
Gerg et al. Structural prior driven regularized deep learning for sonar image classification
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN112580458A (en) Facial expression recognition method, device, equipment and storage medium
CN110826462A (en) Human body behavior identification method of non-local double-current convolutional neural network model
CN113344110B (en) Fuzzy image classification method based on super-resolution reconstruction
CN112528961A (en) Video analysis method based on Jetson Nano
CN112818840A (en) Unmanned aerial vehicle online detection system and method
CN116883933A (en) Security inspection contraband detection method based on multi-scale attention and data enhancement
CN115880495A (en) Ship image target detection method and system under complex environment
CN117351550A (en) Grid self-attention facial expression recognition method based on supervised contrast learning
CN115861226A (en) Method for intelligently identifying surface defects by using deep neural network based on characteristic value gradient change
CN110503157B (en) Image steganalysis method of multitask convolution neural network based on fine-grained image
Patel et al. A novel approach for semantic segmentation of automatic road network extractions from remote sensing images by modified UNet
Boby et al. Improving licence plate detection using generative adversarial networks
CN112784836A (en) Text and graphic offset angle prediction and correction method thereof
CN117710295A (en) Image processing method, device, apparatus, medium, and program product
CN117314751A (en) Remote sensing image super-resolution reconstruction method based on generation type countermeasure network
CN112465847A (en) Edge detection method, device and equipment based on clear boundary prediction
CN112132207A (en) Target detection neural network construction method based on multi-branch feature mapping
CN115546598A (en) Depth forged image detection method and system based on frequency domain transformation
Wyzykowski et al. A Universal Latent Fingerprint Enhancer Using Transformers
Jain et al. Natural scene statistics and CNN based parallel network for image quality assessment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant