CN111062310A - Few-sample unmanned aerial vehicle image identification method based on virtual sample generation - Google Patents

Few-sample unmanned aerial vehicle image identification method based on virtual sample generation

Info

Publication number
CN111062310A
CN111062310A (application number CN201911280878.4A)
Authority
CN
China
Prior art keywords
layer
virtual
sample
unmanned aerial
aerial vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911280878.4A
Other languages
Chinese (zh)
Other versions
CN111062310B (en)
Inventor
杨志钢
李辉洋
黎明
王军亮
胡家欣
孙鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN201911280878.4A
Publication of CN111062310A
Application granted
Publication of CN111062310B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The invention belongs to the field of machine learning and relates in particular to a highly practical few-sample unmanned aerial vehicle image identification method based on virtual sample generation. According to the invention, a camera device captures, at long range from ground to air, a short video of a flying unmanned aerial vehicle with N frames; N unmanned aerial vehicle regions are extracted as positive samples, while small interfering regions such as trees, buildings, clouds, birds, kites and balloons, taken from this and other related videos, serve as negative samples, together forming the training sample set. The beneficial effects of the invention are: without excessive distortion, the effective information and diversity of the samples are increased, improving the generalization capability of the model; in the recognition stage, the fast DPM model fixes the feature map fed to the root filter and the anchor position, and uses a component model that matches the modular structure of the unmanned aerial vehicle, improving both running speed and accuracy.

Description

Few-sample unmanned aerial vehicle image identification method based on virtual sample generation
Technical Field
The invention belongs to the field of machine learning and relates in particular to a highly practical few-sample unmanned aerial vehicle image identification method based on virtual sample generation.
Background
In recent years the unmanned aerial vehicle (UAV) industry and its prospects have attracted attention across many sectors; the number of UAVs has grown sharply, and safety hazards have grown with it. Unauthorized and chaotic flights, invasion of citizens' privacy, use in criminal activity and threats to airway safety occur frequently, so monitoring and controlling UAVs has become ever more important. Identifying and monitoring UAVs with the now-ubiquitous ultra-high-definition cameras is an effective form of supervision, so research on image-based UAV detection and identification has real practical significance. On one hand, UAV image identification is difficult because the imaged target is extremely small and the intra-class variation is large: long distances and a wide range of motion make UAV images tiny and highly variable in size, while weather, flight altitude, shooting angle and similar factors make UAV images differ markedly from one another. On the other hand, the identification task suffers from few training samples: UAV types and scenes are diverse and samples are hard to obtain, so the sample set is insufficient and the generalization capability of the recognition model is low.
Small-target detection and identification has long been at the frontier of academic research and comprises mainly deep learning and traditional machine learning methods. Among deep learning models, FPN performs well on small-target tasks with scale variation, and the YOLO series is even more prominent at detecting and identifying small targets; but this good performance rests on a sufficient sample size, and under few-sample conditions these models are unremarkable. The performance of traditional machine learning is determined chiefly by feature extraction and the classifier: the combination of HOG features with an SVM stands out in image tasks such as small-target detection and recognition, and the DPM model, which combines an SVM with latent SVM training, performs better still.
Few-sample learning is one of the current frontiers of machine learning, and within it virtual sample generation, which constructs a large number of virtual samples to improve a model's recognition capability, is an effective remedy for insufficient sample volume. Many virtual-sample generation methods exist for images. The simplest and most effective are sample-enhancement methods such as random cropping, flipping, rotation and resampling; the QR decomposition-and-reconstruction method exploits the independence of the matrix column vectors and the idea of retaining principal components to generate virtual samples that preserve the main components of the original image; virtual samples produced by weighted fusion retain the feature distributions of several original images, letting the classifier learn the main feature-distribution information of the samples; and GANs are a popular recent image-generation approach capable of producing convincing "fake" virtual samples.
Disclosure of Invention
The invention aims to provide a few-sample unmanned aerial vehicle image identification method based on virtual sample generation.
The object of the invention is achieved as follows:
A few-sample unmanned aerial vehicle image identification method based on virtual sample generation comprises the following steps:
Step 1: a camera device captures, at long range from ground to air, a short video of a flying unmanned aerial vehicle with N frames; N unmanned aerial vehicle regions are extracted as positive samples, and small interfering regions such as trees, buildings, clouds, birds, kites and balloons, taken from this and other related videos, serve as negative samples, together forming the training sample set;
Step 2: virtual samples are generated with a convolutional-network-based W-GAN method: a GAN model is trained on the N positive sample images, and 3N virtual samples are generated at random by the generator of the model;
Step 3: the N positive sample images are divided into 0.5N groups of two images each; the QR reconstruction weighted fusion method generates 4 images per group, giving 2N virtual samples in total;
Step 4: from the N positive sample images, 4N further virtual samples are obtained with other virtual-sample generation methods, including mirror flipping and resampling;
Step 5: the 9N virtual samples so generated, intended to improve the generalization capability of the recognition model, are added to the training set as positive samples and a fast DPM model is trained, yielding a recognizer that identifies unmanned aerial vehicle images accurately under few-sample conditions.
The convolutional-network-based W-GAN generation method of step 2 specifically comprises:
(2-a) the generator network in the model consists of 2 fully connected layers and 2 deconvolution layers, specifically:
layer 1 is a fully connected layer activated by a ReLU function, outputting a vector of size 1024;
layer 2 is a fully connected layer activated by a ReLU function, outputting a vector of size 8192 that is then regrouped into a tensor of size 8 × 8 × 128;
layer 3 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a ReLU function, outputting a map of size 16 × 16 × 64;
layer 4 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a Tanh function, outputting a map of size 32 × 32 × 3;
(2-b) the discriminator network in the model consists of 2 convolution layers and 2 fully connected layers, specifically:
layer 1 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 16 × 16 × 128;
layer 2 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 8 × 8 × 256 that is then regrouped into a vector of size 16384;
layer 3 is a fully connected layer activated by a LeakyReLU function, outputting a vector of size 1024;
layer 4 is a fully connected layer with no activation function, outputting a vector of size 1;
(2-c) the models are trained with the N positive samples to obtain a trained generator, which then generates 3N virtual samples at random.
The QR reconstruction weighted fusion method for generating virtual samples in step 3 specifically comprises:
(3-a) from the unitary matrix Q and upper triangular matrix R obtained by QR decomposition, direct reconstruction gives the complete information map
I(Q, R) = Q · R
Because R is upper triangular, the i-th column vector of Q contributes to the element values from the i-th column through the last column of the original image matrix without affecting the columns before the i-th, so the completeness of the information map's element values increases toward the left. For one image I_l of a group, QR decomposition yields the unitary matrix Q_l and upper triangular matrix R_l containing the image information; reconstruction with the matrices Q_l, R_l and a reconstruction coefficient w gives the left information map I:
I(Q_l, R_l, w) = Q_l[32 × (32·w)] · R_l[(32·w) × 32]
where Q_l[32 × (32·w)] denotes the first 32·w columns of Q_l and R_l[(32·w) × 32] the first 32·w rows of R_l. The 4 reconstruction coefficients 0.25, 0.5, 0.75 and 1.0 give 4 corresponding left information maps.
(3-b) The other image I_r of the group is first mirror-flipped to obtain its symmetric image; processing it as in (3-a) gives the unitary matrix Q_r and upper triangular matrix R_r and hence the left information map of the symmetric image, which is mirror-flipped again to give the 4 right information maps I_G:
I_G(Q_r, R_r, w) = G(Q_r[32 × (32·w)] · R_r[(32·w) × 32])
where G(·) mirror-flips a matrix.
(3-c) The left information map I and the right information map I_G under the same reconstruction coefficient are fused by weighting
[weighted-fusion formula, given only as image Figure BDA0002316705610000031 in the original]
giving 4 fused virtual images I_w. Finally, QR reconstruction weighted fusion over the 0.5N groups of images gives 2N virtual samples.
The fast DPM model of step 5 specifically comprises:
(5-a) according to the salient structure of the unmanned aerial vehicle (four wings and a fuselage), the number of component filters of the DPM model is set to 5;
(5-b) a two-layer HOG feature pyramid of the image is computed; the bottom-layer feature map is the input of the root filter and the top-layer feature map is the input of the component filters;
(5-c) the anchor point of the root filter is fixed at the image center (x_0, y_0), and the total response score of the image is
score(x_0, y_0) = R_0(x_0, y_0) + Σ_{i=1..5} D_i
D_i = max_{(x_i, y_i)} [ R_i(x_i, y_i) − d_i · φ(dx_i, dy_i) ]
where R_0 is the root filter score, R_i the score of component filter i, d_i the deformation weights, φ the deformation features, and
(dx_i, dy_i) = (x_i − x_0, y_i − y_0)
is the distance between the root filter anchor point and the detection point of the component.
The beneficial effects of the invention are: in the virtual-sample generation stage, the convolutional-network-based W-GAN lets the model generate high-quality virtual UAV images, and the virtual images produced by QR reconstruction weighted fusion carry the feature distributions of two images, increasing the effective information and diversity of the samples without excessive distortion and thus improving the generalization capability of the model; in the recognition stage, the fast DPM model fixes the feature map fed to the root filter and the anchor position, and uses a component model that matches the modular structure of the unmanned aerial vehicle, improving both running speed and accuracy.
Drawings
Fig. 1 is a schematic flow chart of the few-sample unmanned aerial vehicle image identification method based on virtual sample generation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the convolutional-network-based W-GAN framework provided by an embodiment of the present invention;
FIG. 3 shows real image samples and virtual sample images generated by W-GAN, provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of the QR reconstruction weighted fusion method provided by an embodiment of the present invention;
Fig. 5 shows the effect of virtual samples generated by the QR reconstruction weighted fusion method according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of the fast DPM model framework provided by an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
To address the shortcomings of the prior art, the invention provides a few-sample unmanned aerial vehicle image recognition method based on virtual sample generation, which alleviates the insufficient generalization capability of models trained on few samples and exploits the modular structure of UAV images to solve the UAV recognition problem under few-sample conditions. To this end, the scheme of the invention is as follows:
a few-sample unmanned aerial vehicle image identification method based on virtual sample generation is characterized by comprising the following steps:
(1) a camera device captures, at long range from ground to air, a short video of a flying unmanned aerial vehicle with N frames; N unmanned aerial vehicle regions are extracted as positive samples, and small interfering regions such as trees, buildings, clouds, birds, kites and balloons, taken from this and other related videos, serve as negative samples, together forming the training sample set;
(2) virtual samples are generated with a convolutional-network-based W-GAN method: a GAN model is trained on the N positive sample images, and 3N virtual samples are generated at random by the generator of the model;
(3) the N positive sample images are divided into 0.5N groups of two images each; the QR reconstruction weighted fusion method generates 4 images per group, giving 2N virtual samples in total;
(4) from the N positive sample images, 4N further virtual samples are obtained with other virtual-sample generation methods, including mirror flipping and resampling;
(5) the 9N virtual samples so generated, intended to improve the generalization capability of the recognition model, are added to the training set as positive samples and a fast DPM model is trained, yielding a recognizer that identifies unmanned aerial vehicle images accurately under few-sample conditions.
Further, the convolutional-network-based W-GAN generation method specifically comprises:
(2-a) the generator network in the model consists of 2 fully connected layers and 2 deconvolution layers, specifically:
layer 1 is a fully connected layer activated by a ReLU function, outputting a vector of size 1024;
layer 2 is a fully connected layer activated by a ReLU function, outputting a vector of size 8192 that is then regrouped into a tensor of size 8 × 8 × 128;
layer 3 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a ReLU function, outputting a map of size 16 × 16 × 64;
layer 4 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a Tanh function, outputting a map of size 32 × 32 × 3.
(2-b) the discriminator network in the model consists of 2 convolution layers and 2 fully connected layers, specifically:
layer 1 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 16 × 16 × 128;
layer 2 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 8 × 8 × 256 that is then regrouped into a vector of size 16384;
layer 3 is a fully connected layer activated by a LeakyReLU function, outputting a vector of size 1024;
layer 4 is a fully connected layer with no activation function, outputting a vector of size 1.
(2-c) the models are trained with the N positive samples to obtain a trained generator, which then generates 3N virtual samples at random.
Further, the QR reconstruction weighted fusion method for generating virtual samples specifically comprises:
(3-a) from the unitary matrix Q and upper triangular matrix R obtained by QR decomposition, direct reconstruction gives the complete information map:
I(Q, R) = Q · R
Because R is upper triangular, the i-th column vector of Q contributes to the element values from the i-th column through the last column of the original image matrix without affecting the columns before the i-th, so the completeness of the information map's element values increases toward the left.
For one image I_l of a group, QR decomposition yields the unitary matrix Q_l and upper triangular matrix R_l containing the image information. Reconstruction with the matrices Q_l, R_l and a reconstruction coefficient w gives the left information map I:
I(Q_l, R_l, w) = Q_l[32 × (32·w)] · R_l[(32·w) × 32]
where Q_l[32 × (32·w)] denotes the first 32·w columns of Q_l and R_l[(32·w) × 32] the first 32·w rows of R_l. The 4 reconstruction coefficients 0.25, 0.5, 0.75 and 1.0 give 4 corresponding left information maps.
(3-b) The other image I_r of the group is first mirror-flipped to obtain its symmetric image; processing it as in (3-a) gives the unitary matrix Q_r and upper triangular matrix R_r and hence the left information map of the symmetric image, which is mirror-flipped again to give the 4 right information maps I_G:
I_G(Q_r, R_r, w) = G(Q_r[32 × (32·w)] · R_r[(32·w) × 32])
where G(·) mirror-flips a matrix.
(3-c) The left information map I and the right information map I_G under the same reconstruction coefficient are fused by weighting
[weighted-fusion formula, given only as image Figure BDA0002316705610000061 in the original]
giving 4 fused virtual images I_w.
Finally, QR reconstruction weighted fusion over the 0.5N groups of images gives 2N virtual samples.
Further, the fast DPM model specifically comprises:
(5-a) according to the salient structure of the unmanned aerial vehicle (four wings and a fuselage), the number of component filters of the DPM model is set to 5.
(5-b) a two-layer HOG feature pyramid of the image is computed; the bottom-layer feature map is the input of the root filter and the top-layer feature map is the input of the component filters.
(5-c) the anchor point of the root filter is fixed at the image center (x_0, y_0), and the total response score of the image is
score(x_0, y_0) = R_0(x_0, y_0) + Σ_{i=1..5} D_i
D_i = max_{(x_i, y_i)} [ R_i(x_i, y_i) − d_i · φ(dx_i, dy_i) ]
where R_0 is the root filter score, R_i the score of component filter i, d_i the deformation weights, φ the deformation features, and
(dx_i, dy_i) = (x_i − x_0, y_i − y_0)
is the distance between the root filter anchor point and the detection point of the component.
The invention provides a few-sample unmanned aerial vehicle image identification method based on virtual sample generation. The method first expands the number of samples with virtual-sample generation methods such as the convolutional-network-based W-GAN and QR reconstruction weighted fusion, alleviating the insufficient generalization capability of models under few-sample conditions; it then identifies the UAV image to be recognized with a fast DPM model, making full use of the modular structure of UAV images to ensure accurate and robust recognition. In the virtual-sample generation stage, the convolutional-network-based W-GAN lets the model generate high-quality virtual UAV images, and the virtual images produced by QR reconstruction weighted fusion carry the feature distributions of two images, increasing the effective information and diversity of the samples without excessive distortion and thus improving the generalization capability of the model; in the recognition stage, the fast DPM model fixes the feature map fed to the root filter and the anchor position, and uses a component model that matches the modular structure of the UAV, improving both running speed and accuracy.
To make the objects, technical scheme and advantages of the invention clearer, the scheme is described in detail below with reference to the accompanying drawings. It should be understood that the described embodiments merely illustrate the invention; they are not all possible embodiments and do not limit the invention.
Because UAV types are numerous and imaging is strongly affected by the environment, among other reasons, the UAV identification task operates under few-sample conditions. Suppose a UAV flight video on which the target cannot yet be correctly identified has been acquired; after the processing described below, the recognizer identifies the UAV accurately. Fig. 1 is a schematic flow chart of the few-sample unmanned aerial vehicle image identification method based on virtual sample generation according to an embodiment of the present invention; the concrete implementation steps are as follows:
s1, remotely shooting a section of short video flying by the unmanned aerial vehicle with the number of N from ground to air through the camera device, acquiring N unmanned aerial vehicle regions as positive samples, collecting small interference regions such as trees, buildings, clouds, birds, kites and balloons as negative samples by combining other related videos, and uniformly zooming to the size of 32 multiplied by 32 to be used as a training sample set.
The unmanned aerial vehicle positive sample is a rectangular area containing the unmanned aerial vehicle, which is intercepted by manual or other means, and the image shows a larger difference due to factors such as flight attitude, environment and the like. Various interference small areas are used as negative samples, and besides the interference of small objects in the air, partial areas of trees, buildings and clouds are easy to become the interference of the algorithm. All sample images are scaled to 32 x 32, so that the image quality is not influenced, and the subsequent algorithm can conveniently process.
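As an illustration of this preprocessing step, the following minimal Python sketch (using OpenCV) crops a labelled UAV region from a frame and scales it to 32 × 32; the video file name and the box coordinates are hypothetical placeholders:

    import cv2  # OpenCV, used here for frame capture and resizing

    def extract_region(frame, box, size=32):
        """Crop a labelled region (x, y, w, h) and scale it to size x size."""
        x, y, w, h = box
        patch = frame[y:y + h, x:x + w]
        return cv2.resize(patch, (size, size), interpolation=cv2.INTER_AREA)

    # Hypothetical usage: read one frame of the UAV video and crop one
    # manually labelled box; negative samples are cropped the same way.
    cap = cv2.VideoCapture("uav_flight.mp4")   # assumed file name
    ok, frame = cap.read()
    if ok:
        positive = extract_region(frame, (120, 80, 48, 40))  # assumed box
    cap.release()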
S2. Virtual samples are generated with the convolutional-network-based W-GAN method: a GAN model is trained on the N positive sample images, and 3N virtual samples are generated at random by the generator of the model.
The currently popular generative adversarial network (GAN) is a reliable way to generate virtual samples, but GANs are hard to train and prone to mode collapse, and the quality of the generated images varies widely. To generate better-quality positive UAV sample images, the method uses a convolutional-network-based W-GAN. The W-GAN consists of a generator G and a discriminator D, with the objective function
min_G max_D E_{x~P_r}[D(x)] − E_{z~P_z}[D(G(z))]
where D ranges over approximately 1-Lipschitz functions, P_r is the real-sample distribution and P_z the noise distribution.
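In code form the objective can be sketched as follows (PyTorch); the weight-clipping bound c = 0.01 follows the original WGAN recipe and is an assumption, since the patent does not state how the Lipschitz constraint on D is enforced:

    import torch

    def critic_loss(D, real, fake):
        # The critic D maximises E[D(x)] - E[D(G(z))], so we minimise its negative.
        return -(D(real).mean() - D(fake).mean())

    def generator_loss(D, fake):
        # The generator G maximises E[D(G(z))], i.e. minimises -E[D(G(z))].
        return -D(fake).mean()

    def clip_critic(D, c=0.01):
        # Weight clipping keeps D approximately 1-Lipschitz (assumed bound).
        for p in D.parameters():
            p.data.clamp_(-c, c)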
Fig. 2 is a schematic diagram of the convolutional-network-based W-GAN framework provided by an embodiment of the present invention; the generator and discriminator of the model have mutually symmetric convolutional network structures, which improves model performance and suits low-resolution UAV images. The network structure of generator G consists of 2 fully connected layers and 2 deconvolution layers, specifically:
layer 1 is a fully connected layer activated by a ReLU function, outputting a vector of size 1024;
layer 2 is a fully connected layer activated by a ReLU function, outputting a vector of size 8192 that is then regrouped into a tensor of size 8 × 8 × 128;
layer 3 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a ReLU function, outputting a map of size 16 × 16 × 64;
layer 4 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a Tanh function, outputting a map of size 32 × 32 × 3.
The network structure of discriminator D consists of 2 convolution layers and 2 fully connected layers, specifically:
layer 1 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 16 × 16 × 128;
layer 2 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 8 × 8 × 256 that is then regrouped into a vector of size 16384;
layer 3 is a fully connected layer activated by a LeakyReLU function, outputting a vector of size 1024;
layer 4 is a fully connected layer with no activation function, outputting a vector of size 1.
The convolutional-network-based W-GAN model is trained with the N positive samples to obtain a trained generator, which then generates 3N virtual samples at random. FIG. 3(a) shows real image samples from the positive sample set according to an embodiment of the present invention; FIG. 3(b) shows virtual samples generated by the original W-GAN method, which clearly suffer from heavy noise and poor image quality; FIG. 3(c) shows virtual samples generated by the convolutional-network-based W-GAN method: the generated images closely resemble the original positive samples, the noise problem is resolved and the quality of the virtual images improves, indicating that the model has learned the feature distribution of the original positive samples.
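The layer sizes above can be reproduced with the following minimal PyTorch sketch; the 4 × 4 kernels and stride 2 come from the patent, while the padding of 1, the 100-dimensional noise input and the LeakyReLU slope of 0.2 are assumptions needed to make the stated shapes (8 × 8 × 128 → 16 × 16 × 64 → 32 × 32 × 3) work out:

    import torch
    import torch.nn as nn

    class Generator(nn.Module):
        def __init__(self, z_dim=100):  # z_dim is an assumption
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(z_dim, 1024), nn.ReLU(),  # layer 1
                nn.Linear(1024, 8192), nn.ReLU(),   # layer 2
            )
            self.deconv = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # layer 3: 8x8 -> 16x16
                nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),    # layer 4: 16x16 -> 32x32
                nn.Tanh(),
            )

        def forward(self, z):
            h = self.fc(z).view(-1, 128, 8, 8)  # regroup 8192 into 8 x 8 x 128
            return self.deconv(h)

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, 128, 4, stride=2, padding=1),    # layer 1: 32x32 -> 16x16
                nn.LeakyReLU(0.2),
                nn.Conv2d(128, 256, 4, stride=2, padding=1),  # layer 2: 16x16 -> 8x8
                nn.LeakyReLU(0.2),
            )
            self.fc = nn.Sequential(
                nn.Linear(8 * 8 * 256, 1024), nn.LeakyReLU(0.2),  # layer 3
                nn.Linear(1024, 1),                               # layer 4: no activation
            )

        def forward(self, x):
            h = self.conv(x).view(x.size(0), -1)  # flatten to 16384
            return self.fc(h)

Note that 8 × 8 × 256 = 16384, so the flattened discriminator vector matches the size stated above.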
S3. The N positive sample images are divided into 0.5N groups of two images each; the QR reconstruction weighted fusion method generates 4 images per group, giving 2N virtual samples in total.
Fig. 4 is a schematic diagram of the QR reconstruction weighted fusion method provided by an embodiment of the present invention. The N positive sample images are divided into 0.5N groups of two images each, and QR reconstruction weighted fusion is applied to each group, specifically:
First, QR decomposition of one image I_l of the group yields the unitary matrix Q_l and upper triangular matrix R_l containing the image information. Multiplying the matrices Q and R recombines them into a complete information map identical to the original image:
I(Q, R) = Q · R
Because R is upper triangular, the i-th column vector of Q contributes to the element values from the i-th column through the last column of the original image matrix without affecting the columns before the i-th, so the completeness of the information map's element values increases toward the left. A reconstruction coefficient w can therefore retain the first 32·w columns, determining how much of the information in Q and R is kept. Exploiting this left-completeness of the QR information map, the matrices Q_l, R_l and the reconstruction coefficient w give the left information map I:
I(Q_l, R_l, w) = Q_l[32 × (32·w)] · R_l[(32·w) × 32]
where Q_l[32 × (32·w)] denotes the first 32·w columns of Q_l and R_l[(32·w) × 32] the first 32·w rows of R_l. The 4 reconstruction coefficients 0.25, 0.5, 0.75 and 1.0 give 4 corresponding left information maps.
Then the other image I_r of the group is mirror-flipped to obtain its symmetric image; processing it as in the previous step gives the unitary matrix Q_r and upper triangular matrix R_r and hence the left information map of the symmetric image, which is mirror-flipped again to give the 4 right information maps I_G:
I_G(Q_r, R_r, w) = G(Q_r[32 × (32·w)] · R_r[(32·w) × 32])
where G(·) mirror-flips a matrix.
Finally, the left and right information maps under the same reconstruction coefficient are fused by weighting
[weighted-fusion formula, given only as image Figure BDA0002316705610000081 in the original]
giving the fused virtual image I_w.
After QR reconstruction weighted fusion of the N positive sample images, 2N virtual samples are obtained. Fig. 5 shows examples of virtual samples generated by the QR reconstruction weighted fusion method according to an embodiment of the present invention.
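A minimal NumPy sketch of the fusion, for one pair of grayscale 32 × 32 images, follows; since the weighted-fusion formula survives only as an image in the original, the equal-weight average of the two information maps used here is an assumption:

    import numpy as np

    def left_info_map(img, w):
        """Left information map of a 32 x 32 matrix under reconstruction coefficient w."""
        Q, R = np.linalg.qr(img)    # unitary Q, upper triangular R
        k = int(round(32 * w))      # number of retained columns of Q / rows of R
        return Q[:, :k] @ R[:k, :]  # completeness grows toward the left

    def qr_fuse(img_l, img_r, w):
        """Fuse two images of a group under the same reconstruction coefficient w."""
        left = left_info_map(img_l, w)
        # Mirror-flip the second image, take its left map, then flip back:
        right = np.fliplr(left_info_map(np.fliplr(img_r), w))
        return 0.5 * (left + right)  # equal-weight fusion (assumption)

    # Hypothetical usage: the four coefficients give four virtual images per pair.
    rng = np.random.default_rng(0)
    a, b = rng.random((32, 32)), rng.random((32, 32))
    virtual = [qr_fuse(a, b, w) for w in (0.25, 0.5, 0.75, 1.0)]

Color images would be decomposed channel by channel in the same way.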
S4. For the N positive sample images, 4 images are generated per sample with other virtual-sample generation methods, giving 4N virtual samples in total.
S41. Mirror flipping yields N virtual image samples from the N positive sample images.
Mirror-flipping an image I yields the symmetric virtual image I′:
I′(x_i, y_j) = I(x_{31−i}, y_j)
(for 32 × 32 images with columns indexed from 0). The virtual image obtained by mirror flipping is the original image reflected about the y axis; because of the symmetry of UAVs, it is very close to a real image.
S42. Resampling with 3 resampling coefficients yields 3N virtual image samples from the N positive sample images.
To resample an image, it is down-sampled with each of the 3 sampling coefficients 0.4, 0.6 and 0.8 to obtain reduced images, which are then up-sampled by bilinear interpolation; the 3 resampled images so obtained are virtual samples. Images obtained by resampling can be regarded as images captured when the UAV is farther away; they too are very close to real images and strengthen the multi-scale character of the sample set.
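Both S41 and S42 can be sketched as follows (OpenCV); the sampling coefficients 0.4, 0.6 and 0.8 and the bilinear up-sampling come from the patent, while the area-averaging down-sampling filter is an assumption:

    import cv2

    def mirror(img):
        # Horizontal flip: the image reflected about the vertical axis (S41).
        return img[:, ::-1].copy()

    def resample(img, scale):
        # Down-sample, then restore the original size by bilinear interpolation,
        # mimicking the UAV being imaged from a greater distance (S42).
        h, w = img.shape[:2]
        small = cv2.resize(img, (max(1, int(w * scale)), max(1, int(h * scale))),
                           interpolation=cv2.INTER_AREA)  # assumed down-sampling filter
        return cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)

    def augment(img):
        # One mirrored image plus three resampled images: 4 virtual samples per image.
        return [mirror(img)] + [resample(img, s) for s in (0.4, 0.6, 0.8)]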
S5. The 9N virtual samples so generated, intended to improve the generalization capability of the recognition model, are added to the training set as positive samples and a fast DPM model is trained, yielding a recognizer that identifies UAV images accurately under few-sample conditions.
The virtual-sample generation methods above yield 9N virtual samples; together with the original positive samples this gives 10N positive samples in total, and combining them with the original negative samples gives the new sample set.
DPM (Deformable Parts Model) is a highly successful target detection algorithm that computes a matching score for a target from a root template and part templates and classifies accordingly. The DPM model first computes a multi-layer HOG feature pyramid of the input image and then finds the position with the maximum target score by sliding-window search. This scheme, however, involves a large number of similar matching computations and is time-consuming. Because of the modular structure of UAVs, DPM's idea of combining component matching with overall matching suits the UAV recognition task very well. To further improve running speed, the invention proposes a fast DPM model; Fig. 6 is a schematic diagram of the fast DPM model framework provided by an embodiment of the present invention, specifically:
s51, selecting 5 component filters of the DPM model according to obvious characteristics of four wings and a fuselage of the unmanned aerial vehicle.
And S52, calculating a two-layer HOG feature pyramid of the image, wherein the bottom-layer feature map is used as the input of the root filter, and the top-layer feature map is used as the input of the component filter.
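The two-layer pyramid can be sketched with scikit-image's HOG as below; the 8-pixel cell, the 1 × 1 block grouping and the decision to give the component filters the finer, double-resolution layer (as in classic DPM) are assumptions, since the patent names only the two layers:

    import numpy as np
    from skimage.feature import hog
    from skimage.transform import resize

    def hog_map(gray, cell=8):
        # Per-cell HOG descriptors kept as a spatial map rather than a flat vector.
        return hog(gray, orientations=9, pixels_per_cell=(cell, cell),
                   cells_per_block=(1, 1), feature_vector=False)

    def two_layer_pyramid(gray):
        """One HOG layer for the root filter and one, at double resolution,
        for the 5 component filters."""
        root_layer = hog_map(gray)  # root filter input
        part_layer = hog_map(resize(gray, (gray.shape[0] * 2, gray.shape[1] * 2)))
        return root_layer, part_layer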
S53. The anchor point of the root filter is fixed at the image center (x_0, y_0), and the response score of the image is
score(x_0, y_0) = R_0(x_0, y_0) + Σ_{i=1..5} D_i
D_i = max_{(x_i, y_i)} [ R_i(x_i, y_i) − d_i · φ(dx_i, dy_i) ]
where R_0 is the root filter score, R_i the score of component filter i, d_i the deformation weights, φ the deformation features, and
(dx_i, dy_i) = (x_i − x_0, y_i − y_0)
is the distance between the root filter anchor point and the detection point of the component.
After the DPM model is trained, the trained UAV recognizer is obtained. The identification procedure is: input an image; compute the two-layer HOG feature pyramid; let the root filter score the bottom-layer feature map; let the 5 component filters slide over the top-layer feature map, computing the convolution responses and searching for the best positions. If the total response score of the image exceeds the threshold, the target is judged to be a UAV; otherwise it is judged not to be a UAV.
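A minimal NumPy sketch of the fixed-anchor scoring and decision follows; the cross-correlation filter responses and the quadratic deformation penalty with weights d follow standard DPM practice and are assumptions insofar as the patent gives its formulas only as images:

    import numpy as np

    def filter_response(feat_map, filt):
        # Dense cross-correlation of one filter with a HOG feature map
        # over all valid placements, as in standard DPM matching.
        fh, fw = filt.shape[0], filt.shape[1]
        H = feat_map.shape[0] - fh + 1
        W = feat_map.shape[1] - fw + 1
        resp = np.empty((H, W))
        for y in range(H):
            for x in range(W):
                resp[y, x] = np.sum(feat_map[y:y + fh, x:x + fw] * filt)
        return resp

    def total_score(root_resp, part_resps, anchor, d=(0.1, 0.1)):
        """Root response at the fixed anchor plus, for each of the 5 components,
        the best placement after subtracting the deformation penalty."""
        x0, y0 = anchor
        score = root_resp[y0, x0]
        for resp in part_resps:
            ys, xs = np.mgrid[0:resp.shape[0], 0:resp.shape[1]]
            penalty = d[0] * (xs - x0) ** 2 + d[1] * (ys - y0) ** 2
            score += np.max(resp - penalty)  # max over component positions (x_i, y_i)
        return score

    # Decision rule: accept as a UAV when total_score(...) exceeds the
    # recognizer's learned threshold (the threshold value is not given here).

Fixing the anchor removes the sliding-window search for the root filter, which is the source of the speed-up claimed for the fast DPM model.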
In summary, combining virtual-sample generation with DPM-based recognition suits the task of recognizing small UAV images under few-sample conditions and is highly practical. The convolutional-network-based W-GAN model generates high-quality virtual UAV images; the virtual images produced by QR reconstruction weighted fusion carry the feature distributions of two images, increasing the effective information and diversity of the samples without excessive distortion and thus improving the generalization capability of the model; and the fast DPM model fixes the feature map fed to the root filter and the anchor position and uses a component model matching the modular structure of the UAV, improving both running speed and accuracy.
The basic principles, main features and advantages of the few-sample UAV image recognition method are described above. Those skilled in the art should understand that the description of the embodiments is meant only to aid understanding of the method and its core idea, not to limit the invention; changes in specific implementation and scope of application made according to the idea of the present application all fall within the protection scope of the invention.

Claims (4)

1. A few-sample unmanned aerial vehicle image identification method based on virtual sample generation is characterized by comprising the following steps:
Step 1: a camera device captures, at long range from ground to air, a short video of a flying unmanned aerial vehicle with N frames; N unmanned aerial vehicle regions are extracted as positive samples, and small interfering regions such as trees, buildings, clouds, birds, kites and balloons, taken from this and other related videos, serve as negative samples, together forming the training sample set;
Step 2: virtual samples are generated with a convolutional-network-based W-GAN method: a GAN model is trained on the N positive sample images, and 3N virtual samples are generated at random by the generator of the model;
Step 3: the N positive sample images are divided into 0.5N groups of two images each; the QR reconstruction weighted fusion method generates 4 images per group, giving 2N virtual samples in total;
Step 4: from the N positive sample images, 4N further virtual samples are obtained with other virtual-sample generation methods, including mirror flipping and resampling;
Step 5: the 9N virtual samples so generated, intended to improve the generalization capability of the recognition model, are added to the training set as positive samples and a fast DPM model is trained, yielding a recognizer that identifies unmanned aerial vehicle images accurately under few-sample conditions.
2. The few-sample target identification method based on virtual sample generation according to claim 1, wherein the convolutional-network-based W-GAN generation method of step 2 specifically comprises:
(2-a) the generator network in the model consists of 2 fully connected layers and 2 deconvolution layers, specifically:
layer 1 is a fully connected layer activated by a ReLU function, outputting a vector of size 1024;
layer 2 is a fully connected layer activated by a ReLU function, outputting a vector of size 8192 that is then regrouped into a tensor of size 8 × 8 × 128;
layer 3 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a ReLU function, outputting a map of size 16 × 16 × 64;
layer 4 is a deconvolution layer with a 4 × 4 kernel and stride 2, activated by a Tanh function, outputting a map of size 32 × 32 × 3;
(2-b) the discriminator network in the model consists of 2 convolution layers and 2 fully connected layers, specifically:
layer 1 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 16 × 16 × 128;
layer 2 is a convolution layer with a 4 × 4 kernel and stride 2, activated by a LeakyReLU function, outputting a map of size 8 × 8 × 256 that is then regrouped into a vector of size 16384;
layer 3 is a fully connected layer activated by a LeakyReLU function, outputting a vector of size 1024;
layer 4 is a fully connected layer with no activation function, outputting a vector of size 1;
(2-c) the models are trained with the N positive samples to obtain a trained generator, which then generates 3N virtual samples at random.
3. The few-sample target identification method based on virtual sample generation according to claim 1, wherein the QR reconstruction weighted fusion method of step 3 specifically comprises:
(3-a) from the unitary matrix Q and upper triangular matrix R obtained by QR decomposition, direct reconstruction gives the complete information map
I(Q, R) = Q · R
Because R is upper triangular, the i-th column vector of Q contributes to the element values from the i-th column through the last column of the original image matrix without affecting the columns before the i-th, so the completeness of the information map's element values increases toward the left. For one image I_l of a group, QR decomposition yields the unitary matrix Q_l and upper triangular matrix R_l containing the image information; reconstruction with the matrices Q_l, R_l and a reconstruction coefficient w gives the left information map I:
I(Q_l, R_l, w) = Q_l[32 × (32·w)] · R_l[(32·w) × 32]
where Q_l[32 × (32·w)] denotes the first 32·w columns of Q_l and R_l[(32·w) × 32] the first 32·w rows of R_l. The 4 reconstruction coefficients 0.25, 0.5, 0.75 and 1.0 give 4 corresponding left information maps.
(3-b) The other image I_r of the group is first mirror-flipped to obtain its symmetric image; processing it as in (3-a) gives the unitary matrix Q_r and upper triangular matrix R_r and hence the left information map of the symmetric image, which is mirror-flipped again to give the 4 right information maps I_G:
I_G(Q_r, R_r, w) = G(Q_r[32 × (32·w)] · R_r[(32·w) × 32])
where G(·) mirror-flips a matrix.
(3-c) The left information map I and the right information map I_G under the same reconstruction coefficient are fused by weighting
[weighted-fusion formula, given only as image Figure FDA0002316705600000021 in the original]
giving 4 fused virtual images I_w. Finally, QR reconstruction weighted fusion over the 0.5N groups of images gives 2N virtual samples.
4. The few-sample target identification method based on virtual sample generation according to claim 1, wherein the fast DPM model of step 5 specifically comprises:
(5-a) according to the salient structure of the unmanned aerial vehicle (four wings and a fuselage), the number of component filters of the DPM model is set to 5;
(5-b) a two-layer HOG feature pyramid of the image is computed; the bottom-layer feature map is the input of the root filter and the top-layer feature map is the input of the component filters;
(5-c) the anchor point of the root filter is fixed at the image center (x_0, y_0), and the total response score of the image is
score(x_0, y_0) = R_0(x_0, y_0) + Σ_{i=1..5} D_i
D_i = max_{(x_i, y_i)} [ R_i(x_i, y_i) − d_i · φ(dx_i, dy_i) ]
where R_0 is the root filter score, R_i the score of component filter i, d_i the deformation weights, φ the deformation features, and
(dx_i, dy_i) = (x_i − x_0, y_i − y_0)
is the distance between the root filter anchor point and the detection point of the component.
Application CN201911280878.4A · Priority date: 2019-12-13 · Filing date: 2019-12-13 · Title: Few-sample unmanned aerial vehicle image identification method based on virtual sample generation · Active · Granted as CN111062310B (en)

Priority Applications (1)

Application Number: CN201911280878.4A (granted as CN111062310B) · Priority date: 2019-12-13 · Filing date: 2019-12-13 · Title: Few-sample unmanned aerial vehicle image identification method based on virtual sample generation

Applications Claiming Priority (1)

Application Number: CN201911280878.4A (granted as CN111062310B) · Priority date: 2019-12-13 · Filing date: 2019-12-13 · Title: Few-sample unmanned aerial vehicle image identification method based on virtual sample generation

Publications (2)

Publication Number Publication Date
CN111062310A 2020-04-24
CN111062310B CN111062310B (en) 2022-07-29

Family

ID=70300974

Family Applications (1)

Application Number: CN201911280878.4A (Active; granted as CN111062310B) · Priority date: 2019-12-13 · Filing date: 2019-12-13 · Title: Few-sample unmanned aerial vehicle image identification method based on virtual sample generation

Country Status (1)

Country Link
CN (1) CN111062310B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170365038A1 (en) * 2016-06-16 2017-12-21 Facebook, Inc. Producing Higher-Quality Samples Of Natural Images
CN106845515A (en) * 2016-12-06 2017-06-13 上海交通大学 Robot target identification and pose reconstructing method based on virtual sample deep learning
US20190012581A1 (en) * 2017-07-06 2019-01-10 Nokia Technologies Oy Method and an apparatus for evaluating generative machine learning model
US20190080148A1 (en) * 2017-09-08 2019-03-14 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for generating image
US20190130221A1 (en) * 2017-11-02 2019-05-02 Royal Bank Of Canada Method and device for generative adversarial network training
CN108495110A (en) * 2018-01-19 2018-09-04 天津大学 A kind of virtual visual point image generating method fighting network based on production
CN108681774A (en) * 2018-05-11 2018-10-19 电子科技大学 Based on the human body target tracking method for generating confrontation network negative sample enhancing
CN109145992A (en) * 2018-08-27 2019-01-04 西安电子科技大学 Cooperation generates confrontation network and sky composes united hyperspectral image classification method
CN110298450A (en) * 2019-05-21 2019-10-01 北京大学第三医院(北京大学第三临床医学院) A kind of virtual sample generation method based on production confrontation network
CN110348330A (en) * 2019-06-24 2019-10-18 电子科技大学 Human face posture virtual view generation method based on VAE-ACGAN
CN110378408A (en) * 2019-07-12 2019-10-25 台州宏创电力集团有限公司 Power equipment image-recognizing method and device based on transfer learning and neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
于旭 et al., "Research on Virtual Sample Generation Technology" (虚拟样本生成技术研究), 《计算机科学》 (Computer Science) *
杨懿男 et al., "Research on Small-Sample Data Generation Technology Based on Generative Adversarial Networks" (基于生成对抗网络的小样本数据生成技术研究), 《电力建设》 (Electric Power Construction) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488948A (en) * 2020-04-29 2020-08-04 中国科学院重庆绿色智能技术研究院 Method for marking sparse samples in jitter environment
CN111817794A (en) * 2020-05-29 2020-10-23 中南民族大学 Multi-domain cooperative unmanned aerial vehicle detection method and system based on deep learning
CN111817794B (en) * 2020-05-29 2021-04-13 中南民族大学 Multi-domain cooperative unmanned aerial vehicle detection method and system based on deep learning
CN112529114A (en) * 2021-01-13 2021-03-19 北京云真信科技有限公司 Target information identification method based on GAN, electronic device and medium

Also Published As

Publication number Publication date
CN111062310B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
US10719940B2 (en) Target tracking method and device oriented to airborne-based monitoring scenarios
CN111062310B (en) Few-sample unmanned aerial vehicle image identification method based on virtual sample generation
KR102362744B1 (en) Method for recognizing face using multiple patch combination based on deep neural network with fault tolerance and fluctuation robustness in extreme situation
CN108647655B (en) Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network
CN112084868B (en) Target counting method in remote sensing image based on attention mechanism
CN107883947B (en) Star sensor star map identification method based on convolutional neural network
CN112070729B (en) Anchor-free remote sensing image target detection method and system based on scene enhancement
CN110136162B (en) Unmanned aerial vehicle visual angle remote sensing target tracking method and device
CN106534616A (en) Video image stabilization method and system based on feature matching and motion compensation
CN111506759B (en) Image matching method and device based on depth features
CN108805149A (en) A kind of winding detection method and device of visual synchronization positioning and map structuring
CN108537181A (en) A kind of gait recognition method based on the study of big spacing depth measure
CN110968734A (en) Pedestrian re-identification method and device based on depth measurement learning
CN112163498A (en) Foreground guiding and texture focusing pedestrian re-identification model establishing method and application thereof
CN111582091A (en) Pedestrian identification method based on multi-branch convolutional neural network
CN113495575B (en) Unmanned aerial vehicle autonomous landing visual guidance method based on attention mechanism
CN108320310A Extraterrestrial target 3D pose estimation method based on image sequence
Shi et al. Remote sensing scene classification based on multibranch fusion attention network
CN109919215B (en) Target detection method for improving characteristic pyramid network based on clustering algorithm
CN111260687A (en) Aerial video target tracking method based on semantic perception network and related filtering
CN110472092B (en) Geographical positioning method and system of street view picture
CN113011308A (en) Pedestrian detection method introducing attention mechanism
Xu et al. Infrared image semantic segmentation based on improved deeplab and residual network
Xu et al. Adaptive remote sensing image attribute learning for active object detection
CN111027427B (en) Target gate detection method for small unmanned aerial vehicle racing match

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant