CN107862705B - Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics - Google Patents

Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics

Info

Publication number
CN107862705B
Authority
CN
China
Prior art keywords
candidate
target
unmanned aerial vehicle
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711166232.4A
Other languages
Chinese (zh)
Other versions
CN107862705A (en)
Inventor
高陈强
杜莲
王灿
冯琦
汤林
汪澜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201711166232.4A
Publication of CN107862705A
Application granted
Publication of CN107862705B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for detecting small unmanned aerial vehicle targets based on motion features and deep learning features, and belongs to the technical field of image processing and computer vision. First, the input video data set is processed by a video stabilization algorithm to compensate for camera motion, and motion candidate target regions are detected in the images. The video data set is divided into two parts, and the training set is used to train an improved candidate region generation network model based on depth features; the trained network generates candidate targets for the test-set video images. The candidate target regions are then fused. A dual-channel deep neural network model is trained on the training set and applied to obtain recognition results, and a target tracking method based on multi-layer depth features is applied to these recognition results to obtain the final position of the unmanned aerial vehicle. The invention can accurately detect unmanned aerial vehicles in video images and provides support for subsequent research in the field of intelligent unmanned aerial vehicle monitoring.

Description

Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics
Technical Field
The invention belongs to the technical field of image processing and computer vision, and relates to an unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics.
Background
At present, with the rapid increase in the availability and maturity of commercial drones, drone sales have multiplied, and drones flying in public areas are common. Drones appear not only in the cameras of variety shows and at romantic weddings; they also spray pesticides over farmland, replace workers in high-altitude cleaning operations, and are used for surveying and mapping photography, forest fire prevention, military reconnaissance, and the like. However, with the rapid development of drones, dangerous accidents caused by them are also increasing, bringing threats to public safety, privacy, military security, and so on.
In recent years, techniques for detecting unmanned aerial vehicles can be roughly classified into acoustic detection, radio frequency detection, radar detection, visual detection, and the like. Acoustic detection uses a microphone array to pick up the rotor noise of a flying drone and matches the detected noise against a database of recorded drone sounds, thereby identifying whether the noise belongs to a drone and judging whether a drone is approaching. This method is easily disturbed by environmental noise, and building a database of drone acoustic signatures is time-consuming. Radio frequency detection monitors radio frequencies within a certain range through a wireless receiver; this approach easily misidentifies an unknown radio frequency transmitter as a drone. Radar detection judges whether a drone is present by detecting and verifying electromagnetic waves scattered and reflected by the target; radar equipment is costly and energy-consuming, and is susceptible to environmental influences that create blind spots. Visual detection typically observes drones with one or more imaging devices and analyzes the image sequence to determine whether a drone is present. Vision-based drone detection is not easily disturbed by ambient noise, can locate the drone's position, can distinguish whether the drone carries dangerous goods, and can even obtain information such as the drone's flight trajectory and speed. Therefore, the visual detection method has great advantages over other means and can make up for their deficiencies.
At present, there is relatively little research on vision-based drone detection. Clearly, detecting a drone at a longer distance is more advantageous for avoiding the danger it poses in advance. Drones are smaller than targets such as pedestrians, aircraft, and vehicles; in long-range imaging in particular, a drone occupies only a very small part of the image, which makes vision-based drone detection even more difficult. Therefore, a detection algorithm that can effectively detect small drone targets in video is currently needed.
Disclosure of Invention
In view of the above, the present invention aims to provide a method for detecting small unmanned aerial vehicle targets based on motion features and deep learning features. The method uses a target tracking algorithm to track the drone and filter out false targets, and improves the convolutional neural network structure in view of the small size of the drone, so that the deep learning algorithm is suitable for small targets and can effectively detect drones in complex scenes, thereby improving the accuracy of drone detection.
In order to achieve the purpose, the invention provides the following technical scheme:
an unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics comprises the following steps:
s1: processing the input video data set through a video image stabilization algorithm to compensate the motion of a camera;
s2: detecting a motion candidate target area I from the video image after motion compensation by using a low-rank matrix analysis method, and removing tiny noise points in the motion candidate target area I by using an image post-processing module;
s3: dividing a data set of a video into a training set and a testing set, and training by using the training set to obtain an improved candidate region generation network model; processing the video image of the test set through the improved area generation network model to generate a candidate target area II;
s4: fusing the candidate target area I and the candidate target area II to obtain a candidate target area III;
s5: according to the candidate target area III, training by using a training set to obtain a dual-channel-based deep neural network, and then applying the dual-channel-based deep neural network to the candidate target of the test set to obtain a recognition result;
s6: and predicting the position of the target by using a correlation filtering algorithm, tracking and matching the stable target, filtering out false targets and obtaining the position of the unmanned aerial vehicle.
Further, in step S1, the video image stabilization algorithm includes:
s11: extracting characteristic points of each frame image by using an SURF algorithm;
s12: calculating an affine transformation model between two frames through the obtained feature matching points between the two frames of images;
s13: and compensating the current frame by using the obtained affine transformation model.
Further, in step S2, the process of detecting the motion candidate target region I by the low-rank matrix analysis method comprises the following steps:
S21: vectorizing the input video sequence images f_1, f_2, ..., f_n to form an image matrix

$$C = [\tilde{f}_1, \tilde{f}_2, \ldots, \tilde{f}_n]$$

where n is the number of video frames, f_n is the video image matrix of the n-th frame, and $\tilde{f}_n$ is the vectorized form of f_n;
S22: decomposing the matrix C into a low-rank matrix L and a sparse matrix S by the RPCA algorithm, where the low-rank matrix L represents the background and the sparse matrix S represents the candidate moving targets;
S23: performing noise filtering on the candidate moving targets by morphological opening and closing operations to remove fine noise points in the motion candidate regions.
Further, in step S3, the improved candidate area generation network model includes five convolutional layers and two fully-connected layers connected in sequence, wherein pooling layers are disposed between the first convolutional layer and the second convolutional layer, between the second convolutional layer and the third convolutional layer, and between the fifth layer and the first fully-connected layer;
step S3 specifically includes:
s31: dividing a data set of a video into a training set and a testing set;
s32: for the data of the training set, extracting manually marked positive samples in the image, and then randomly sampling a plurality of areas as negative samples;
s33: training by using positive and negative samples of a training set to obtain an improved candidate area generation network model;
s34: and processing the video image of the test set through the improved area generation network model to generate a candidate target area II.
Further, the width and height of the randomly sampled regions in step S32 are determined by the width and height of the positive samples, and the overlap between a negative sample and a positive sample satisfies

$$\mathrm{IoU}(r_g, r_n) = \frac{\left| r_g \cap r_n \right|}{\left| r_g \cup r_n \right|} < \tau$$

where IoU is the overlap ratio, $r_g$ is a positive sample region, $r_n$ is a randomly sampled negative sample region, and τ is the preset overlap threshold.
Further, the step S4 of obtaining the candidate target region III by fusion specifically includes:
S41: densely sampling the candidate target region I to obtain dense seed candidate regions;
S42: calculating the similarity between the dense seed candidate regions and the candidate target region II, and when the similarity satisfies Sim > μ with μ ∈ [0.6, 1), combining the two candidate regions, where Sim is the similarity between the dense seed candidate region and the candidate target region II;
S43: traversing all the candidate target regions I to obtain the final candidate target region III.
Further, in step S5, the dual-channel-based deep neural network includes a front-end module and a back-end module;
the front-end module consists of two parallel deep neural network models: one takes the candidate target region directly as input and passes it through a 6-layer convolutional neural network and 1 fully-connected layer; the other takes as input an expanded region of the original image centered on the candidate target region, and passes it through a 6-layer convolutional neural network and 1 fully-connected layer;
the back-end module takes the outputs of the two fully-connected layers obtained by the front-end module as input and, through 2 fully-connected layers and 1 softmax layer, obtains the classification information of each candidate region as the final classification result;
step S5 specifically includes:
s51: for the training data set, dividing the training data set of the candidate target area III obtained in the step S4 into positive and negative samples, and inputting the positive and negative samples into a two-channel-based deep neural network for training to obtain optimal weight;
s52: and applying the optimal weight to the candidate target areas of the test set obtained in the step S4 for classification, so as to obtain a final recognition result.
Further, step S6 specifically includes:
S61: given the known center position $(x_{t-1}, y_{t-1})$ of the target in the frame preceding the current frame t, for the improved candidate region generation network model obtained by training in step S5, sparsifying the convolution feature map arrays obtained from its last three convolutional layers, and then extracting the depth features of the target with the sparsified feature maps;
S62: constructing a correlation filter for the output features of each of the last three convolutional layers of the improved candidate region generation network model, convolving the features of each layer with the corresponding correlation filter from back to front, and calculating the corresponding confidence score f to obtain the new center position $(x_t, y_t)$ of the candidate target in the current frame;
S63: extracting depth features around the new center position, and updating the parameters of the correlation filter;
s64: and in consideration of the stability and continuity of the target motion of the unmanned aerial vehicle, filtering the candidate target area track with the tracking frame number less than the threshold value, and finally obtaining the tracking target as the detection result of the unmanned aerial vehicle.
Further, the step of constructing the correlation filter is:
S621: letting the size of the output feature be M × N × D and the depth feature be x, the objective function of the correlation filter is constructed as

$$w^{*} = \arg\min_{w} \sum_{m,n} \left\| w \cdot x_{m,n} - y(m,n) \right\|^{2} + \lambda \left\| w \right\|_{2}^{2}$$

where $w^{*}$ is the optimal solution of the objective function, w is the correlation filter, $x_{m,n}$ is the feature at pixel (m, n), λ is the regularization parameter (λ ≥ 0), and y(m, n) denotes the label of the pixel at (m, n);
y(m, n) obeys a two-dimensional Gaussian distribution:

$$y(m,n) = e^{-\frac{(m - M/2)^{2} + (n - N/2)^{2}}{2\sigma^{2}}}$$

where σ is the width of the Gaussian kernel;
S622: converting the objective function into the frequency domain by fast Fourier transform, the optimal solution of the objective function is obtained as

$$W^{d} = \frac{Y \odot \bar{X}^{d}}{\sum_{i=1}^{D} X^{i} \odot \bar{X}^{i} + \lambda}$$

where Y is the Fourier transform of y, ⊙ denotes the Hadamard product, $W^{d}$ is the optimal solution for the d-th channel, $X^{i}$ is the Fourier transform of the depth feature x on the i-th channel, the bar denotes complex conjugation, and d ∈ {1, 2, …, D};
S623: given a candidate target region of the next frame image, for the depth feature z of the candidate region, the response map corresponding to the correlation filter is

$$f(z) = \mathcal{F}^{-1}\left( \sum_{d=1}^{D} W^{d} \odot Z^{d} \right)$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform and $Z^{d}$ is the Fourier transform of the depth feature z on the d-th channel.
Further, the parameters of the correlation filter in step S63 are updated according to

$$P_{t}^{d} = (1-\eta)\, P_{t-1}^{d} + \eta\, Y \odot \bar{X}_{t}^{d}$$

$$Q_{t} = (1-\eta)\, Q_{t-1} + \eta \sum_{i=1}^{D} X_{t}^{i} \odot \bar{X}_{t}^{i}$$

$$W_{t}^{d} = \frac{P_{t}^{d}}{Q_{t} + \lambda}$$

where $P_t$ and $Q_t$ are intermediate variables (the numerator and denominator of the filter), $W_t$ is the correlation filter for the t-th frame after updating, t is the video frame index, and η is the learning rate.
The invention has the beneficial effects that:
1) the invention provides a method for detecting an unmanned aerial vehicle based on the motion characteristics and deep learning characteristics of the unmanned aerial vehicle. The method can effectively detect the target under the conditions that the background is complex and the unmanned aerial vehicle is small.
2) The method improves the traditional deep neural network structure, and effectively solves the problem that the existing target detection algorithm based on the deep neural network is not suitable for small targets.
3) The method provides an online tracking algorithm based on multi-layer depth features and correlation filters, which can better track and predict the trajectory of the unmanned aerial vehicle and filter out false targets.
Drawings
In order to make the object, technical scheme and beneficial effect of the invention more clear, the invention provides the following drawings for explanation:
FIG. 1 is a schematic diagram of a method for detecting a small target of an unmanned aerial vehicle based on motion characteristics and deep learning characteristics according to the invention;
FIG. 2 is a schematic diagram of a video image stabilization algorithm;
FIG. 3 is a schematic diagram of a convolutional neural network structure;
FIG. 4 is a schematic diagram of candidate object generation using an improved area generation network;
FIG. 5 is a schematic diagram of a dual channel-based deep neural network;
FIG. 6 is a schematic diagram of an online tracking algorithm based on depth features.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
In the invention, a candidate target detection module based on motion features performs video stabilization on the original video and then extracts moving target regions through low-rank matrix analysis;
a candidate target detection module based on depth features extracts candidate targets from the video images through an improved candidate region generation network model;
the improved candidate region generation network modifies the network structure and the candidate region scales on the basis of the traditional region generation network, and replaces the network layer whose feature map is output;
the candidate region fusion module fuses the candidate regions obtained in steps S2 and S3;
the candidate target recognition module based on the dual-channel deep neural network improves the traditional deep neural network model according to the characteristics of small targets and classifies the candidate regions to obtain the final recognition result;
the online tracking algorithm based on depth features improves the traditional tracking algorithm based on hand-crafted features by using target features extracted by the convolutional neural network, which makes it more robust.
Fig. 1 is a schematic diagram of a method for detecting a small target of an unmanned aerial vehicle based on motion characteristics and deep learning characteristics, and as shown in the figure, the method specifically includes the following steps:
step S1: firstly, processing a data set of an input original video through a video image stabilization algorithm to compensate camera motion, wherein a specific flow chart is shown in fig. 2:
s101: and extracting key points from the image by using an SURF algorithm, and constructing a SURF feature point descriptor.
S102: and calculating Euclidean distances of the corresponding feature points of the two frames of images, then selecting the minimum distance, setting a threshold value, keeping the matching points when the distance of the corresponding feature points is smaller than the threshold value, and otherwise, removing the matching points.
S103: the two frames of images are subjected to bidirectional matching, and by repeating the step S12, when the matched feature point pair coincides with the result obtained in the step S12, the final feature matching point is obtained.
S104: and setting the camera motion as an affine transformation model, and calculating by using a least square method according to the feature matching points obtained in the step to obtain the affine transformation model between the two frames of images.
S105: and according to the obtained affine transformation model, registering the current frame and the set reference frame to obtain the compensated current frame, storing the compensated current frame into a new video, and finally obtaining the stable video.
S106: and calculating the offset degree between the compensated current frame and the reference frame, if the offset degree is greater than a threshold value, updating the current frame to be the reference frame, and otherwise, continuously reading the next frame.
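The following OpenCV sketch mirrors steps S101 to S105 for one frame pair. SURF requires the opencv-contrib (non-free) build, so an ORB fallback is included; the distance threshold and the use of RANSAC in place of a plain least-squares fit are illustrative assumptions, not the patent's exact choices.

```python
import cv2
import numpy as np

def compensate_frame(ref_gray, cur_gray):
    """Register the current frame to the reference frame with feature matching
    and an affine camera-motion model (steps S101 to S105)."""
    try:
        det = cv2.xfeatures2d.SURF_create(400)       # needs opencv-contrib / non-free
    except AttributeError:
        det = cv2.ORB_create(1000)                   # fallback feature detector
    kp_ref, des_ref = det.detectAndCompute(ref_gray, None)
    kp_cur, des_cur = det.detectAndCompute(cur_gray, None)
    norm = cv2.NORM_L2 if des_ref.dtype == np.float32 else cv2.NORM_HAMMING
    # crossCheck=True keeps only mutually consistent matches (bidirectional matching, S103)
    matches = cv2.BFMatcher(norm, crossCheck=True).match(des_ref, des_cur)
    # keep matches whose distance is close to the minimum distance (S102)
    d_min = min(m.distance for m in matches)
    good = [m for m in matches if m.distance <= max(2.0 * d_min, 1e-6)]
    src = np.float32([kp_cur[m.trainIdx].pt for m in good])   # points in current frame
    dst = np.float32([kp_ref[m.queryIdx].pt for m in good])   # matching points in reference
    # affine transformation model between the two frames (S104)
    A, _ = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)
    h, w = ref_gray.shape
    return cv2.warpAffine(cur_gray, A, (w, h))                # compensated frame (S105)
```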
Step S2: and detecting a motion candidate target region of the compensated video by a low-rank matrix analysis method, and removing fine noise points in the motion candidate region by an image post-processing module.
Step S3: dividing a data set into a training set and a testing set, and training by using the training data set to obtain an improved candidate area generation network model; candidate targets are generated for the video images of the test set by the trained improved region generation network. The improved area generation network model structure is shown in fig. 4:
in step S3, training the improved candidate area generation network model using the positive and negative samples of the data set; generating candidate targets for the video images of the test set through the trained improved area generation network, specifically comprising the following processes:
firstly, aiming at the characteristics of an unmanned aerial vehicle, improving the traditional candidate area generation network structure to obtain an improved candidate area generation network, wherein the improved candidate area generation network modifies the network structure and the scale size of characteristic extraction, and replaces the network layer of an output characteristic diagram; then training by utilizing a training data set to obtain an improved candidate region to generate the optimal weight of the network model; and finally, applying the optimal weight to the test data set to obtain a candidate target rectangular frame.
The improved area generation Network is mainly additionally provided with two full convolution layers on the basis of a Convolutional Neural Network (CNN), wherein one full convolution layer is an area classification layer and used for judging whether a candidate area is a foreground target or a background, and the other full convolution layer is an area frame regression layer and used for predicting the position coordinates of the candidate area. The convolutional neural network is composed of five convolutional layers, three pooling layers and two fully-connected layers, as shown in fig. 3. The traditional area generation network usually processes the feature map generated by the last convolutional layer, but the small target often depends more on the shallow feature because the shallow feature has higher resolution, so the method changes the shallow feature into the fourth convolutional layer. And the area generation network slides on the feature map output by the fourth convolution layer through a sliding network, the sliding network is fully connected with 9 windows with different scales on the feature map each time, then the low-dimensional vector is mapped, and finally the low-dimensional vector is sent to two fully-connected layers to obtain the category and the position of the candidate target. Compared with the traditional area generation network, the method has the advantage that the size of 9 scales is reduced compared with the original size, so that the method is more beneficial to the detection of small targets.
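As a sketch of the structure just described, the PyTorch module below implements a proposal head that slides on the conv4 feature map with two fully-convolutional branches; the channel widths and the three small anchor scales combined with three aspect ratios are illustrative assumptions rather than the patent's exact values.

```python
import torch
import torch.nn as nn

class SmallTargetRPNHead(nn.Module):
    """Proposal head sliding on the conv4 feature map: a 3x3 'sliding network'
    followed by a classification branch (foreground/background per anchor) and
    a box-regression branch (4 offsets per anchor)."""
    def __init__(self, in_channels=256, num_anchors=9):
        super().__init__()
        self.slide = nn.Conv2d(in_channels, 256, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.cls = nn.Conv2d(256, num_anchors * 2, kernel_size=1)   # fg/bg scores
        self.reg = nn.Conv2d(256, num_anchors * 4, kernel_size=1)   # (dx, dy, dw, dh)

    def forward(self, conv4_feat):
        t = self.relu(self.slide(conv4_feat))
        return self.cls(t), self.reg(t)

# 9 anchors = 3 small scales x 3 aspect ratios, in pixels on the input image;
# the scales are deliberately smaller than the usual (128, 256, 512) defaults.
ANCHOR_SCALES = (16, 32, 64)
ANCHOR_RATIOS = (0.5, 1.0, 2.0)
```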
Step S3 specifically includes:
s31: dividing a data set of a video into a training set and a testing set;
s32: for the data of the training set, extracting manually marked positive samples in the image, and then randomly sampling a plurality of areas as negative samples;
s33: training by using positive and negative samples of a training set to obtain an improved candidate area generation network model;
s34: and processing the video image of the test set through the improved area generation network model to generate a candidate target area II.
In step S32, negative samples are drawn from the image: the width and height of each sampled region are drawn from the range between the minimum and maximum width and height of the positive samples, and the overlap ratio between a negative sample region and the positive samples must not exceed a preset threshold, i.e.

$$\mathrm{IoU}(r_g, r_n) = \frac{\left| r_g \cap r_n \right|}{\left| r_g \cup r_n \right|} < \tau$$

where IoU is the overlap ratio, $r_g$ is a positive sample region, $r_n$ is a randomly sampled negative sample region, and τ is the preset overlap threshold.
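A minimal sketch of this negative sampling rule follows; the overlap threshold max_iou=0.3 and the sample count are assumed example values, since the patent does not state the exact threshold in the text.

```python
import random

def iou(a, b):
    """Overlap ratio IoU between two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def sample_negatives(positives, img_w, img_h, n_samples=64, max_iou=0.3):
    """Randomly sample negative boxes whose width/height lie in the range of
    the positive boxes and whose overlap with every positive stays below max_iou."""
    widths = [p[2] - p[0] for p in positives]
    heights = [p[3] - p[1] for p in positives]
    negatives, attempts = [], 0
    while len(negatives) < n_samples and attempts < 100 * n_samples:
        attempts += 1
        w = random.randint(min(widths), max(widths))
        h = random.randint(min(heights), max(heights))
        x1 = random.randint(0, max(img_w - w, 0))
        y1 = random.randint(0, max(img_h - h, 0))
        box = (x1, y1, x1 + w, y1 + h)
        if all(iou(box, p) < max_iou for p in positives):
            negatives.append(box)
    return negatives
```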
Step S4: and performing dense sampling on the candidate target area obtained in the step S2 to obtain a denser candidate target frame, and then obtaining a final candidate target by fusing the dense candidate target frame with the candidate target obtained in the step S3.
The specific fusion mode comprises the following steps:
S41: taking the motion candidate regions obtained in step S2 as seed candidate regions, and performing further dense sampling on them to obtain dense seed candidate regions;
S42: calculating the similarity between each seed candidate region and the candidate regions obtained in step S3; when the similarity is larger than μ (μ ∈ [0.6, 1)), the two candidate regions are combined, and all seed candidate regions are traversed to obtain the final candidate regions. The similarity Sim between a region A and a region B is calculated as:
$$\mathrm{Sim}(A, B) = \frac{\left| A \cap B \right|}{\left| A \cup B \right|}$$
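One possible reading of this fusion step in code form is sketched below; it reuses the iou() helper from the sampling sketch above as the similarity Sim, and the jitter-based dense sampling and the choice to merge regions into their enclosing bounding box are assumptions, since the patent does not spell out these details.

```python
def fuse_candidates(motion_boxes, rpn_boxes, mu=0.6, jitter=4):
    """Fuse motion candidates (step S2) with network candidates (step S3):
    each motion box is densely jittered into seed boxes, and whenever a seed
    overlaps an RPN box with similarity Sim > mu the two regions are merged."""
    fused = []
    for mb in motion_boxes:
        seeds = [(mb[0] + dx, mb[1] + dy, mb[2] + dx, mb[3] + dy)
                 for dx in range(-jitter, jitter + 1, 2)
                 for dy in range(-jitter, jitter + 1, 2)]
        for rb in rpn_boxes:
            if any(iou(s, rb) > mu for s in seeds):      # Sim taken as the overlap ratio
                # merge: take the bounding box enclosing both regions
                fused.append((min(mb[0], rb[0]), min(mb[1], rb[1]),
                              max(mb[2], rb[2]), max(mb[3], rb[3])))
    return fused
```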
step S5: according to the deep neural network model based on the double channels and aiming at the small target detection, the network model is obtained by training through a training data set, and then the network model is applied to candidate targets of a test set to obtain a recognition result. The structure of the deep neural network model based on the dual channels is shown in fig. 5:
the deep neural network model based on the double channels mainly comprises two parts, namely a front-end module and a rear-end module. The front-end module consists of two parallel deep neural network models, one of which takes a candidate target area as input directly and obtains 4096-dimensional characteristics through 6 convolution layers and 1 full-connection layer; and the other one takes an extended area of the target area which is 4 times that of the candidate target area as the center on the original drawing as input, and obtains 4096-dimensional characteristics through 6 convolutional layers and 1 full-connected layer. The back end module is used for inputting two 4096 characteristics obtained by the front end module in a string mode, and classification information of each candidate area is obtained through 2 full connection layers and 1 softmax layer to serve as a final classification result.
In step S5, the dual-channel deep neural network for small target detection proposed by the method is trained with the training data set, and the trained network model is then applied to the candidate targets of the test set to obtain the recognition results, which specifically includes:
S51: for the training data set, dividing the candidate target regions obtained in step S4 into positive and negative samples, and inputting them into the dual-channel deep neural network for training to obtain the optimal weights.
S52: and applying the optimal weight to the candidate target area of the test data set obtained in the step S4 for classification, so as to obtain a final recognition result.
Step S6: the depth-feature-based target tracking method proposed by the method is applied to the recognition results of step S5; the position of the target is predicted with a correlation filtering algorithm and stably matched targets are tracked, so that false targets are filtered out and the final position of the unmanned aerial vehicle is obtained. The specific flow of the depth-feature-based target tracking algorithm is shown in fig. 6:
s601: inputting a candidate target area of a previous frame of the current frame, firstly thinning a convolution feature map array obtained by the last three layers of convolution layers of the model by using the neural network model obtained by training in the step S5, and then extracting the depth feature of the target by using the thinned feature map;
s602: constructing corresponding correlation filters for the output characteristics of each convolution layer, performing convolution on the characteristics of each layer and the corresponding correlation filters from back to front, and calculating corresponding confidence scores to obtain the new position of the candidate target in the current frame;
s603: depth features are extracted around the new center position of the candidate object to update the parameters of the correlation filter.
S604: and in consideration of the stability and continuity of the target motion of the unmanned aerial vehicle, filtering the candidate target area track with the tracking frame number less than the threshold value, and finally obtaining the tracking target as the detection result of the unmanned aerial vehicle.
The threshold mentioned in step S604 has a value in the range of 5 to 20.
In step S6, constructing a corresponding correlation filter for the M × N × D output features specifically includes:
Firstly, let the depth feature of size M × N × D be x; the objective function of the corresponding correlation filter is constructed as

$$w^{*} = \arg\min_{w} \sum_{m,n} \left\| w \cdot x_{m,n} - y(m,n) \right\|^{2} + \lambda \left\| w \right\|_{2}^{2}$$

where λ (λ ≥ 0) is the regularization parameter and y(m, n) represents the label of the pixel at (m, n), which follows a two-dimensional Gaussian distribution:

$$y(m,n) = e^{-\frac{(m - M/2)^{2} + (n - N/2)^{2}}{2\sigma^{2}}}$$

Then, the objective function is converted into the frequency domain by fast Fourier transform, and the optimal solution of the objective function can be derived as

$$W^{d} = \frac{Y \odot \bar{X}^{d}}{\sum_{i=1}^{D} X^{i} \odot \bar{X}^{i} + \lambda}$$

where Y is the Fourier transform of y, ⊙ denotes the Hadamard product, $X^{i}$ is the Fourier transform of the depth feature x on the i-th channel, and the bar denotes complex conjugation;
Finally, given a candidate target region of the next frame image, for the depth feature z of the candidate region the response map corresponding to the correlation filter is

$$f(z) = \mathcal{F}^{-1}\left( \sum_{d=1}^{D} W^{d} \odot Z^{d} \right)$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform and $Z^{d}$ is the Fourier transform of z on the d-th channel.
Further, in step S6, the update strategy for the correlation filter parameters $W^{d}$ specifically is

$$P_{t}^{d} = (1-\eta)\, P_{t-1}^{d} + \eta\, Y \odot \bar{X}_{t}^{d}$$

$$Q_{t} = (1-\eta)\, Q_{t-1} + \eta \sum_{i=1}^{D} X_{t}^{i} \odot \bar{X}_{t}^{i}$$

$$W_{t}^{d} = \frac{P_{t}^{d}}{Q_{t} + \lambda}$$

where t is the video frame number and η is the learning rate.
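The closed-form learning, response and update equations above translate directly into NumPy. In the sketch below the feature size, σ, learning rate and regularization value are example settings, and the Gaussian label is kept centered so that the peak of the response map gives the target displacement from the window center.

```python
import numpy as np

def train_filter(x, Y, lam=1e-4):
    """Closed-form per-channel filters W^d for an M x N x D feature x,
    with Y the 2-D FFT of the Gaussian label y."""
    X = np.fft.fft2(x, axes=(0, 1))
    denom = np.sum(X * np.conj(X), axis=2).real + lam      # shared denominator
    return Y[..., None] * np.conj(X) / denom[..., None]

def response(W, z):
    """Response map f(z) for the depth feature z of a candidate region."""
    Z = np.fft.fft2(z, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(W * Z, axis=2)))

def update_filter(P, Q, x_new, Y, eta=0.01, lam=1e-4):
    """Running update of the filter numerator P and denominator Q with learning rate eta."""
    X = np.fft.fft2(x_new, axes=(0, 1))
    P = (1 - eta) * P + eta * Y[..., None] * np.conj(X)
    Q = (1 - eta) * Q + eta * np.sum(X * np.conj(X), axis=2).real
    return P, Q, P / (Q[..., None] + lam)

# Gaussian label y(m, n) centered in an M x N window, sigma = kernel width
M, N, D, sigma = 32, 32, 16, 2.0
gm, gn = np.meshgrid(np.arange(M), np.arange(N), indexing='ij')
y = np.exp(-((gm - M / 2) ** 2 + (gn - N / 2) ** 2) / (2 * sigma ** 2))
Y = np.fft.fft2(y)

x = np.random.rand(M, N, D)            # stand-in for a sparsified depth feature
W = train_filter(x, Y)
score = response(W, x)                 # argmax of score -> new target center offset
```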
Finally, it is noted that the above-mentioned preferred embodiments illustrate rather than limit the invention, and that, although the invention has been described in detail with reference to the above-mentioned preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention as defined by the appended claims.

Claims (6)

1. A small target detection method of an unmanned aerial vehicle based on motion characteristics and deep learning characteristics is characterized in that: the method comprises the following steps:
s1: processing the input video data set through a video image stabilization algorithm to compensate the motion of a camera;
s2: detecting a motion candidate target area I from the video image after motion compensation by using a low-rank matrix analysis method, and removing tiny noise points in the motion candidate target area I by using an image post-processing module;
s3: dividing a data set of a video into a training set and a testing set, and training by using the training set to obtain an improved candidate region generation network model; processing the video image of the test set through the improved area generation network model to generate a candidate target area II;
s4: fusing the candidate target area I and the candidate target area II to obtain a candidate target area III;
s5: according to the candidate target area III, training by using a training set to obtain a dual-channel-based deep neural network, and then applying the dual-channel-based deep neural network to the candidate target of the test set to obtain a recognition result;
s6: predicting the position of a target by using a relevant filtering algorithm, tracking and matching the stable target, and filtering out false targets to obtain the position of the unmanned aerial vehicle;
in step S3, the improved candidate area generation network model includes five convolutional layers and two fully-connected layers connected in sequence, wherein pooling layers are disposed between the first convolutional layer and the second convolutional layer, between the second convolutional layer and the third convolutional layer, and between the fifth layer and the first fully-connected layer;
step S3 specifically includes:
s31: dividing a data set of a video into a training set and a testing set;
s32: for the data of the training set, extracting manually marked positive samples in the image, and then randomly sampling a plurality of areas as negative samples;
s33: training by using positive and negative samples of a training set to obtain an improved candidate area generation network model;
s34: processing the video image of the test set through the improved area generation network model to generate a candidate target area II;
the width and height of the randomly sampled regions in step S32 are determined by the width and height of the positive samples, and the overlap between a negative sample and a positive sample satisfies

$$\mathrm{IoU}(r_g, r_n) = \frac{\left| r_g \cap r_n \right|}{\left| r_g \cup r_n \right|} < \tau$$

where IoU is the overlap ratio, $r_g$ is a positive sample region, $r_n$ is a randomly sampled negative sample region, and τ is the preset overlap threshold;
the step S4 of obtaining the candidate target region III by fusion specifically includes:
S41: carrying out dense sampling on the candidate target region I to obtain dense seed candidate regions;
S42: calculating the similarity between the dense seed candidate regions and the candidate target region II, and when the similarity satisfies Sim > μ with μ ∈ [0.6, 1), combining the two candidate regions, where Sim is the similarity between the dense seed candidate region and the candidate target region II;
s43: traversing all the candidate target areas I to obtain a final candidate target area III;
in step S5, the dual-channel-based deep neural network includes a front-end module and a back-end module;
the front-end module consists of two parallel deep neural network models, wherein one of the models takes a candidate target area as input directly and passes through a 6-layer convolutional neural network and 1 full-connection layer; the other one takes the candidate target area as the center, establishes an expansion area on the original image target area as input, and passes through a 6-layer convolutional neural network and 1 full-connection layer;
the rear-end module takes the output of the two full-connection layers obtained by the front-end module as input, and obtains the classification information of each candidate area as a final classification result through 2 full-connection layers and 1 softmax layer;
step S5 specifically includes:
s51: for the training data set, dividing the training data set of the candidate target area III obtained in the step S4 into positive and negative samples, and inputting the positive and negative samples into a two-channel-based deep neural network for training to obtain optimal weight;
s52: and applying the optimal weight to the candidate target areas of the test set obtained in the step S4 for classification, so as to obtain a final recognition result.
2. The unmanned aerial vehicle small target detection method based on the motion characteristic and the deep learning characteristic as claimed in claim 1, characterized in that: in step S1, the video image stabilization algorithm includes:
s11: extracting characteristic points of each frame image by using an SURF algorithm;
s12: calculating an affine transformation model between two frames through the obtained feature matching points between the two frames of images;
s13: and compensating the current frame by using the obtained affine transformation model.
3. The unmanned aerial vehicle small target detection method based on the motion characteristic and the deep learning characteristic as claimed in claim 1, characterized in that: in step S2, the process of detecting the motion candidate target region I by using the low-rank matrix analysis method includes the following steps:
S21: vectorizing the input video sequence images f_1, f_2, ..., f_n to form an image matrix

$$C = [\tilde{f}_1, \tilde{f}_2, \ldots, \tilde{f}_n]$$

where n is the number of video frames, f_n is the video image matrix of the n-th frame, and $\tilde{f}_n$ is the vectorized form of f_n;
s22: decomposing the matrix C into a low-rank matrix L and a sparse matrix S through an RPCA algorithm, wherein the obtained low-rank matrix L represents a target background, and the sparse matrix S represents the obtained candidate moving target;
s23: and carrying out noise filtering processing on the obtained candidate moving target by utilizing morphological opening and closing operation, and filtering fine noise points in a moving candidate area.
4. The unmanned aerial vehicle small target detection method based on the motion characteristic and the deep learning characteristic as claimed in claim 1, characterized in that: step S6 specifically includes:
S61: given the known center position $(x_{t-1}, y_{t-1})$ of the target in the frame preceding the current frame t, for the improved candidate region generation network model obtained by training in step S5, sparsifying the convolution feature map arrays obtained from its last three convolutional layers, and then extracting the depth features of the target with the sparsified feature maps;
S62: constructing a correlation filter for the output features of each of the last three convolutional layers of the improved candidate region generation network model, convolving the features of each layer with the corresponding correlation filter from back to front, and calculating the corresponding confidence score f to obtain the new center position $(x_t, y_t)$ of the candidate target in the current frame;
S63: extracting depth features around the new center position, and updating the parameters of the correlation filter;
s64: and in consideration of the stability and continuity of the target motion of the unmanned aerial vehicle, filtering the candidate target area track with the tracking frame number less than the threshold value, and finally obtaining the tracking target as the detection result of the unmanned aerial vehicle.
5. The unmanned aerial vehicle small target detection method based on the motion characteristic and the deep learning characteristic as claimed in claim 4, wherein: the steps of constructing the correlation filter are:
S621: letting the size of the output feature be M × N × D and the depth feature be x, the objective function of the correlation filter is constructed as

$$w^{*} = \arg\min_{w} \sum_{m,n} \left\| w \cdot x_{m,n} - y(m,n) \right\|^{2} + \lambda \left\| w \right\|_{2}^{2}$$

where $w^{*}$ is the optimal solution of the objective function, w is the correlation filter, $x_{m,n}$ is the feature at pixel (m, n), λ is the regularization parameter (λ ≥ 0), and y(m, n) denotes the label of the pixel at (m, n);
y(m, n) obeys a two-dimensional Gaussian distribution:

$$y(m,n) = e^{-\frac{(m - M/2)^{2} + (n - N/2)^{2}}{2\sigma^{2}}}$$

where σ is the width of the Gaussian kernel;
S622: converting the objective function into the frequency domain by fast Fourier transform, the optimal solution of the objective function is obtained as

$$W^{d} = \frac{Y \odot \bar{X}^{d}}{\sum_{i=1}^{D} X^{i} \odot \bar{X}^{i} + \lambda}$$

where Y is the Fourier transform of y, ⊙ denotes the Hadamard product, $W^{d}$ is the optimal solution for the d-th channel, $X^{i}$ is the Fourier transform of the depth feature x on the i-th channel, the bar denotes complex conjugation, and d ∈ {1, 2, …, D};
S623: given a candidate target region of the next frame image, for the depth feature z of the candidate region, the response map corresponding to the correlation filter is

$$f(z) = \mathcal{F}^{-1}\left( \sum_{d=1}^{D} W^{d} \odot Z^{d} \right)$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform and $Z^{d}$ is the Fourier transform of the depth feature z on the d-th channel.
6. The unmanned aerial vehicle small target detection method based on the motion characteristic and the deep learning characteristic as claimed in claim 5, wherein: the parameters for updating the correlation filter in step S63 satisfy:
$$P_{t}^{d} = (1-\eta)\, P_{t-1}^{d} + \eta\, Y \odot \bar{X}_{t}^{d}$$

$$Q_{t} = (1-\eta)\, Q_{t-1} + \eta \sum_{i=1}^{D} X_{t}^{i} \odot \bar{X}_{t}^{i}$$

$$W_{t}^{d} = \frac{P_{t}^{d}}{Q_{t} + \lambda}$$

where $P_t$ and $Q_t$ are intermediate variables (the numerator and denominator of the filter), $W_t$ is the correlation filter for the t-th frame after updating, t is the video frame number, and η is the learning rate.
CN201711166232.4A 2017-11-21 2017-11-21 Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics Active CN107862705B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711166232.4A CN107862705B (en) 2017-11-21 2017-11-21 Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711166232.4A CN107862705B (en) 2017-11-21 2017-11-21 Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics

Publications (2)

Publication Number Publication Date
CN107862705A CN107862705A (en) 2018-03-30
CN107862705B true CN107862705B (en) 2021-03-30

Family

ID=61702397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711166232.4A Active CN107862705B (en) 2017-11-21 2017-11-21 Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics

Country Status (1)

Country Link
CN (1) CN107862705B (en)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549876A (en) * 2018-04-20 2018-09-18 重庆邮电大学 The sitting posture detecting method estimated based on target detection and human body attitude
CN110706193A (en) * 2018-06-21 2020-01-17 北京京东尚科信息技术有限公司 Image processing method and device
CN110633597B (en) * 2018-06-21 2022-09-30 北京京东尚科信息技术有限公司 Drivable region detection method and device
CN108846522B (en) * 2018-07-11 2022-02-11 重庆邮电大学 Unmanned aerial vehicle system combined charging station deployment and routing method
CN109255286B (en) * 2018-07-21 2021-08-24 哈尔滨工业大学 Unmanned aerial vehicle optical rapid detection and identification method based on deep learning network framework
CN108960190B (en) * 2018-07-23 2021-11-30 西安电子科技大学 SAR video target detection method based on FCN image sequence model
CN109272530B (en) 2018-08-08 2020-07-21 北京航空航天大学 Target tracking method and device for space-based monitoring scene
CN109325407B (en) * 2018-08-14 2020-10-09 西安电子科技大学 Optical remote sensing video target detection method based on F-SSD network filtering
CN109145906B (en) * 2018-08-31 2020-04-24 北京字节跳动网络技术有限公司 Target object image determination method, device, equipment and storage medium
CN109325967B (en) * 2018-09-14 2023-04-07 腾讯科技(深圳)有限公司 Target tracking method, device, medium, and apparatus
CN109472191B (en) * 2018-09-17 2020-08-11 西安电子科技大学 Pedestrian re-identification and tracking method based on space-time context
CN109359545B (en) * 2018-09-19 2020-07-21 北京航空航天大学 Cooperative monitoring method and device under complex low-altitude environment
CN109325490B (en) * 2018-09-30 2021-04-27 西安电子科技大学 Terahertz image target identification method based on deep learning and RPCA
CN111127509B (en) * 2018-10-31 2023-09-01 杭州海康威视数字技术股份有限公司 Target tracking method, apparatus and computer readable storage medium
CN109410149B (en) * 2018-11-08 2019-12-31 安徽理工大学 CNN denoising method based on parallel feature extraction
CN109708659B (en) * 2018-12-25 2021-02-09 四川九洲空管科技有限责任公司 Distributed intelligent photoelectric low-altitude protection system
CN109801317A (en) * 2018-12-29 2019-05-24 天津大学 The image matching method of feature extraction is carried out based on convolutional neural networks
CN109918988A (en) * 2018-12-30 2019-06-21 中国科学院软件研究所 A kind of transplantable unmanned plane detection system of combination imaging emulation technology
CN109859241B (en) * 2019-01-09 2020-09-18 厦门大学 Adaptive feature selection and time consistency robust correlation filtering visual tracking method
CN110287955B (en) * 2019-06-05 2021-06-22 北京字节跳动网络技术有限公司 Target area determination model training method, device and computer readable storage medium
CN110262529B (en) * 2019-06-13 2022-06-03 桂林电子科技大学 Unmanned aerial vehicle monitoring method and system based on convolutional neural network
CN110414375B (en) * 2019-07-08 2020-07-17 北京国卫星通科技有限公司 Low-altitude target identification method and device, storage medium and electronic equipment
CN110706252B (en) * 2019-09-09 2020-10-23 西安理工大学 Robot nuclear correlation filtering tracking algorithm under guidance of motion model
CN110631588B (en) * 2019-09-23 2022-11-18 电子科技大学 Unmanned aerial vehicle visual navigation positioning method based on RBF network
CN111006669B (en) * 2019-12-12 2022-08-02 重庆邮电大学 Unmanned aerial vehicle system task cooperation and path planning method
CN111247526B (en) * 2020-01-02 2023-05-02 香港应用科技研究院有限公司 Method and system for tracking position and direction of target object moving on two-dimensional plane
CN111242974B (en) * 2020-01-07 2023-04-11 重庆邮电大学 Vehicle real-time tracking method based on twin network and back propagation
CN111508002B (en) * 2020-04-20 2020-12-25 北京理工大学 Small-sized low-flying target visual detection tracking system and method thereof
CN111781599B (en) * 2020-07-16 2021-08-13 哈尔滨工业大学 SAR moving ship target speed estimation method based on CV-EstNet
CN112288655B (en) * 2020-11-09 2022-11-01 南京理工大学 Sea surface image stabilization method based on MSER region matching and low-rank matrix decomposition
CN112487892B (en) * 2020-11-17 2022-12-02 中国人民解放军军事科学院国防科技创新研究院 Unmanned aerial vehicle ground detection method and system based on confidence
CN114511793B (en) * 2020-11-17 2024-04-05 中国人民解放军军事科学院国防科技创新研究院 Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking
CN116954264B (en) * 2023-09-08 2024-03-15 杭州牧星科技有限公司 Distributed high subsonic unmanned aerial vehicle cluster control system and method thereof
CN117079196B (en) * 2023-10-16 2023-12-29 长沙北斗产业安全技术研究院股份有限公司 Unmanned aerial vehicle identification method based on deep learning and target motion trail

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991396A (en) * 2017-04-01 2017-07-28 南京云创大数据科技股份有限公司 A kind of target relay track algorithm based on wisdom street lamp companion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"基于奇异值分解的红外弱小目标检测";田超等;《工程数学学报》;20150225;第32卷(第1期);全文 *
"森林背景下基于自适应区域生长法的烟雾检测";张炜程等;《重庆邮电大学学报( 自然科学版)》;20160225;第28 卷(第1期);全文 *
"自然场景文本区域定位";黄晓明等;《重庆邮电大学学报( 自然科学版)》;20171030;第27卷(第5期);全文 *

Also Published As

Publication number Publication date
CN107862705A (en) 2018-03-30

Similar Documents

Publication Publication Date Title
CN107862705B (en) Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics
CN110163110B (en) Pedestrian re-recognition method based on transfer learning and depth feature fusion
CN106802113B (en) Intelligent hit telling system and method based on many shell hole algorithm for pattern recognitions
Combinido et al. A convolutional neural network approach for estimating tropical cyclone intensity using satellite-based infrared images
CN103679674B (en) Method and system for splicing images of unmanned aircrafts in real time
CN104899866B (en) A kind of intelligentized infrared small target detection method
US10558186B2 (en) Detection of drones
CN103235830A (en) Unmanned aerial vehicle (UAV)-based electric power line patrol method and device and UAV
CN108875754B (en) Vehicle re-identification method based on multi-depth feature fusion network
CN103218831A (en) Video moving target classification and identification method based on outline constraint
CN108537122A (en) Image fusion acquisition system containing meteorological parameters and image storage method
CN110765948A (en) Target detection and identification method and system based on unmanned aerial vehicle
US10706516B2 (en) Image processing using histograms
Lin et al. Application research of neural network in vehicle target recognition and classification
CN116109950A (en) Low-airspace anti-unmanned aerial vehicle visual detection, identification and tracking method
CN110567324A (en) multi-target group threat degree prediction device and method based on DS evidence theory
Biswas et al. Small object difficulty (sod) modeling for objects detection in satellite images
CN115508821A (en) Multisource fuses unmanned aerial vehicle intelligent detection system
CN115700808A (en) Dual-mode unmanned aerial vehicle identification method for adaptively fusing visible light and infrared images
Valappil et al. CNN-SVM based vehicle detection for UAV platform
CN110458064B (en) Low-altitude target detection and identification method combining data driving type and knowledge driving type
Mohammed et al. Radio frequency fingerprint-based drone identification and classification using Mel spectrograms and pre-trained YAMNet neural
CN117911822A (en) Multi-sensor fusion unmanned aerial vehicle target detection method, system and application
CN113297982A (en) Target detection method for improving combination of KCF and DSST in aerial photography
CN113269099A (en) Vehicle re-identification method under heterogeneous unmanned system based on graph matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant