CN107862705B - Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics - Google Patents
- Publication number
- CN107862705B (application CN201711166232.4A)
- Authority
- CN
- China
- Prior art keywords
- candidate
- target
- unmanned aerial
- aerial vehicle
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Abstract
The invention relates to a small-target detection method for unmanned aerial vehicles based on motion features and deep-learning features, and belongs to the technical field of image processing and computer vision. First, the input video data set is processed by a video image stabilization algorithm to compensate for camera motion, and motion candidate target regions are detected in the images. The video data set is divided into two parts, and the training data set is used to train an improved candidate-region generation network model; the trained network, based on depth features, then generates candidate targets for the test-set video images. The candidate target regions are fused. Next, the training data set is used to train a dual-channel deep neural network model, which is applied to obtain a recognition result. Finally, a target tracking method based on multilayer depth features is applied to that recognition result to obtain the final position of the unmanned aerial vehicle. The invention can accurately detect unmanned aerial vehicles in video images and supports subsequent research in intelligent UAV surveillance.
Description
Technical Field
The invention belongs to the technical field of image processing and computer vision, and relates to an unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics.
Background
At present, with the rapid growth in the availability and maturity of commercial drones, drone sales have multiplied, and drones flying in public areas are common. Drones appear not only in the lenses of popular variety shows and at romantic weddings; they also spray pesticides over farmland, replace workers in high-altitude cleaning operations, and serve surveying photography, forest fire prevention, military reconnaissance, and more. However, with the rapid development of unmanned aerial vehicles, dangerous incidents caused by them are also growing, threatening public safety, privacy, military security, and so on.
In recent years, techniques for detecting unmanned aerial vehicles have been roughly classified into acoustic detection, radio frequency (RF) detection, radar detection, and visual detection. Acoustic detection uses a microphone array to pick up the rotor noise of a flying drone and matches the detected noise against a database of recorded drone sounds, thereby deciding whether the noise belongs to a drone and whether one is approaching; this approach is easily disturbed by environmental noise, and building a database of drone acoustic signatures is time-consuming. RF detection monitors radio frequencies within a certain band with a wireless receiver, but easily misreports an unknown RF transmitter as a drone. Radar detection judges whether a drone is present by detecting and verifying the electromagnetic waves scattered and reflected by a target; radar equipment is expensive in both cost and energy consumption and is susceptible to environmental blind spots. Visual detection typically observes drones with one or more imaging devices and analyzes the image sequence to determine whether a drone is present. Vision-based drone detection is less sensitive to ambient noise, can localize the drone, can distinguish whether it carries dangerous goods, and can even recover information such as flight trajectory and speed. The visual approach therefore has great advantages over the other means and can compensate for their shortcomings.
At present, there is little research on vision-based drone detection. Clearly, detecting a drone at a longer distance makes it easier to avert danger in advance. Drones are smaller than targets such as pedestrians, aircraft, and vehicles, and in long-range imaging their apparent size is tiny, which makes vision-based drone detection harder still. A detection algorithm that can effectively find small drone targets in video is therefore needed.
Disclosure of Invention
In view of the above, the present invention aims to provide a small-target detection method for unmanned aerial vehicles based on motion features and deep-learning features. A target tracking algorithm is used to track the drone and filter out false targets, and the convolutional neural network structure is improved for characteristics such as the drone's small size, so that the deep-learning algorithm suits small targets and can effectively detect drones in complex scenes, thereby improving detection accuracy.
In order to achieve the purpose, the invention provides the following technical scheme:
an unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics comprises the following steps:
s1: processing the input video data set through a video image stabilization algorithm to compensate the motion of a camera;
s2: detecting a motion candidate target area I from the video image after motion compensation by using a low-rank matrix analysis method, and removing tiny noise points in the motion candidate target area I by using an image post-processing module;
s3: dividing a data set of a video into a training set and a testing set, and training by using the training set to obtain an improved candidate region generation network model; processing the video image of the test set through the improved area generation network model to generate a candidate target area II;
s4: fusing the candidate target area I and the candidate target area II to obtain a candidate target area III;
s5: according to the candidate target area III, training by using a training set to obtain a dual-channel-based deep neural network, and then applying the dual-channel-based deep neural network to the candidate target of the test set to obtain a recognition result;
s6: and predicting the position of the target by using a correlation filtering algorithm, tracking and matching the stable target, filtering out false targets and obtaining the position of the unmanned aerial vehicle.
Further, in step S1, the video image stabilization algorithm includes:
s11: extracting characteristic points of each frame image by using an SURF algorithm;
s12: calculating an affine transformation model between two frames through the obtained feature matching points between the two frames of images;
s13: and compensating the current frame by using the obtained affine transformation model.
Further, in step S2, the process of detecting the motion candidate target region I by the low-rank matrix analysis method includes the following steps:
s21: will input video sequence image data f1,f2,...,fnVectorization to form an image matrixWhere n is the number of video frames, fnFor the matrix of video images of the n-th frame,is fnVectorizing the image matrix;
s22: decomposing the matrix C into a low-rank matrix L and a sparse matrix S through an RPCA algorithm, wherein the obtained low-rank matrix L represents a target background, and the sparse matrix S represents the obtained candidate moving target;
s23: and carrying out noise filtering processing on the obtained candidate moving target by utilizing morphological opening and closing operation, and filtering fine noise points in a moving candidate area.
Further, in step S3, the improved candidate region generation network model includes five convolutional layers and two fully connected layers connected in sequence, with pooling layers disposed between the first and second convolutional layers, between the second and third convolutional layers, and between the fifth convolutional layer and the first fully connected layer;
step S3 specifically includes:
s31: dividing a data set of a video into a training set and a testing set;
s32: for the data of the training set, extracting manually marked positive samples in the image, and then randomly sampling a plurality of areas as negative samples;
s33: training by using positive and negative samples of a training set to obtain an improved candidate area generation network model;
s34: and processing the video image of the test set through the improved area generation network model to generate a candidate target area II.
Further, the width and height ranges of the randomly sampled regions in step S32 are determined by the widths and heights of the positive samples, and the overlap ratio IoU(rg, rn) of a negative sample with any positive sample must stay below a set threshold, where IoU is the overlap ratio, rg is a positive sample region, and rn is a randomly sampled negative sample region.
Further, the step S4 of obtaining the candidate target region iii by fusion specifically includes:
s41: carrying out dense sampling on the candidate target area I to obtain a dense seed candidate area;
S42: calculate the similarity between the dense seed candidate region and candidate target region II; when the similarity meets the requirement, merge the two candidate regions, where Sim denotes the similarity between the dense seed candidate region and candidate target region II;
s43: and traversing all the candidate target areas I to obtain a final candidate target area III.
Further, in step S5, the dual-channel-based deep neural network includes a front-end module and a back-end module;
the front-end module consists of two parallel deep neural network models, wherein one of the models takes a candidate target area as input directly and passes through a 6-layer convolutional neural network and 1 full-connection layer; the other one takes the candidate target area as the center, establishes an expansion area on the original image target area as input, and passes through a 6-layer convolutional neural network and 1 full-connection layer;
the rear-end module takes the output of the two full-connection layers obtained by the front-end module as input, and obtains the classification information of each candidate area as a final classification result through 2 full-connection layers and 1 softmax layer;
step S5 specifically includes:
s51: for the training data set, dividing the training data set of the candidate target area III obtained in the step S4 into positive and negative samples, and inputting the positive and negative samples into a two-channel-based deep neural network for training to obtain optimal weight;
s52: and applying the optimal weight to the candidate target areas of the test set obtained in the step S4 for classification, so as to obtain a final recognition result.
Further, step S6 specifically includes:
S61: the center position (x_{t-1}, y_{t-1}) of the target in the frame preceding the current frame t is known; for the improved candidate region generation network model obtained by training in step S5, sparsify the convolution feature map arrays produced by its last three convolutional layers, and then extract the depth feature of the target from the sparsified feature maps;
S62: construct a correlation filter for the output features of each of the last three convolutional layers of the improved candidate region generation network model, convolve each layer's features with the corresponding correlation filter from back to front, and compute the corresponding confidence score f to obtain the new center position (x_t, y_t) of the candidate target in the current frame;
S63: extracting depth features around the new center position, and updating parameters of the relevant filter;
s64: and in consideration of the stability and continuity of the target motion of the unmanned aerial vehicle, filtering the candidate target area track with the tracking frame number less than the threshold value, and finally obtaining the tracking target as the detection result of the unmanned aerial vehicle.
Further, the step of constructing the correlation filter is:
S621: let the size of the output feature be M × N × D and the depth feature be x; construct the objective function of the correlation filter:
w* = argmin_w Σ_{m,n} ‖w · x_{m,n} − y(m, n)‖² + λ‖w‖²
where w* is the optimal solution of the objective function, w is the correlation filter, x_{m,n} is the feature at pixel (m, n), λ (λ ≥ 0) is the regularization parameter, and y(m, n) denotes the label of the pixel at (m, n);
y(m, n) obeys a two-dimensional Gaussian distribution:
y(m, n) = exp(−((m − M/2)² + (n − N/2)²) / (2σ²))
where σ is the width of the Gaussian kernel;
S622: convert the objective function into the frequency domain by the fast Fourier transform to obtain its optimal solution:
W^d = (Ȳ ⊙ X̄^d) / (Σ_{i=1}^{D} X̄^i ⊙ (X̄^i)* + λ), d ∈ {1, 2, …, D}
where Ȳ is the Fourier transform of y, ⊙ denotes the Hadamard product, W^d is the optimal solution on channel d, X̄^i is the Fourier transform of the i-th channel of the depth feature x, and (·)* denotes the complex conjugate;
S623: given a candidate target region of the next frame image, for the depth feature z of the candidate region the response map of the correlation filter is:
f(z) = F^{-1}(Σ_{d=1}^{D} W^d ⊙ Z̄^d)
where F^{-1} denotes the inverse Fourier transform and Z̄^d is the Fourier transform of the d-th channel of z.
Further, the update of the correlation filter parameters in step S63 satisfies:
P_t^d = (1 − η) P_{t−1}^d + η Ȳ ⊙ X̄_t^d
Q_t = (1 − η) Q_{t−1} + η Σ_{i=1}^{D} X̄_t^i ⊙ (X̄_t^i)*
W_t^d = P_t^d / (Q_t + λ)
where P_t and Q_t are intermediate variables, W_t is the correlation filter of the updated t-th frame, t is the video frame number, and η is the learning rate.
The invention has the beneficial effects that:
1) the invention provides a method for detecting an unmanned aerial vehicle based on the motion characteristics and deep learning characteristics of the unmanned aerial vehicle. The method can effectively detect the target under the conditions that the background is complex and the unmanned aerial vehicle is small.
2) The method improves the traditional deep neural network structure, and effectively solves the problem that the existing target detection algorithm based on the deep neural network is not suitable for small targets.
3) The method provides an online tracking algorithm based on multilayer depth features and related filters, and can better track and predict the trajectory of the unmanned aerial vehicle and filter false targets.
Drawings
To make the object, technical solution, and beneficial effects of the invention clearer, the invention provides the following drawings for explanation:
FIG. 1 is a schematic diagram of a method for detecting a small target of an unmanned aerial vehicle based on motion characteristics and deep learning characteristics according to the invention;
FIG. 2 is a schematic diagram of a video image stabilization algorithm;
FIG. 3 is a schematic diagram of a convolutional neural network structure;
FIG. 4 is a schematic diagram of candidate object generation using an improved area generation network;
FIG. 5 is a schematic diagram of a dual channel-based deep neural network;
FIG. 6 is a schematic diagram of an online tracking algorithm based on depth features.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
In the invention, a candidate target detection module based on motion characteristics extracts a motion target area in a video through low-rank matrix analysis after video stabilization is carried out on an original video;
a candidate target detection module based on the depth characteristics extracts candidate targets from the video image through an improved area generation network model;
the improved candidate area generation network is characterized in that the network structure and the size of the candidate area are modified on the basis of the traditional area generation network, and the network layer of the output characteristic diagram is replaced;
the candidate region fusion module is used for fusing the candidate regions obtained in the steps S2 and S3;
the candidate target identification module based on the dual-channel deep neural network is used for improving a traditional deep neural network model to classify and identify candidate areas according to the characteristics of small targets to obtain a final identification result;
the depth feature-based online tracking algorithm improves the traditional artificial feature-based tracking algorithm, utilizes the features of the target extracted by the convolutional neural network, and has robustness.
Fig. 1 is a schematic diagram of a method for detecting a small target of an unmanned aerial vehicle based on motion characteristics and deep learning characteristics, and as shown in the figure, the method specifically includes the following steps:
step S1: firstly, processing a data set of an input original video through a video image stabilization algorithm to compensate camera motion, wherein a specific flow chart is shown in fig. 2:
s101: and extracting key points from the image by using an SURF algorithm, and constructing a SURF feature point descriptor.
S102: and calculating Euclidean distances of the corresponding feature points of the two frames of images, then selecting the minimum distance, setting a threshold value, keeping the matching points when the distance of the corresponding feature points is smaller than the threshold value, and otherwise, removing the matching points.
S103: the two frames of images are subjected to bidirectional matching, and by repeating the step S12, when the matched feature point pair coincides with the result obtained in the step S12, the final feature matching point is obtained.
S104: and setting the camera motion as an affine transformation model, and calculating by using a least square method according to the feature matching points obtained in the step to obtain the affine transformation model between the two frames of images.
S105: and according to the obtained affine transformation model, registering the current frame and the set reference frame to obtain the compensated current frame, storing the compensated current frame into a new video, and finally obtaining the stable video.
S106: and calculating the offset degree between the compensated current frame and the reference frame, if the offset degree is greater than a threshold value, updating the current frame to be the reference frame, and otherwise, continuously reading the next frame.
Step S2: and detecting a motion candidate target region of the compensated video by a low-rank matrix analysis method, and removing fine noise points in the motion candidate region by an image post-processing module.
Step S3: dividing a data set into a training set and a testing set, and training by using the training data set to obtain an improved candidate area generation network model; candidate targets are generated for the video images of the test set by the trained improved region generation network. The improved area generation network model structure is shown in fig. 4:
in step S3, training the improved candidate area generation network model using the positive and negative samples of the data set; generating candidate targets for the video images of the test set through the trained improved area generation network, specifically comprising the following processes:
firstly, aiming at the characteristics of an unmanned aerial vehicle, improving the traditional candidate area generation network structure to obtain an improved candidate area generation network, wherein the improved candidate area generation network modifies the network structure and the scale size of characteristic extraction, and replaces the network layer of an output characteristic diagram; then training by utilizing a training data set to obtain an improved candidate region to generate the optimal weight of the network model; and finally, applying the optimal weight to the test data set to obtain a candidate target rectangular frame.
The improved region generation network mainly adds two fully convolutional layers on top of a Convolutional Neural Network (CNN): one is a region classification layer that judges whether a candidate region is a foreground target or background, and the other is a region box regression layer that predicts the position coordinates of the candidate region. The convolutional neural network consists of five convolutional layers, three pooling layers, and two fully connected layers, as shown in fig. 3. A traditional region generation network usually operates on the feature map produced by the last convolutional layer, but small targets depend more on shallow features, which have higher resolution; the method therefore uses the feature map of the fourth convolutional layer instead. The region generation network slides a small network over the feature map output by the fourth convolutional layer; at each position the sliding network is fully connected to 9 windows of different scales on the feature map, maps them to a low-dimensional vector, and finally feeds that vector into two fully connected layers to obtain the category and position of the candidate target. Compared with the traditional region generation network, the 9 scales are reduced from their original sizes, which is more favorable for detecting small targets.
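The 9 windows per sliding position are typically built as 3 scales × 3 aspect ratios. The patent reduces the scale sizes for small targets but does not list exact values, so the base size and scales below are assumptions:

```python
import numpy as np

def make_anchors(base=8, scales=(1, 2, 4), ratios=(0.5, 1.0, 2.0)):
    """Generate 9 anchor windows (3 scales x 3 aspect ratios), centred at the
    origin, as (x1, y1, x2, y2). base/scales/ratios are assumed values; each
    ratio preserves the area of its scale."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = base * s * np.sqrt(r)   # width grows with sqrt(ratio)
            h = base * s / np.sqrt(r)   # height shrinks to keep area base^2*s^2
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return np.array(anchors)
```

Shifting these offsets to every sliding position on the fourth-layer feature map yields the candidate windows the classification and regression layers then score.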
Step S3 specifically includes:
s31: dividing a data set of a video into a training set and a testing set;
s32: for the data of the training set, extracting manually marked positive samples in the image, and then randomly sampling a plurality of areas as negative samples;
s33: training by using positive and negative samples of a training set to obtain an improved candidate area generation network model;
s34: and processing the video image of the test set through the improved area generation network model to generate a candidate target area II.
In step S32, the image is sampled for negative samples; the width and height of a sampled region are bounded by the maximum and minimum width and height of the positive samples, and the overlap ratio IoU(rg, rn) of a negative sample region with any positive sample region must not exceed a set threshold, where IoU is the overlap ratio, rg is a positive sample region, and rn is a randomly sampled negative sample region.
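The negative-sampling constraint can be sketched as follows; the patent bounds the overlap ratio IoU, but the exact threshold does not survive in this text, so `max_iou=0.3` is an assumed value:

```python
import numpy as np

def iou(a, b):
    """Overlap ratio of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def sample_negatives(img_w, img_h, positives, n, max_iou=0.3, rng=None):
    """Randomly sample n negative boxes whose overlap with every positive
    stays below max_iou; widths/heights are drawn from the range spanned by
    the positive samples, as step S32 describes."""
    rng = rng if rng is not None else np.random.default_rng(0)
    ws = [p[2] - p[0] for p in positives]
    hs = [p[3] - p[1] for p in positives]
    out = []
    while len(out) < n:
        w = rng.uniform(min(ws), max(ws))
        h = rng.uniform(min(hs), max(hs))
        x = rng.uniform(0, img_w - w)
        y = rng.uniform(0, img_h - h)
        box = (x, y, x + w, y + h)
        if all(iou(box, p) < max_iou for p in positives):
            out.append(box)
    return out
```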
Step S4: and performing dense sampling on the candidate target area obtained in the step S2 to obtain a denser candidate target frame, and then obtaining a final candidate target by fusing the dense candidate target frame with the candidate target obtained in the step S3.
The specific fusion mode comprises the following steps:
s41: taking the motion candidate region obtained in the step S2 as a seed candidate region, and performing further dense sampling on the seed candidate region to obtain a dense seed candidate region;
s42: calculating the similarity between the seed candidate region and the candidate region obtained in step S3, when the similarity is larger than mu (mu epsilon [0.6, 1)]) And meanwhile, combining the two candidate regions, and traversing all the seed candidate regions to obtain a final candidate region. The similarity Sim calculation formula of the area A and the area B is as follows:
step S5: according to the deep neural network model based on the double channels and aiming at the small target detection, the network model is obtained by training through a training data set, and then the network model is applied to candidate targets of a test set to obtain a recognition result. The structure of the deep neural network model based on the dual channels is shown in fig. 5:
the deep neural network model based on the double channels mainly comprises two parts, namely a front-end module and a rear-end module. The front-end module consists of two parallel deep neural network models, one of which takes a candidate target area as input directly and obtains 4096-dimensional characteristics through 6 convolution layers and 1 full-connection layer; and the other one takes an extended area of the target area which is 4 times that of the candidate target area as the center on the original drawing as input, and obtains 4096-dimensional characteristics through 6 convolutional layers and 1 full-connected layer. The back end module is used for inputting two 4096 characteristics obtained by the front end module in a string mode, and classification information of each candidate area is obtained through 2 full connection layers and 1 softmax layer to serve as a final classification result.
In step S5, according to the two-channel-based deep neural network for small target detection proposed by the method, the network model is obtained by training with a training data set, and then the network model is applied to candidate targets in a test set to obtain a recognition result, which specifically includes:
s51: for the training data set, dividing the candidate target region of the training data set obtained in the step S4 into positive and negative samples, and inputting the improved two-channel-based deep neural network training to obtain the optimal weight.
S52: and applying the optimal weight to the candidate target area of the test data set obtained in the step S4 for classification, so as to obtain a final recognition result.
Step S6: the target tracking method based on the depth features, which is provided by the method, is applied to the recognition result of the step S5, the position of the target is predicted by using a relevant filtering algorithm, and the matched stable target is tracked, so that the false target is filtered, and the final position of the unmanned aerial vehicle is obtained. The specific flow chart of the target tracking algorithm based on the depth features is shown in fig. 6:
S601: input the candidate target region of the frame preceding the current frame; using the neural network model obtained by training in step S5, first sparsify the convolution feature map arrays produced by the model's last three convolutional layers, and then extract the depth feature of the target from the sparsified feature maps;
s602: constructing corresponding correlation filters for the output characteristics of each convolution layer, performing convolution on the characteristics of each layer and the corresponding correlation filters from back to front, and calculating corresponding confidence scores to obtain the new position of the candidate target in the current frame;
s603: depth features are extracted around the new center position of the candidate object to update the parameters of the correlation filter.
S604: and in consideration of the stability and continuity of the target motion of the unmanned aerial vehicle, filtering the candidate target area track with the tracking frame number less than the threshold value, and finally obtaining the tracking target as the detection result of the unmanned aerial vehicle.
The threshold mentioned in step S604 has a value in the range of 5 to 20.
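The trajectory filtering of step S604 is a one-liner; `min_frames=5` reflects the lower end of the 5 to 20 range given above:

```python
def filter_tracks(tracks, min_frames=5):
    """Discard candidate trajectories tracked for fewer than min_frames
    frames; surviving tracks are the final drone detections (step S604)."""
    return [t for t in tracks if len(t) >= min_frames]
```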
In step S6, the constructing a corresponding correlation filter for each of the M × N × D output features specifically includes:
Firstly, let the depth feature of size M × N × D be x, and construct the objective function of the corresponding correlation filter as:
w* = argmin_w Σ_{m,n} ‖w · x_{m,n} − y(m, n)‖² + λ‖w‖²
where λ (λ ≥ 0) is a regularization parameter and y(m, n) represents the label of the pixel at (m, n), which follows a two-dimensional Gaussian distribution:
y(m, n) = exp(−((m − M/2)² + (n − N/2)²) / (2σ²))
with σ the width of the Gaussian kernel.
Then, the objective function is converted into the frequency domain by the fast Fourier transform, and its optimal solution can be derived as:
W^d = (Ȳ ⊙ X̄^d) / (Σ_{i=1}^{D} X̄^i ⊙ (X̄^i)* + λ), d ∈ {1, 2, …, D}
where Ȳ is the Fourier transform of y, X̄^i is the Fourier transform of the i-th channel of x, ⊙ denotes the Hadamard product, and (·)* denotes the complex conjugate.
Finally, after a candidate target region of the next frame image is given, for the depth feature z of the candidate region the response map of the correlation filter is:
f(z) = F^{-1}(Σ_{d=1}^{D} W^d ⊙ Z̄^d)
where F^{-1} denotes the inverse Fourier transform and Z̄^d is the Fourier transform of the d-th channel of z.
Further, in step S6, the updating policy for the correlation filter parameter W^d specifically includes:
P_t^d = (1 − η) P_{t−1}^d + η Ȳ ⊙ X̄_t^d
Q_t = (1 − η) Q_{t−1} + η Σ_{i=1}^{D} X̄_t^i ⊙ (X̄_t^i)*
W_t^d = P_t^d / (Q_t + λ)
where P_t and Q_t are intermediate variables, t is the video frame number, and η is the learning rate.
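The training, response, and update steps amount to the standard frequency-domain correlation-filter computation. A numpy sketch for one layer's features (the patent builds one such filter per convolutional layer; the σ and λ values are assumptions):

```python
import numpy as np

def gaussian_label(M, N, sigma=2.0):
    """y(m, n): two-dimensional Gaussian label peaked at the patch centre."""
    m = np.arange(M)[:, None] - M / 2
    n = np.arange(N)[None, :] - N / 2
    return np.exp(-(m ** 2 + n ** 2) / (2 * sigma ** 2))

def train_filter(x, y, lam=1e-4):
    """Closed-form multi-channel filter in the frequency domain.
    x: M x N x D depth feature, y: M x N label."""
    X = np.fft.fft2(x, axes=(0, 1))
    Yf = np.fft.fft2(y)
    denom = (X * np.conj(X)).real.sum(axis=2) + lam
    return Yf[:, :, None] * np.conj(X) / denom[:, :, None]

def response(W, z):
    """Confidence map for a new depth feature z; its argmax gives the
    new target centre."""
    Z = np.fft.fft2(z, axes=(0, 1))
    return np.real(np.fft.ifft2((W * Z).sum(axis=2)))

def update_filter(P, Q, x_new, y, eta=0.01, lam=1e-4):
    """Running-average update of numerator P and denominator Q; returns the
    updated P, Q and the refreshed filter."""
    X = np.fft.fft2(x_new, axes=(0, 1))
    Yf = np.fft.fft2(y)
    P = (1 - eta) * P + eta * Yf[:, :, None] * np.conj(X)
    Q = (1 - eta) * Q + eta * (X * np.conj(X)).real.sum(axis=2)
    return P, Q, P / (Q + lam)[:, :, None]
```

For a feature patch x centred on the target, the response of a shifted copy of x peaks at the shifted location, which is exactly how the new centre position is read off.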
Finally, it is noted that the above-mentioned preferred embodiments illustrate rather than limit the invention, and that, although the invention has been described in detail with reference to the above-mentioned preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention as defined by the appended claims.
Claims (6)
1. An unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics, characterized in that the method comprises the following steps:
S1: processing the input video data set with a video image stabilization algorithm to compensate for camera motion;
S2: detecting a motion candidate target region I from the motion-compensated video images by a low-rank matrix analysis method, and removing tiny noise points in the motion candidate target region I with an image post-processing module;
S3: dividing the video data set into a training set and a test set, training with the training set to obtain an improved candidate region generation network model, and processing the test-set video images with the improved candidate region generation network model to generate a candidate target region II;
S4: fusing the candidate target region I and the candidate target region II to obtain a candidate target region III;
S5: according to the candidate target region III, training with the training set to obtain a dual-channel-based deep neural network, and then applying it to the candidate targets of the test set to obtain recognition results;
S6: predicting target positions with a correlation filtering algorithm, tracking and matching stable targets, and filtering out false targets to obtain the position of the unmanned aerial vehicle;
in step S3, the improved candidate region generation network model includes five convolutional layers and two fully-connected layers connected in sequence, wherein pooling layers are disposed between the first and second convolutional layers, between the second and third convolutional layers, and between the fifth convolutional layer and the first fully-connected layer;
step S3 specifically includes:
S31: dividing the video data set into a training set and a test set;
S32: for the training-set data, extracting the manually annotated positive samples in the images, and then randomly sampling a number of regions as negative samples;
S33: training with the positive and negative samples of the training set to obtain the improved candidate region generation network model;
S34: processing the test-set video images with the improved candidate region generation network model to generate the candidate target region II;
the width and height of the randomly sampled regions in step S32 are determined by the width and height of the positive samples, and the overlapping region of a negative sample and a positive sample satisfies:
where IoU is the overlap ratio, r_g is a positive sample region, and r_n is a randomly sampled negative sample region;
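The overlap ratio IoU between a positive region r_g and a sampled negative region r_n can be computed as below. Boxes are assumed in (x, y, w, h) form; the exact threshold the patent's constraint applies is not reproduced here.

```python
def iou(box_a, box_b):
    # boxes as (x, y, w, h); returns intersection-over-union in [0, 1]
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```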
the step S4 of obtaining the candidate target region III by fusion specifically includes:
S41: densely sampling the candidate target region I to obtain dense seed candidate regions;
S42: calculating the similarity between a dense seed candidate region and the candidate target region II, and combining the two candidate regions when the similarity meets the requirement, wherein Sim is the similarity between the dense seed candidate region and the candidate target region II;
S43: traversing all candidate target regions I to obtain the final candidate target region III;
in step S5, the dual-channel-based deep neural network includes a front-end module and a back-end module;
the front-end module consists of two parallel deep neural network models: one takes the candidate target region directly as input and passes it through a 6-layer convolutional neural network and 1 fully-connected layer; the other, centered on the candidate target region, builds an expanded region on the original-image target region as input and passes it through a 6-layer convolutional neural network and 1 fully-connected layer;
the back-end module takes the outputs of the two fully-connected layers obtained by the front-end module as input and, through 2 fully-connected layers and 1 softmax layer, obtains the classification information of each candidate region as the final classification result;
step S5 specifically includes:
S51: for the training data set, dividing the candidate target regions III obtained in step S4 into positive and negative samples, and inputting them into the dual-channel-based deep neural network for training to obtain the optimal weights;
S52: applying the optimal weights to classify the test-set candidate target regions obtained in step S4, thereby obtaining the final recognition results.
2. The unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics as claimed in claim 1, characterized in that: in step S1, the video image stabilization algorithm includes:
S11: extracting feature points of each frame image with the SURF algorithm;
S12: calculating an affine transformation model between two frames from the matched feature points between the two frame images;
S13: compensating the current frame with the obtained affine transformation model.
3. The unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics as claimed in claim 1, characterized in that: in step S2, detecting the motion candidate target region I by the low-rank matrix analysis method includes the following steps:
S21: vectorizing the input video sequence images f_1, f_2, ..., f_n to form an image matrix C = [vec(f_1), vec(f_2), ..., vec(f_n)], where n is the number of video frames, f_n is the video image matrix of the n-th frame, and vec(f_n) is its vectorized form;
S22: decomposing the matrix C by the RPCA algorithm into a low-rank matrix L and a sparse matrix S, where the obtained low-rank matrix L represents the target background and the sparse matrix S represents the obtained candidate moving targets;
S23: noise-filtering the obtained candidate moving targets with morphological opening and closing operations to remove tiny noise points in the motion candidate regions.
4. The unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics as claimed in claim 1, characterized in that step S6 specifically includes:
S61: given the known target center position (x_{t-1}, y_{t-1}) in the frame preceding the current frame t, sparsifying the convolutional feature map arrays obtained from the last three convolutional layers of the improved candidate region generation network model trained in step S5, and extracting depth features of the target with the sparsified feature maps;
S62: constructing correlation filters respectively for the output features of the last three convolutional layers of the improved candidate region generation network model, convolving each layer's features with the corresponding correlation filter from the last layer forward, and computing the corresponding confidence score f to obtain the new center position (x_t, y_t) of the candidate target in the current frame;
S63: extracting depth features around the new center position and updating the parameters of the correlation filters;
S64: in view of the stability and continuity of the unmanned aerial vehicle's motion, filtering out candidate target trajectories whose tracked frame count is below a threshold; the targets that remain tracked are the final unmanned aerial vehicle detection results.
5. The unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics as claimed in claim 4, wherein the correlation filter is constructed as follows:
S621: letting the output feature have size M × N × D and the depth feature be x, constructing the objective function of the correlation filter:

w^* = \arg\min_{w} \sum_{m,n} \left\| w \cdot x_{m,n} - y(m,n) \right\|^2 + \lambda \left\| w \right\|_2^2

where w^* is the optimal solution of the objective function, w is the correlation filter, x_{m,n} is the feature at pixel (m, n), λ (λ ≥ 0) is the regularization parameter, and y(m, n) denotes the label of the pixel at (m, n);
y(m, n) follows a two-dimensional Gaussian distribution:

y(m, n) = e^{-\frac{(m - M/2)^2 + (n - N/2)^2}{2\sigma^2}}

where σ is the width of the Gaussian kernel;
S622: converting the objective function into the frequency domain by fast Fourier transform to obtain its optimal solution:

W^d = \frac{Y \odot \bar{X}^d}{\sum_{i=1}^{D} X^i \odot \bar{X}^i + \lambda}

where Y is the Fourier transform of y, ⊙ denotes the Hadamard product, W^d is the optimal solution of the objective function, X is the Fourier transform of the depth feature x, i indexes the i-th channel, d is the channel index, and d ∈ {1, 2, ..., D};
S623: given a candidate target region of the next-frame image, for the depth feature z of the candidate region, the response map of the correlation filter is

f(z) = \mathcal{F}^{-1}\!\left( \sum_{d=1}^{D} W^d \odot Z^d \right)
6. The unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics as claimed in claim 5, wherein the parameters of the correlation filter updated in step S63 satisfy:

P_t^d = (1 - \eta)\, P_{t-1}^d + \eta\, Y \odot \bar{X}_t^d
Q_t = (1 - \eta)\, Q_{t-1} + \eta \sum_{i=1}^{D} X_t^i \odot \bar{X}_t^i
W_t^d = \frac{P_t^d}{Q_t + \lambda}

where P_t and Q_t are intermediate variables, W_t is the correlation filter of the updated t-th frame, t is the video frame number, and η is the learning rate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711166232.4A CN107862705B (en) | 2017-11-21 | 2017-11-21 | Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107862705A CN107862705A (en) | 2018-03-30 |
CN107862705B true CN107862705B (en) | 2021-03-30 |
Family
ID=61702397
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711166232.4A Active CN107862705B (en) | 2017-11-21 | 2017-11-21 | Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107862705B (en) |
Families Citing this family (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108549876A (en) * | 2018-04-20 | 2018-09-18 | 重庆邮电大学 | The sitting posture detecting method estimated based on target detection and human body attitude |
CN110706193A (en) * | 2018-06-21 | 2020-01-17 | 北京京东尚科信息技术有限公司 | Image processing method and device |
CN110633597B (en) * | 2018-06-21 | 2022-09-30 | 北京京东尚科信息技术有限公司 | Drivable region detection method and device |
CN108846522B (en) * | 2018-07-11 | 2022-02-11 | 重庆邮电大学 | Unmanned aerial vehicle system combined charging station deployment and routing method |
CN109255286B (en) * | 2018-07-21 | 2021-08-24 | 哈尔滨工业大学 | Unmanned aerial vehicle optical rapid detection and identification method based on deep learning network framework |
CN108960190B (en) * | 2018-07-23 | 2021-11-30 | 西安电子科技大学 | SAR video target detection method based on FCN image sequence model |
CN109272530B (en) | 2018-08-08 | 2020-07-21 | 北京航空航天大学 | Target tracking method and device for space-based monitoring scene |
CN109325407B (en) * | 2018-08-14 | 2020-10-09 | 西安电子科技大学 | Optical remote sensing video target detection method based on F-SSD network filtering |
CN109145906B (en) * | 2018-08-31 | 2020-04-24 | 北京字节跳动网络技术有限公司 | Target object image determination method, device, equipment and storage medium |
CN109325967B (en) * | 2018-09-14 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Target tracking method, device, medium, and apparatus |
CN109472191B (en) * | 2018-09-17 | 2020-08-11 | 西安电子科技大学 | Pedestrian re-identification and tracking method based on space-time context |
CN109359545B (en) * | 2018-09-19 | 2020-07-21 | 北京航空航天大学 | Cooperative monitoring method and device under complex low-altitude environment |
CN109325490B (en) * | 2018-09-30 | 2021-04-27 | 西安电子科技大学 | Terahertz image target identification method based on deep learning and RPCA |
CN111127509B (en) * | 2018-10-31 | 2023-09-01 | 杭州海康威视数字技术股份有限公司 | Target tracking method, apparatus and computer readable storage medium |
CN109410149B (en) * | 2018-11-08 | 2019-12-31 | 安徽理工大学 | CNN denoising method based on parallel feature extraction |
CN109708659B (en) * | 2018-12-25 | 2021-02-09 | 四川九洲空管科技有限责任公司 | Distributed intelligent photoelectric low-altitude protection system |
CN109801317A (en) * | 2018-12-29 | 2019-05-24 | 天津大学 | The image matching method of feature extraction is carried out based on convolutional neural networks |
CN109918988A (en) * | 2018-12-30 | 2019-06-21 | 中国科学院软件研究所 | A kind of transplantable unmanned plane detection system of combination imaging emulation technology |
CN109859241B (en) * | 2019-01-09 | 2020-09-18 | 厦门大学 | Adaptive feature selection and time consistency robust correlation filtering visual tracking method |
CN110287955B (en) * | 2019-06-05 | 2021-06-22 | 北京字节跳动网络技术有限公司 | Target area determination model training method, device and computer readable storage medium |
CN110262529B (en) * | 2019-06-13 | 2022-06-03 | 桂林电子科技大学 | Unmanned aerial vehicle monitoring method and system based on convolutional neural network |
CN110414375B (en) * | 2019-07-08 | 2020-07-17 | 北京国卫星通科技有限公司 | Low-altitude target identification method and device, storage medium and electronic equipment |
CN110706252B (en) * | 2019-09-09 | 2020-10-23 | 西安理工大学 | Robot nuclear correlation filtering tracking algorithm under guidance of motion model |
CN110631588B (en) * | 2019-09-23 | 2022-11-18 | 电子科技大学 | Unmanned aerial vehicle visual navigation positioning method based on RBF network |
CN111006669B (en) * | 2019-12-12 | 2022-08-02 | 重庆邮电大学 | Unmanned aerial vehicle system task cooperation and path planning method |
CN111247526B (en) * | 2020-01-02 | 2023-05-02 | 香港应用科技研究院有限公司 | Method and system for tracking position and direction of target object moving on two-dimensional plane |
CN111242974B (en) * | 2020-01-07 | 2023-04-11 | 重庆邮电大学 | Vehicle real-time tracking method based on twin network and back propagation |
CN111508002B (en) * | 2020-04-20 | 2020-12-25 | 北京理工大学 | Small-sized low-flying target visual detection tracking system and method thereof |
CN111781599B (en) * | 2020-07-16 | 2021-08-13 | 哈尔滨工业大学 | SAR moving ship target speed estimation method based on CV-EstNet |
CN112288655B (en) * | 2020-11-09 | 2022-11-01 | 南京理工大学 | Sea surface image stabilization method based on MSER region matching and low-rank matrix decomposition |
CN112487892B (en) * | 2020-11-17 | 2022-12-02 | 中国人民解放军军事科学院国防科技创新研究院 | Unmanned aerial vehicle ground detection method and system based on confidence |
CN114511793B (en) * | 2020-11-17 | 2024-04-05 | 中国人民解放军军事科学院国防科技创新研究院 | Unmanned aerial vehicle ground detection method and system based on synchronous detection tracking |
CN116954264B (en) * | 2023-09-08 | 2024-03-15 | 杭州牧星科技有限公司 | Distributed high subsonic unmanned aerial vehicle cluster control system and method thereof |
CN117079196B (en) * | 2023-10-16 | 2023-12-29 | 长沙北斗产业安全技术研究院股份有限公司 | Unmanned aerial vehicle identification method based on deep learning and target motion trail |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991396A (en) * | 2017-04-01 | 2017-07-28 | 南京云创大数据科技股份有限公司 | A kind of target relay track algorithm based on wisdom street lamp companion |
Non-Patent Citations (3)
Title |
---|
"基于奇异值分解的红外弱小目标检测";田超等;《工程数学学报》;20150225;第32卷(第1期);全文 * |
"森林背景下基于自适应区域生长法的烟雾检测";张炜程等;《重庆邮电大学学报( 自然科学版)》;20160225;第28 卷(第1期);全文 * |
"自然场景文本区域定位";黄晓明等;《重庆邮电大学学报( 自然科学版)》;20171030;第27卷(第5期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN107862705A (en) | 2018-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107862705B (en) | Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics | |
CN110163110B (en) | Pedestrian re-recognition method based on transfer learning and depth feature fusion | |
CN106802113B (en) | Intelligent hit telling system and method based on many shell hole algorithm for pattern recognitions | |
Combinido et al. | A convolutional neural network approach for estimating tropical cyclone intensity using satellite-based infrared images | |
CN103679674B (en) | Method and system for splicing images of unmanned aircrafts in real time | |
CN104899866B (en) | A kind of intelligentized infrared small target detection method | |
US10558186B2 (en) | Detection of drones | |
CN103235830A (en) | Unmanned aerial vehicle (UAV)-based electric power line patrol method and device and UAV | |
CN108875754B (en) | Vehicle re-identification method based on multi-depth feature fusion network | |
CN103218831A (en) | Video moving target classification and identification method based on outline constraint | |
CN108537122A (en) | Image fusion acquisition system containing meteorological parameters and image storage method | |
CN110765948A (en) | Target detection and identification method and system based on unmanned aerial vehicle | |
US10706516B2 (en) | Image processing using histograms | |
Lin et al. | Application research of neural network in vehicle target recognition and classification | |
CN116109950A (en) | Low-airspace anti-unmanned aerial vehicle visual detection, identification and tracking method | |
CN110567324A (en) | multi-target group threat degree prediction device and method based on DS evidence theory | |
Biswas et al. | Small object difficulty (sod) modeling for objects detection in satellite images | |
CN115508821A (en) | Multisource fuses unmanned aerial vehicle intelligent detection system | |
CN115700808A (en) | Dual-mode unmanned aerial vehicle identification method for adaptively fusing visible light and infrared images | |
Valappil et al. | CNN-SVM based vehicle detection for UAV platform | |
CN110458064B (en) | Low-altitude target detection and identification method combining data driving type and knowledge driving type | |
Mohammed et al. | Radio frequency fingerprint-based drone identification and classification using Mel spectrograms and pre-trained YAMNet neural | |
CN117911822A (en) | Multi-sensor fusion unmanned aerial vehicle target detection method, system and application | |
CN113297982A (en) | Target detection method for improving combination of KCF and DSST in aerial photography | |
CN113269099A (en) | Vehicle re-identification method under heterogeneous unmanned system based on graph matching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||