CN112487889A - Unmanned aerial vehicle ground detection method and system based on deep neural network - Google Patents


Info

Publication number
CN112487889A
CN112487889A (application CN202011285176.8A)
Authority
CN
China
Prior art keywords
target, detected, frame, neural network, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011285176.8A
Other languages
Chinese (zh)
Inventor
管乃洋
苏龙飞
王之元
凡遵林
张天昊
王浩
沈天龙
黄强娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN202011285176.8A
Publication of CN112487889A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/13 - Satellite images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an unmanned aerial vehicle ground detection method and system based on a deep neural network, comprising the following steps: acquiring a first position of a target to be detected from images acquired frame by frame; taking the first position as the initial target position for target tracking and continuously determining the target position in each next frame of image from the candidate area corresponding to the target position in the current frame of image; and, when target tracking fails, re-acquiring the first position of the target to be detected from the frame-by-frame images and resuming tracking. The technical scheme provided by the invention can monitor the acquired video in real time, thereby improving target detection efficiency and accuracy; at the same time, the deep-neural-network target detection method in the scheme has a small computational load and high practicability.

Description

Unmanned aerial vehicle ground detection method and system based on deep neural network
Technical Field
The invention relates to the technical field of computer vision, in particular to an unmanned aerial vehicle ground detection method and system based on a deep neural network.
Background
Deep neural networks are currently developing rapidly and are ever more widely applied. Methods for detecting or searching for a target in a video or image with a deep neural network mainly comprise two-step methods, represented by Faster R-CNN, R-CNN and the like, and one-step methods, represented by YOLO, SSD and the like. Although Faster R-CNN is an excellent two-step algorithm, it reaches only about 5 FPS even with the strong computing power of a K40 GPU, which cannot meet real-time requirements. The one-step YOLO and SSD detectors can exceed 15 FPS and meet real-time requirements, but need the computing power of a Titan X or M40 GPU. Among target tracking algorithms, those with the best balance of performance and speed are represented by correlation filtering algorithms, which track stably and quickly, reaching up to 172 FPS under limited computing power.
An unmanned aerial vehicle is a reusable, pilotless aircraft controlled by radio remote control or autonomous program control; it has a simple structure, low cost, strong survivability and good maneuverability, and can complete a variety of tasks. However, a UAV's payload capacity is low, so it cannot carry computing equipment with strong performance, which makes target detection algorithms based on deep neural networks difficult to deploy; small on-board computers such as a Raspberry Pi or an ODROID are light but have limited computing capacity. Even if a faster one-step method such as Tiny YOLO or MobileNet-SSD is deployed on an ODROID on-board computer, the target detection speed does not exceed 3 FPS and cannot meet real-time requirements. The retired Predator UAV mainly acquires data through its on-board sensors and returns the data to the ground, where they are interpreted manually. The improved Global Hawk, carrying signal sensors and a radar for detecting ground moving targets, has a preliminary on-board target detection and monitoring capability (distinguishing moving from static objects and detecting moving targets), but the detection technology is not yet mature. The Rainbow (CH-series) UAV likewise acquires data through its sensors and returns them to the ground for manual interpretation, with further processing at the back end. An artificial intelligence algorithm has been tested on the ScanEagle: after only a few days of testing, the computer's recognition accuracy for objects such as personnel, vehicles and buildings reached 60%, rising to 80% after one week; however, this processing is still completed on the ground. Therefore, current technology still cannot track and detect targets in the data acquired by a UAV's on-board camera in real time and issue the next indication.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an unmanned aerial vehicle ground detection method and system based on a deep neural network. By combining a deep-neural-network target detection algorithm with a tracking algorithm, a specific target is detected and tracked in real time in the data acquired from the on-board camera during flight, enabling a tactical UAV to monitor and search for ground targets, directionally track moving targets, and detect and track aerial targets.
The purpose of the invention is realized by adopting the following technical scheme:
the invention provides an unmanned aerial vehicle ground detection method based on a deep neural network, which is improved in that the method comprises the following steps:
step 1) acquiring a first position of a target to be detected from images acquired frame by frame;
and step 2) taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to a candidate area corresponding to the target position to be detected in the current frame of image, and returning to the step 1) when the target tracking fails.
Preferably, the condition for determining the target tracking failure includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is a time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
Further, the value range of the positive integer multiple is [1, 100].
Preferably, the continuously determining the target position in the next frame of image according to the candidate region corresponding to the target position to be detected in the current frame of image includes:
acquiring a candidate area corresponding to a position area of a target to be detected in a current frame image;
searching a region consistent with a candidate region corresponding to the position region of the target to be detected in the current frame image in the next frame image by using a kernel correlation filtering algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and storing the position area of the target to be detected in each frame of image into a target position set.
Further, after storing the position area of the target to be detected in each frame of image into the target position set, the method further includes:
judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
Further, the searching for a region consistent with a candidate region corresponding to a position region of the target to be detected in the current frame image by using the kernel correlation filtering algorithm includes:
expanding the candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple, and using the expanded area as the area of the next frame image within which the region consistent with that candidate area is searched;
wherein, the value range of the preset multiple is [1.5,3 ].
Preferably, the step 1) includes: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
Preferably, the training process of the pre-trained target detection deep neural network model includes:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
The invention provides an unmanned aerial vehicle ground detection system based on a deep neural network, which is improved in that the system comprises:
the detection module is used for acquiring the first position of the target to be detected from the image acquired frame by frame;
and the tracking module is used for taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to the candidate area corresponding to the target position to be detected in the current frame of image, and returning to the detection module if the target tracking fails.
Preferably, the condition for determining the target tracking failure includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is a time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
Further, the value range of the positive integer multiple is [1, 100].
Preferably, the tracking module includes:
the acquisition unit is used for acquiring a candidate area corresponding to the position area of the target to be detected in the current frame image;
the searching unit is used for searching a region consistent with a candidate region corresponding to the position region of the target to be detected in the current frame image in the next frame image by utilizing a kernel correlation filtering algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and the storage unit is used for storing the position area of the target to be detected in each frame of image into the target position set.
Further, the tracking module further includes:
the judging unit is used for judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
Further, the search unit is specifically configured to:
expanding the candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple, and using the expanded area as the area of the next frame image within which the region consistent with that candidate area is searched;
wherein, the value range of the preset multiple is [1.5,3 ].
Preferably, the detection module is specifically configured to: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
Further, the training process of the pre-trained target detection deep neural network model includes:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
Compared with the closest prior art, the invention has the following beneficial effects:
in the technical scheme provided by the invention, the main implementation steps comprise the steps of acquiring the first position of a target to be detected from images acquired frame by frame; taking the first position as an initial target position of target tracking, and continuously determining a target position in the next frame of image according to a candidate area corresponding to a target position to be detected in the current frame of image; if the target tracking fails, acquiring the first position of the target to be detected from the image acquired frame by frame again and tracking; the technical scheme provided by the invention can monitor the acquired video in real time, thereby improving the target detection efficiency and accuracy; meanwhile, the target detection method adopting the deep neural network in the technical scheme provided by the invention has small calculated amount and higher practicability.
The technical scheme provided by the invention also provides a trained target detection deep neural network model: forward reasoning is performed on video data acquired frame by frame to obtain the position area of the target to be detected; the candidate area corresponding to that position area in the current video frame, and the area in the next video frame consistent with it, are obtained to determine the target's position area; and the next operation is decided according to the judgment condition for target tracking failure. This scheme keeps the high precision of the deep-neural-network target detection algorithm while overcoming its low speed, and monitors the acquired video in real time. When the tracking algorithm fails, the detection algorithm promptly re-acquires the correct target position, and the multi-scale tracking algorithm can track targets at multiple scales; when several targets appear in the video, the tracking algorithm prevents the detection algorithm's target frame from jumping between different targets. Meanwhile, the scheme has a small computational load, needs no massive computing power such as a GPU graphics card, can be deployed on the on-board computer of a small unmanned aerial vehicle, and has important application value.
Drawings
Fig. 1 is a flowchart of a method for detecting the ground of an unmanned aerial vehicle based on a deep neural network according to embodiment 1 of the present invention;
fig. 2 is a flowchart of a specific implementation of a method for detecting the ground of an unmanned aerial vehicle based on a deep neural network according to embodiment 2 of the present invention;
FIG. 3 is a training flowchart of a deep neural network-based target detection model provided in embodiment 2 of the present invention;
FIG. 4 is a flowchart of real-time target detection based on a deep neural network provided in embodiment 2 of the present invention;
FIG. 5 is a flowchart of target tracking based on deep neural network provided in embodiment 2 of the present invention;
fig. 6 is a structural diagram of a ground detection system of a drone based on a deep neural network according to embodiment 3 of the present invention.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
The invention provides an unmanned aerial vehicle ground detection method based on a deep neural network which, as shown in Fig. 1, comprises the following steps:
Step 1) acquiring a first position of a target to be detected from images acquired frame by frame;
and step 2) taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to a candidate area corresponding to the target position to be detected in the current frame of image, and returning to the step 1) when the target tracking fails.
In an embodiment of the present invention, the determination condition of the target tracking failure in step 2) includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is the time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
Wherein, the value range of the positive integer multiple is [1, 100].
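As an illustrative sketch only (all names are hypothetical, using the t_1, t_2 and α notation of embodiment 2 below), the two judgment conditions can be checked as follows:

    # Sketch of the tracking-failure check (illustrative names only).
    # t_detect: duration of one detection pass (t_1)
    # t_track:  tracking time accumulated since the last detection (t_2)
    # alpha:    positive integer chosen from [1, 100]
    def tracking_failed(t_detect, t_track, alpha, target_found):
        if not target_found:                # condition 2: target lost in the current frame
            return True
        return t_track >= alpha * t_detect  # condition 1: t_2 has reached alpha * t_1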
In the embodiment of the present invention, the step 2) of continuously determining the target position in the next frame image according to the candidate region corresponding to the target position to be detected in the current frame image includes:
acquiring a candidate area corresponding to a position area of a target to be detected in a current frame image;
searching a region consistent with a candidate region corresponding to a position region of the target to be detected in the current frame image in the next frame image by using a Kernel Correlation Filter (KCF) algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and storing the position area of the target to be detected in each frame of image into a target position set.
After the position area of the target to be detected in each frame of image is saved to the target position set in step 2), the method further comprises the following steps:
judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
In the embodiment of the present invention, searching for a region consistent with a candidate region corresponding to a position region of a target to be detected in a current frame image by using a KCF algorithm in a next frame image includes:
expanding the candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple, and using the expanded area as the area of the next frame image within which the region consistent with that candidate area is searched;
wherein, the value range of the preset multiple is [1.5,3 ].
In an embodiment of the present invention, the acquiring the first position of the target to be measured in step 1) includes: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
The training process of the pre-trained target detection deep neural network model comprises the following steps:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
Example 2
The embodiment provides a specific implementation process of an unmanned aerial vehicle ground detection method based on a deep neural network; as shown in Fig. 2, the method includes:
step (1) training a target detection deep neural network model to obtain a model file and a weight file;
step (2) collecting video data frame by frame;
step (3) recording the current time t10
step (4) performing forward reasoning on the video data acquired frame by frame, based on the model file and the weight file of the trained target detection deep neural network, to obtain the position area of the target to be detected;
step (5) recording the current time t11And determining the detection time t of the detection target according to the following formula1:t1=t11-t10
step (6) numbering the video frame in which the position area of the target to be detected was obtained as 1, and sequentially numbering the frames of the frame-by-frame video data that follow this initial video frame;
step (7) recording the current time t_20;
Step (8) initializing i to 1;
step (9) initializing j to 1;
step (10) obtaining the candidate area corresponding to the position area of the target to be detected in the video frame numbered i, and finding in the video frame numbered i + 1 the area consistent with that candidate area, which is taken as the candidate area corresponding to the position area of the target to be detected in the video frame numbered i + 1;
step (11) acquiring a position area of a target to be detected in a candidate area corresponding to the position area of the target to be detected in the video frame with the number of i +1, setting j to j +1, and storing the position area of the target to be detected as a jth target position in a target position set;
step (12) judging whether j is larger than the preset target position number K in the target position set, if not, outputting the jth position of the target to be detected, and executing step (13); if yes, abandoning the earliest stored target position in the target position set, outputting the jth position of the target to be detected, and executing the step (13);
step (13) recording the current time t_21 and determining the target tracking duration t_2 according to the following formula: t_2 = t_21 - t_20;
step (14) if t_2 ≥ α·t_1, going to step (4); if t_2 < α·t_1, letting i = i + 1 and executing step (9); a consolidated sketch of this detect-track loop is given below.
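Taken together, steps (2)-(14) alternate one slow, accurate detection pass with a budgeted run of the fast tracker. The following is a sketch under stated assumptions: detector.detect and tracker.init/tracker.update are hypothetical wrappers around the trained detection model and the KCF tracker, and alpha and K are the parameters defined above.

    import time
    from collections import deque

    def detect_and_track(frames, detector, tracker, alpha=5, K=50):
        positions = deque(maxlen=K)            # target position set, steps (11)-(12)
        it = iter(frames)
        for frame in it:
            t10 = time.time()                  # step (3)
            box = detector.detect(frame)       # step (4): forward inference
            t1 = time.time() - t10             # step (5): t_1 = t_11 - t_10
            tracker.init(frame, box)
            t20 = time.time()                  # step (7)
            for frame in it:                   # steps (9)-(13): track frame by frame
                ok, box = tracker.update(frame)
                if not ok:                     # target lost: fall back to detection
                    break
                positions.append(box)          # earliest position dropped when full
                if time.time() - t20 >= alpha * t1:
                    break                      # step (14): t_2 >= alpha * t_1, re-detect
        return positions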
Preferably, step (1) comprises:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and obtaining a model file and a weight file of the trained target detection deep neural network.
Preferably, step (4) comprises:
and sequentially reading a label corresponding to the target to be detected, a trained model file and a weight file of the target detection deep neural network and video data acquired frame by using the forward reasoning frame, and acquiring the position of the target to be detected output by the forward reasoning frame.
In an embodiment of the present invention, the off-line training of the target detection deep neural network includes:
step A-1, for the specific target to be detected and tracked, labeling video data of the same type, and performing offline training of the deep neural network with the labeled data on a GPU server or another computer with strong performance;
step A-2, decomposing the same type of video data acquired by the unmanned aerial vehicle into images; the number of images should be as large as possible, usually not less than 10,000, to avoid overfitting and improve generalization; labeling the targets (automobiles, people, tanks, unmanned aerial vehicles and the like) in each image; specifically: framing each target with a rectangular frame and recording, in a fixed format, the pixel coordinates of the rectangle's top-left and bottom-right corners, or the top-left vertex coordinates together with the rectangle's length and width, and the corresponding target label;
step A-3, building a deep neural network training platform (TensorFlow, Darknet, Caffe and the like), setting parameters such as the training batch size and the learning rate, reading a deep neural network model such as MobileNet-SSD, and updating the parameters of the deep neural network model of the specific target detection algorithm on the labeled data;
and step A-4, after training for a specified number of iterations (more than 10,000 rounds), saving the deep neural network training model to obtain its model file and weight file.
Secondly, detecting a target:
b-1, loading video data and reading video frames;
step B-2, recording the current time t_10 of timer 1;
And B-3, loading a pre-training model based on a deep learning algorithm, and detecting a specific target on the read video frame by utilizing a deep learning forward reasoning mechanism: reading a target class label, a pre-training parameter model file, a weight file and a video frame to be detected, and carrying out forward reasoning on a new video frame to obtain target position information and confidence;
step B-4, recording the current time t_11 of timer 1, computing t_1 = t_11 - t_10, and simultaneously transmitting the detected target position to the target tracker.
And finally, tracking the target:
and C-1, initializing by the target tracker by taking the target position detected by the target detector as a tracking starting point.
step C-2, recording the current time t_20 of timer 2;
C-3, tracking the target by a tracking algorithm, and updating the target position on a new video frame: determining the position of a candidate region of a current frame, and extracting the characteristics of the candidate region; searching a region which is most matched with the candidate region characteristics in a subsequent video frame as a target tracking object; enclosing the object by a rectangular frame to be used as a tracking result and storing the target position to a target position set;
c-4, if the number of the historical positions stored in the target position set of the same target exceeds K, abandoning the target position stored firstly, and outputting and displaying the latest position of the tracking target on the video image;
step C-5, recording the current time t_21 of timer 2 and calculating the target tracking duration t_2 = t_21 - t_20; if t_2 ≥ α·t_1, going to step B-3; if t_2 < α·t_1, going to step C-3.
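As one concrete possibility (OpenCV ships a KCF tracker in its contrib modules; the exact factory function varies by version), steps C-1 to C-3 might be sketched as follows; all names are illustrative:

    import cv2

    def track_from_detection(first_frame, detected_box, frames):
        # detected_box: (x, y, w, h) handed over by the target detector (step C-1)
        tracker = cv2.TrackerKCF_create()    # cv2.TrackerKCF.create() on newer builds
        tracker.init(first_frame, tuple(detected_box))
        for frame in frames:                 # step C-3: update on each new video frame
            ok, box = tracker.update(frame)
            if not ok:
                break                        # tracking failed: return to the detector
            yield tuple(int(v) for v in box) # rectangle enclosing the tracked object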
Preferably, the obtaining of the candidate region corresponding to the position region of the target to be detected in the video frame with the number i includes:
and expanding the position area of the target to be detected in the video frame with the number i by a preset multiple.
Further, the value range of the preset multiple is [1.5, 3].
Preferably, the value range of α in step (14) is [1, 100].
Based on the technical solution provided by the present invention, the embodiment of the present invention further provides a training flowchart of a target detection model based on a deep neural network, as shown in fig. 3:
S1, off-line training of the target detection model:
S11, collecting videos or images of the specific area to be monitored; the collected image or video scenes should resemble the scene of the actual unmanned aerial vehicle monitoring area as closely as possible;
S12, marking the various targets (vehicles, personnel, trees and the like) in the collected videos or images frame by frame; the marking frame is preferably a rectangle, positioned either by its top-left and bottom-right vertices or by its top-left vertex plus the rectangle's width and height; saving the marked coordinates and category labels as xml or txt files in a fixed format, and establishing an index file so that image paths and file names correspond one-to-one with the xml or txt file paths and names (an illustrative annotation line is shown after this list);
S13, selecting a training platform for the deep neural network, which can be, but is not limited to, Caffe, TensorFlow, PyTorch or Darknet;
S14, selecting a target detection deep neural network, including but not limited to the MobileNet-SSD detection network; setting parameters such as the training batch size and the learning rate; reading the training images and the corresponding xml or txt files according to the index file; and training with the labeled data on the platform selected in S13;
and S15, performing N rounds of training on the acquired data in the training process of S14, where N is usually not less than 10,000, and saving the obtained model file for the subsequent real-time target detection process.
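For illustration only, since the fixed format itself is not prescribed above, a txt annotation file for one image, one target per line, could look like the following (file names and the column order are assumptions):

    # frame_000123.txt - assumed columns: label x_min y_min x_max y_max
    vehicle 412 208 486 251
    person 95 310 118 372
    # assumed index file entry: images/frame_000123.jpg annotations/frame_000123.txt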
Based on the technical scheme provided by the invention, the embodiment of the invention also provides a target real-time detection flow chart based on the deep neural network, as shown in fig. 4:
S2, online real-time target detection:
S21, reading the camera's video or image data frame by frame in real time on the unmanned aerial vehicle;
S22, recording the current time t_10 of timer 1;
S23, running a lightweight forward inference framework convenient to deploy on a mobile platform, including but not limited to the OpenCV DNN module, TensorRT, Tencent NCNN and Tengine;
S24, reading the model and weight file trained and saved in S15, detecting the selected targets on the frame-by-frame video or images, and acquiring and outputting the corresponding target position rectangle, confidence and category label (a sketch of S23-S24 follows this list);
S25, recording the current time t_11 of timer 1.
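A sketch of S23-S24 using one of the listed frameworks, the OpenCV DNN module, with a Caffe-format MobileNet-SSD; the file names, input size and confidence threshold are all assumptions:

    import cv2

    net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "mobilenet_ssd.caffemodel")

    def detect_targets(frame, conf_threshold=0.5):
        h, w = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)
        net.setInput(blob)
        detections = net.forward()            # one forward inference pass (S24)
        results = []
        for i in range(detections.shape[2]):  # rows: [_, label, conf, x1, y1, x2, y2]
            conf = float(detections[0, 0, i, 2])
            if conf >= conf_threshold:
                label = int(detections[0, 0, i, 1])
                x1, y1, x2, y2 = detections[0, 0, i, 3:7] * (w, h, w, h)
                results.append((label, conf, (int(x1), int(y1), int(x2), int(y2))))
        return results                        # position rectangle, confidence, label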
Based on the technical solution provided by the present invention, an embodiment of the present invention further provides a target tracking flow chart based on a deep neural network, as shown in fig. 5:
S3, specific steps of target tracking:
S31, taking the target position rectangle saved in S25 as the initial value of the target tracking algorithm and initializing the tracker on the current video frame; the KCF target tracking algorithm is preferably selected, and the initial target position is saved;
S32, recording the current time t_20 of timer 2;
S33, according to the initial target position, determining in the current frame a template area larger than the target frame (generally 1.5-3 times the size of the target frame) by the KCF algorithm, and obtaining the different displacement templates of this area with a circulant matrix, shifting along the x-axis and the y-axis respectively with the following circulant (permutation) matrix:
P = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}
S34, extracting the features of the different displacement templates and multiplying them by a Hanning window to obtain the target template, then calculating the Gaussian kernel of the target template; the Hanning window is determined according to the formula:

w(n) = \frac{1}{2}\left(1 - \cos\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1,

wherein N is the window width;
S35, calculating the target position of the target template in the image through Fourier transform and computing a new target template from that position; calculating the Gaussian response map of the new target template, training the ridge regression model in the frequency domain, and updating the target template and the classifier parameter values (the frequency-domain formulas are summarized after this list);
S36, outputting and storing the target position;
S37, according to the target position, determining in the newly acquired frame a template area larger than the target frame (generally 1.5-3 times the size of the target frame) and obtaining the different displacement templates of this area with the circulant matrix;
S38, extracting the features of the different displacement templates and multiplying them by the Hanning window to obtain the target template;
S39, calculating the Gaussian kernel from the target template and computing the response map with the classifier parameter values to obtain the target position; calculating the Gaussian kernel of the new target template, training the ridge regression model in the frequency domain, and updating the target template and the classifier parameter values;
S40, outputting and saving the target position;
S41, checking the number of saved target positions: if it exceeds the preset number K of target positions, abandoning the earliest saved position and outputting the remaining positions; otherwise, directly outputting the saved positions;
S42, recording the current time t_21 of timer 2 and calculating the target tracking duration t_2 = t_21 - t_20; if t_2 ≥ α·t_1, going to step S23; if t_2 < α·t_1, going to step S37.
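The frequency-domain computations referenced in S35 and S39 follow the standard kernel correlation filter formulation and can be written compactly as (here α denotes the ridge regression solution, not the timing multiple of S42):

    \hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}, \qquad f(z) = \mathcal{F}^{-1}\left( \hat{k}^{xz} \odot \hat{\alpha} \right)

where a hat denotes the discrete Fourier transform, y is the desired Gaussian response map, k^{xx} is the Gaussian kernel correlation of the target template x with itself, k^{xz} is its kernel correlation with the candidate patch z, λ is the ridge regression regularization weight, ⊙ is element-wise multiplication, and the new target position is taken at the maximum of the response map f(z).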
Example 3
The embodiment provides an unmanned aerial vehicle ground detection system based on a deep neural network, as shown in fig. 6, including:
the detection module is used for acquiring the first position of the target to be detected from the image acquired frame by frame;
and the tracking module is used for taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to the candidate area corresponding to the target position to be detected in the current frame of image, and returning to the detection module when the target tracking fails.
Preferably, the condition for determining the target tracking failure includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is the time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
Further, the value range of the positive integer multiple is [1, 100].
Preferably, the tracking module comprises:
the acquisition unit is used for acquiring a candidate area corresponding to the position area of the target to be detected in the current frame image;
the searching unit is used for searching a region consistent with a candidate region corresponding to the position region of the target to be detected in the current frame image in the next frame image by utilizing a kernel correlation filtering algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and the storage unit is used for storing the position area of the target to be detected in each frame of image into the target position set.
Further, the tracking module further comprises:
the judging unit is used for judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
Further, the search unit is specifically configured to:
expanding the candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple, and using the expanded area as the area of the next frame image within which the region consistent with that candidate area is searched;
wherein, the value range of the preset multiple is [1.5,3 ].
Preferably, the detection module is specifically configured to: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
Further, the training process of the pre-trained target detection deep neural network model comprises the following steps:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
The unmanned aerial vehicle ground detection system provided by the embodiment of the invention, or electronic equipment loaded with the unmanned aerial vehicle ground detection method, can be deployed on an unmanned aerial vehicle to realize monitoring and tracking of the target.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (16)

1. An unmanned aerial vehicle ground detection method based on a deep neural network is characterized by comprising the following steps:
step 1) acquiring a first position of a target to be detected from images acquired frame by frame;
and step 2) taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to a candidate area corresponding to the target position to be detected in the current frame of image, and returning to the step 1) when the target tracking fails.
2. The method according to claim 1, wherein the judgment condition of the target tracking failure includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is a time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
3. The method of claim 2, wherein the positive integer multiple ranges over [1, 100].
4. The method of claim 1, wherein said continuously determining the target position in the next frame of image according to the candidate region corresponding to the target position to be detected in the current frame of image comprises:
acquiring a candidate area corresponding to a position area of a target to be detected in a current frame image;
searching a region consistent with a candidate region corresponding to the position region of the target to be detected in the current frame image in the next frame image by using a kernel correlation filtering algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and storing the position area of the target to be detected in each frame of image into a target position set.
5. The method of claim 4, wherein after saving the position area of the target to be detected in each frame of image to the target position set, the method further comprises:
judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
6. The method as claimed in claim 4, wherein said searching for a region in the next frame image consistent with the candidate region corresponding to the position region of the target to be detected in the current frame image by using the kernel correlation filtering algorithm comprises:
expanding the candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple, and using the expanded area as the area of the next frame image within which the region consistent with that candidate area is searched;
wherein, the value range of the preset multiple is [1.5,3 ].
7. The method of claim 1, wherein the step 1) comprises: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
8. The method of claim 1, wherein the training process of the pre-trained target detection deep neural network model comprises:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
9. An unmanned aerial vehicle ground detection system based on a deep neural network, the system comprising:
the detection module is used for acquiring the first position of the target to be detected from the image acquired frame by frame;
and the tracking module is used for taking the first position as an initial target position of target tracking, continuously determining the target position in the next frame of image according to the candidate area corresponding to the target position to be detected in the current frame of image, and returning to the detection module when the target tracking fails.
10. The system according to claim 9, wherein the judgment condition of the target tracking failure includes any one of:
when the target tracking duration reaches a positive integral multiple of the target detection duration; or
when the target to be detected is not detected in the current frame image;
the target detection duration is a time interval from inputting the images acquired frame by frame into a pre-trained target detection deep neural network model to obtaining the first position of the target to be detected in the images;
the target tracking duration is the time interval for obtaining the position of the target to be detected in every two frames of images.
11. The system of claim 10, wherein the positive integer multiple ranges over [1, 100].
12. The system of claim 9, wherein the tracking module comprises:
the acquisition unit is used for acquiring a candidate area corresponding to the position area of the target to be detected in the current frame image;
the searching unit is used for searching a region consistent with a candidate region corresponding to the position region of the target to be detected in the current frame image in the next frame image by utilizing a kernel correlation filtering algorithm, and taking the region as the candidate region corresponding to the position region of the target to be detected in the next frame image;
and the storage unit is used for storing the position area of the target to be detected in each frame of image into the target position set.
13. The system of claim 12, wherein the tracking module further comprises:
the judging unit is used for judging whether the number of the position areas of the target to be detected, which is stored in the target position set, exceeds a preset number threshold K or not;
if not, outputting the position area of the target to be detected which is currently stored;
and if so, updating the number of the position areas of the target to be detected in the target position set in a mode of abandoning the position area saved earliest, and outputting the updated position area of the target to be detected.
14. The system of claim 12, wherein the lookup unit is specifically configured to:
expanding a candidate area corresponding to the position area of the target to be detected in the current frame image by a preset multiple to serve as an area where the candidate area corresponding to the position area of the target to be detected in the next frame image is consistent with the candidate area corresponding to the position area of the target to be detected in the current frame image;
wherein, the value range of the preset multiple is [1.5,3 ].
15. The system of claim 9, wherein the detection module is specifically configured to: and inputting the video images acquired frame by frame to a pre-trained target detection deep neural network model, executing forward reasoning of the deep neural network, and acquiring the first position of the target to be detected in the images.
16. The system of claim 9, wherein the training process of the pre-trained target detection deep neural network model comprises:
carrying out frame-by-frame labeling on various targets in the historical video data acquired frame-by-frame;
constructing training data by using the historical video data labeled frame by frame, and training a target detection deep neural network model by using the training data;
and acquiring a pre-trained target detection deep neural network model.
CN202011285176.8A 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on deep neural network Pending CN112487889A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011285176.8A CN112487889A (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011285176.8A CN112487889A (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on deep neural network

Publications (1)

Publication Number Publication Date
CN112487889A 2021-03-12

Family

ID=74930890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011285176.8A Pending CN112487889A (en) 2020-11-17 2020-11-17 Unmanned aerial vehicle ground detection method and system based on deep neural network

Country Status (1)

Country Link
CN (1) CN112487889A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107767405A (en) * 2017-09-29 2018-03-06 华中科技大学 A kernel correlation filter target tracking method fused with convolutional neural networks
CN108346159A (en) * 2018-01-28 2018-07-31 北京工业大学 A visual target tracking method based on tracking-learning-detection

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113296540A (en) * 2021-05-20 2021-08-24 北京航空航天大学 Hybrid intelligent following and obstacle avoiding method suitable for indoor unmanned aerial vehicle
CN113296540B (en) * 2021-05-20 2022-07-12 北京航空航天大学 Hybrid intelligent following and obstacle avoiding method suitable for indoor unmanned aerial vehicle

Similar Documents

Publication Publication Date Title
US11205274B2 (en) High-performance visual object tracking for embedded vision systems
CN110443969B (en) Fire detection method and device, electronic equipment and storage medium
CN106874854B (en) Unmanned aerial vehicle tracking method based on embedded platform
Yang et al. Deep concrete inspection using unmanned aerial vehicle towards cssc database
CN110555901B (en) Method, device, equipment and storage medium for positioning and mapping dynamic and static scenes
EP2917874B1 (en) Cloud feature detection
CN109584213B (en) Multi-target number selection tracking method
CN110021033B (en) Target tracking method based on pyramid twin network
US9767570B2 (en) Systems and methods for computer vision background estimation using foreground-aware statistical models
US8446468B1 (en) Moving object detection using a mobile infrared camera
CN111932588A (en) Tracking method of airborne unmanned aerial vehicle multi-target tracking system based on deep learning
CN107871324B (en) Target tracking method and device based on double channels
CN111326023A (en) Unmanned aerial vehicle route early warning method, device, equipment and storage medium
CN111679695B (en) Unmanned aerial vehicle cruising and tracking system and method based on deep learning technology
CN112115975B (en) Deep learning network model rapid iterative training method and equipment suitable for monitoring device
Wu et al. Multivehicle object tracking in satellite video enhanced by slow features and motion features
CN114511792B (en) Unmanned aerial vehicle ground detection method and system based on frame counting
CN115861860B (en) Target tracking and positioning method and system for unmanned aerial vehicle
CN117036989A (en) Miniature unmanned aerial vehicle target recognition and tracking control method based on computer vision
CN112487892B (en) Unmanned aerial vehicle ground detection method and system based on confidence
CN112487889A (en) Unmanned aerial vehicle ground detection method and system based on deep neural network
EP2731050A1 (en) Cloud feature detection
Chandana et al. Autonomous drones based forest surveillance using Faster R-CNN
CN116486290B (en) Unmanned aerial vehicle monitoring and tracking method and device, electronic equipment and storage medium
CN116580056A (en) Ship detection and tracking method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination