CN116309727A - Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm - Google Patents
- Publication number
- CN116309727A (application number CN202310604700.0A)
- Authority
- CN
- China
- Prior art keywords
- unmanned aerial
- aerial vehicle
- target
- module
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 45
- 238000004088 simulation Methods 0.000 title claims abstract description 36
- 238000013135 deep learning Methods 0.000 title claims abstract description 21
- 238000012549 training Methods 0.000 claims abstract description 11
- 238000010586 diagram Methods 0.000 claims description 30
- 230000005540 biological transmission Effects 0.000 claims description 20
- 238000011176 pooling Methods 0.000 claims description 18
- 238000004891 communication Methods 0.000 claims description 10
- 230000006870 function Effects 0.000 claims description 9
- 230000004913 activation Effects 0.000 claims description 3
- 238000006243 chemical reaction Methods 0.000 claims description 3
- 239000011159 matrix material Substances 0.000 claims description 3
- 230000007246 mechanism Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000012545 processing Methods 0.000 claims description 3
- 230000009466 transformation Effects 0.000 claims description 3
- 238000012546 transfer Methods 0.000 claims 1
- 238000013461 design Methods 0.000 abstract description 2
- 230000008569 process Effects 0.000 description 13
- 238000012360 testing method Methods 0.000 description 5
- 230000033001 locomotion Effects 0.000 description 4
- 238000012935 Averaging Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Remote Sensing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an unmanned aerial vehicle target tracking method based on a deep learning algorithm, which comprises: obtaining existing image data for unmanned aerial vehicle target tracking and constructing a training data set; constructing a bounding-box prediction model for unmanned aerial vehicle target tracking and training it to obtain a final bounding-box prediction model; inputting a template image of the target and a search image acquired by the unmanned aerial vehicle in real time into the bounding-box prediction model to obtain a target bounding box; tracking the target with the target bounding box; and repeating these steps so that the unmanned aerial vehicle tracks the target in real time. The invention also discloses a simulation system implementing the unmanned aerial vehicle target tracking method based on the deep learning algorithm. Through the design of an innovative target tracking method, the invention realizes unmanned aerial vehicle tracking of a target with high reliability, good accuracy and high efficiency; meanwhile, the simulation system provided by the invention improves the adaptability of unmanned aerial vehicle target tracking to complex scenes.
Description
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to an unmanned aerial vehicle target tracking method and a simulation system based on a deep learning algorithm.
Background
With the development of the economy and of technology, unmanned aerial vehicles have become widely used in people's work and daily life, bringing great convenience. Unmanned aerial vehicle target tracking is a hotspot problem in the unmanned aerial vehicle application field; its aim is to control the unmanned aerial vehicle to track a target continuously so as to achieve a specific purpose, such as target tracking, demarcation of the target motion track, or follow-shooting of the target.
In the field of unmanned aerial vehicle target tracking, two main problems currently exist. Firstly, during the target tracking process of the unmanned aerial vehicle, conditions such as rapid movement, scale change, severe deformation and occlusion of the target greatly interfere with tracking, increasing the tracking difficulty and reducing the tracking accuracy. Secondly, in the testing (or trial-and-error) of tracking algorithms for unmanned aerial vehicle target tracking, the test cost, and especially the trial-and-error cost, is high.
At present, in the target tracking process of a traditional unmanned aerial vehicle, the features of the target must first be extracted manually; this process is time-consuming and labor-intensive, and during implementation it can cause a series of problems such as weak anti-interference capability, poor real-time performance and easy loss of the target. This makes the reliability and accuracy of existing unmanned aerial vehicle target tracking schemes poor. Moreover, the traditional unmanned aerial vehicle test or trial-and-error process currently lacks a corresponding, reliable simulation system, which greatly increases the feasibility-testing difficulty and test cost of unmanned aerial vehicle tracking schemes.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle target tracking method based on a deep learning algorithm that offers high reliability, good accuracy and high efficiency.
The second object of the invention is to provide a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
The step S2 specifically comprises the following steps:
firstly, processing an input template image and a search image through a backbone network;
constructing a fourth similarity graph and a fifth similarity graph according to an output image of the backbone network, and constructing an APN network feature graph based on the fourth similarity graph, the fifth similarity graph, the pooling layer and the forward propagation network;
constructing and obtaining a channel attention network characteristic diagram according to an output image of the backbone network and a pooling layer;
based on the APN network feature map and the channel attention network feature map, calculating to obtain a final classification feature map;
and carrying out classification and regression operation on the finally obtained classification feature map to obtain a final boundary frame prediction result.
The step S2 specifically comprises the following steps:
A. inputting the template image z and the search image x into the backbone network respectively; the backbone network is an AlexNet network;
B. performing a depth-wise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network for the template image z, $\varphi_4(x)$ is the output of the fourth layer of the backbone network for the search image x, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation;
performing a convolution operation and a depth-wise cross-correlation operation on the fifth-layer outputs of the backbone network to obtain a fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{\mathrm{APN}}$, which combines the two similarity maps through a first learning weight $\lambda_1$, a second learning weight $\lambda_2$, a forward propagation network $\mathrm{FFN}$, a global average pooling operation $\mathrm{GAP}$, and splicing in the channel direction;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, passing the fifth similarity map $S_5$ through three different convolution layers to generate three feature maps, namely a q feature map, a k feature map and a v feature map, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then reshaping $q$ and $k$ to the scale $\mathbb{R}^{C \times HW}$, performing a matrix multiplication on the two reshaped matrices, and obtaining a spatial attention map through a softmax layer (softmax being an exponential normalization function); finally, the spatial attention output is obtained from the attention-weighted $v$ feature map with a third learning weight $\lambda_3$; wherein $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
from the spatial attention map and a pooling layer, calculating the channel attention network feature map $F_{\mathrm{CA}}$, wherein $\lambda_4$ is a fourth learning weight, $\sigma$ is a nonlinear activation function, $W$ is an intermediate feature, and $\mathrm{GMP}$ is the global maximum pooling operation;
D. based on the APN network feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{CA}}$, calculating the final classification feature map $R = \lambda_5 F_{\mathrm{APN}} + \lambda_6 F_{\mathrm{CA}}$, wherein $\lambda_5$ is a fifth learning weight and $\lambda_6$ is a sixth learning weight;
E. performing classification and regression operations on the finally obtained classification feature map $R$ to obtain the final bounding-box prediction result.
The classification and regression operation adopts a three-branch structure. The first branch selects the bounding box with the largest intersection-over-union with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box. Finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein the branch losses $L_1$, $L_2$ and $L_3$ are computed with cross entropy and binary cross entropy, and $\omega_1$, $\omega_2$ and $\omega_3$ are the respective weights of the three branches.
The invention also discloses a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm, which comprises a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking module comprises a target bounding-box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target bounding-box prediction module, and the output end of the target bounding-box prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding-box prediction module; the target bounding-box prediction module is used for calculating the target bounding box and uploading the target bounding-box information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and at the same time sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates the linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment from the received data information according to the binocular principle, and sends this information to the flight controller; the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
The physical simulation module is a Gazebo simulation platform.
The flight control information transmission module is a Mavros communication module.
And the flight controller and the flight control information transmission module are communicated through a Mavlink protocol.
According to the unmanned aerial vehicle target tracking method and the simulation system based on the deep learning algorithm, through the design of the innovative target tracking method, unmanned aerial vehicle tracking of the target is realized, and the unmanned aerial vehicle target tracking method and the simulation system based on the deep learning algorithm are high in reliability, good in accuracy and high in efficiency; meanwhile, based on the simulation system provided by the invention, the complex scene adaptability of unmanned aerial vehicle target tracking can be improved.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of functional modules of the system of the present invention.
FIG. 3 is a schematic diagram of a target template according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a simulation scenario and an unmanned aerial vehicle according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of adjusting an unmanned aerial vehicle to a proper position according to an embodiment of the system of the present invention.
FIG. 6 is a schematic diagram of a tracking process according to an embodiment of the present invention.
Detailed Description
A schematic process flow diagram of the method of the present invention is shown in fig. 1: the unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network (Artificial Neural Network), an APN network (Anchor Proposal Network, i.e. an anchor/region proposal network) and an attention mechanism module; the method specifically comprises the following steps:
firstly, processing an input template image and a search image through a backbone network;
constructing a fourth similarity graph and a fifth similarity graph according to an output image of the backbone network, and constructing an APN network feature graph based on the fourth similarity graph, the fifth similarity graph, the pooling layer and the forward propagation network;
constructing and obtaining a channel attention network characteristic diagram according to an output image of the backbone network and a pooling layer;
based on the APN network feature map and the channel attention network feature map, calculating to obtain a final classification feature map;
classifying and regressing the finally obtained classification characteristic diagram to obtain a final boundary frame prediction result;
the specific implementation is realized by the following steps:
A. inputting the template image z and the search image x into the backbone network respectively; the backbone network is an AlexNet network;
B. performing a depth-wise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network for the template image z, $\varphi_4(x)$ is the output of the fourth layer of the backbone network for the search image x, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation; the purpose of the convolution operation is to reduce the number of channels of the two feature maps;
performing a convolution operation and a depth-wise cross-correlation operation on the fifth-layer outputs of the backbone network to obtain a fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{\mathrm{APN}}$, which combines the two similarity maps through a first learning weight $\lambda_1$, a second learning weight $\lambda_2$, a forward propagation network $\mathrm{FFN}$, a global average pooling operation $\mathrm{GAP}$, and splicing in the channel direction; the purpose of the APN network feature map is to maintain the cross-interdependency similarity;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, passing the fifth similarity map $S_5$ through three different convolution layers to generate three feature maps, namely a q feature map, a k feature map and a v feature map, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then reshaping $q$ and $k$ to the scale $\mathbb{R}^{C \times HW}$, performing a matrix multiplication on the two reshaped matrices, and obtaining a spatial attention map through a softmax layer (softmax being an exponential normalization function); finally, the spatial attention output is obtained from the attention-weighted $v$ feature map with a third learning weight $\lambda_3$; wherein $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
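A minimal numpy sketch of this spatial-attention step follows. The 1×1 convolutions are modelled as plain channel-mixing matrices, and the residual form `lam * out + s5` is an assumption about how the third learning weight is applied, since the patent text does not fully specify it:

```python
import numpy as np

def softmax(a, axis=-1):
    # numerically stable exponential normalization
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(s5, Wq, Wk, Wv, lam):
    """q/k/v come from three 1x1 convolutions (channel-mixing matrices);
    q and k are reshaped to (C, H*W), softmax(q^T k) gives an (HW, HW)
    attention map that re-weights v; lam is the learnable weight."""
    C, H, W = s5.shape
    flat = s5.reshape(C, H * W)
    q, k, v = Wq @ flat, Wk @ flat, Wv @ flat      # each (C, HW)
    attn = softmax(q.T @ k, axis=-1)               # (HW, HW), rows sum to 1
    out = (v @ attn.T).reshape(C, H, W)            # aggregate values
    return lam * out + s5                          # weighted residual

rng = np.random.default_rng(1)
C, H, W = 4, 5, 5
s5 = rng.standard_normal((C, H, W))
Wq, Wk, Wv = (rng.standard_normal((C, C)) for _ in range(3))
print(spatial_attention(s5, Wq, Wk, Wv, 0.1).shape)  # (4, 5, 5)
```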
from the spatial attention map and a pooling layer, calculating the channel attention network feature map $F_{\mathrm{CA}}$, wherein $\lambda_4$ is a fourth learning weight, $\sigma$ is a nonlinear activation function, $W$ is an intermediate feature, and $\mathrm{GMP}$ is the global maximum pooling operation;
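The channel-attention step can be sketched as follows. The squeeze-MLP shapes and the combination of global average and global max pooling are a common construction assumed here for illustration; the patent only names the operations, not their exact wiring:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, W1, W2):
    """Global average pooling and global max pooling squeeze each channel
    to a scalar; a small shared MLP (W1, W2) scores the channels, and a
    sigmoid turns the scores into gates that re-scale the feature map."""
    gap = feat.mean(axis=(1, 2))          # (C,) global average pooling
    gmp = feat.max(axis=(1, 2))           # (C,) global maximum pooling
    scores = W2 @ np.maximum(0.0, W1 @ gap) + W2 @ np.maximum(0.0, W1 @ gmp)
    gates = sigmoid(scores)               # (C,) one gate per channel
    return feat * gates[:, None, None]    # channel-wise re-weighting

rng = np.random.default_rng(2)
feat = rng.standard_normal((4, 5, 5))
W1 = rng.standard_normal((2, 4))          # squeeze 4 -> 2 channels
W2 = rng.standard_normal((4, 2))          # expand back 2 -> 4
print(channel_attention(feat, W1, W2).shape)  # (4, 5, 5)
```

Since each gate lies in (0, 1), the operation can only attenuate channels, never amplify them, which is the intended "attention as soft selection" behavior.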
D. based on the APN network feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{CA}}$, calculating the final classification feature map $R = \lambda_5 F_{\mathrm{APN}} + \lambda_6 F_{\mathrm{CA}}$, wherein $\lambda_5$ is a fifth learning weight and $\lambda_6$ is a sixth learning weight;
E. performing classification and regression operations on the finally obtained classification feature map $R$ to obtain the final bounding-box prediction result;
in a specific implementation, the classification and regression operation adopts a three-branch structure: the first branch selects the bounding box with the largest intersection-over-union with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein the branch losses $L_1$, $L_2$ and $L_3$ are computed with cross entropy and binary cross entropy, and $\omega_1$, $\omega_2$ and $\omega_3$ are the corresponding weights of the three branches;
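The selection rule of the first branch and the weighted combination of the three losses can be illustrated in a few lines. The box coordinates, weight values and helper names here are hypothetical examples, not figures from the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def total_loss(l1, l2, l3, w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the three branch losses, mirroring
    L = w1*L1 + w2*L2 + w3*L3 (weight values are placeholders)."""
    return w1 * l1 + w2 * l2 + w3 * l3

# The first branch prefers the candidate with the largest IoU against
# the ground-truth box:
gt = (10, 10, 50, 50)
candidates = [(12, 8, 48, 52), (30, 30, 90, 90)]
best = max(candidates, key=lambda box: iou(box, gt))
print(best)  # (12, 8, 48, 52)
```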
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
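The repeat-until-done structure of steps S4-S6 amounts to a simple loop. `predict_bbox` stands in for the trained bounding-box prediction model and the frame list for the drone's camera feed; both names are illustrative, not taken from the patent:

```python
def track(template, frames, predict_bbox):
    """S6 loop: pair each new search image with the fixed target
    template (S4) and use the predicted bounding box to follow the
    target in that frame (S5)."""
    boxes = []
    for frame in frames:                        # repeat S4-S5 per frame
        box = predict_bbox(template, frame)     # S4: model inference
        boxes.append(box)                       # S5: track with the box
    return boxes

# toy stand-in model that "finds" the target at a fixed location
frames = ["frame0", "frame1", "frame2"]
boxes = track("template", frames, lambda t, f: (0, 0, 10, 10))
print(len(boxes))  # 3
```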
FIG. 2 is a schematic diagram of the functional modules of the system of the present invention: the simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm comprises a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
In a specific implementation, the unmanned aerial vehicle target tracking module comprises a target bounding-box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target bounding-box prediction module, and the output end of the target bounding-box prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding-box prediction module; the target bounding-box prediction module is used for calculating the target bounding box and uploading the target bounding-box information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and at the same time sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates the linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment from the received data information according to the binocular principle, and sends this information to the flight controller; the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
Meanwhile, the physical simulation module is a Gazebo simulation platform; the flight control information transmission module is a Mavros communication module; and the flight controller and the flight control information transmission module are communicated through a Mavlink protocol.
The simulation system provided by the invention is further described below:
a target template frame needs to be provided for the algorithm before the system is started, and in this example, the target template is shown in fig. 3;
the workflow of the simulation system is described in detail below:
firstly, the virtual scene simulation and the unmanned aerial vehicle model are started; the scene after running is shown in fig. 4, with the unmanned aerial vehicle positioned in the center of a road and a pedestrian slowly moving forward on the left side of the unmanned aerial vehicle;
then, Mavros communication is established; Mavros sets up communication between the keyboard and the unmanned aerial vehicle, and provides the world coordinate information of the unmanned aerial vehicle to the tracker;
a keyboard control node is then started to fly the unmanned aerial vehicle to a proper position where it can observe the target for a long time; after the adjustment is completed, the unmanned aerial vehicle is set to a hover state and the keyboard control node is closed (if left open, it may conflict with the subsequent tracker node);
starting a cradle head control, and providing relative coordinate system information of the unmanned aerial vehicle for a tracker;
starting the tracker node, which receives the information transmitted by Mavros and the cradle head and waits for the target boundary frame information transmitted by the target boundary frame prediction module; the unmanned aerial vehicle remains hovering until the target boundary frame information is received, as shown in fig. 5;
when the target completely appears within the observation range of the unmanned aerial vehicle camera, starting the target boundary frame prediction module; after starting, it predicts the target boundary frame and transmits the target boundary frame information to the tracker; the tracker calculates movement speed information from the information issued by Mavros, the cradle head and the target boundary frame prediction module, and the flight controller controls and adjusts the flight of the unmanned aerial vehicle according to the movement speed information;
after all nodes are started, the system begins to operate, and the above process cycles so that the target is tracked continuously; FIG. 6 is a schematic diagram of a tracking process of the system of the present invention.
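The start-up sequence above amounts to a small state machine: the unmanned aerial vehicle hovers until the first bounding box arrives from the prediction module, then switches to tracking. A sketch for illustration only (the class, mode names, and the placeholder control law are assumptions, not the actual node interfaces):

```python
from enum import Enum, auto

class Mode(Enum):
    HOVER = auto()
    TRACK = auto()

class TrackerNode:
    """Mirrors the described workflow: wait in hover, track once
    bounding-box information starts arriving."""

    def __init__(self):
        self.mode = Mode.HOVER
        self.last_bbox = None

    def on_bbox(self, bbox):
        # The first bounding box from the prediction module ends the hover.
        self.last_bbox = bbox
        self.mode = Mode.TRACK

    def step(self):
        # In hover, command zero velocity; in track, defer to the control law.
        if self.mode is Mode.HOVER:
            return (0.0, 0.0, 0.0)
        return self.control(self.last_bbox)

    def control(self, bbox):
        # Placeholder for the velocity computation; illustrative constant.
        return (1.0, 0.0, 0.0)
```

Keeping the hover mode explicit is what makes it safe to close the keyboard control node: until a box arrives, the tracker itself commands zero velocity.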
Claims (9)
1. The unmanned aerial vehicle target tracking method based on the deep learning algorithm is characterized by comprising the following steps of:
S1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
S2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
S3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
S4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
S5, tracking the target by adopting the target boundary frame obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
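The loop of steps S4-S6 can be sketched as follows, for illustration only; the `predict` and `follow` interfaces are hypothetical stand-ins for the bounding frame prediction model and the tracking controller, not names from the patent:

```python
def track_realtime(model, template, frames, tracker):
    """Run steps S4-S5 for every incoming frame; repeating the loop
    over the frame stream realises step S6 (continuous tracking)."""
    boxes = []
    for frame in frames:                       # real-time search images
        bbox = model.predict(template, frame)  # S4: bounding frame prediction
        tracker.follow(bbox)                   # S5: track using the box
        boxes.append(bbox)
    return boxes
```

Note that the template image is fixed once at start-up (the template frame of fig. 3), while the search image changes every iteration.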
2. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 1, wherein the step S2 specifically comprises the following steps:
firstly, processing the input template image and search image through a backbone network;
constructing a fourth similarity map and a fifth similarity map according to the outputs of the backbone network, and constructing an APN network feature map based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network;
constructing a channel attention network feature map according to the output of the backbone network and a pooling layer;
calculating a final classification feature map based on the APN network feature map and the channel attention network feature map;
and carrying out classification and regression operations on the final classification feature map to obtain the final bounding box prediction result.
3. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 2, wherein the step S2 specifically comprises the following steps:
A. inputting the template image $z$ and the search image $x$ respectively into a backbone network; the backbone network is an AlexNet network;
B. performing a depthwise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \phi\left(\varphi_4(z) \star \varphi_4(x)\right)$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network after the template image $z$ is input, $\varphi_4(x)$ is the output of the fourth layer of the backbone network after the search image $x$ is input, $\star$ is the depthwise cross-correlation operation, and $\phi$ is the convolution operation;
performing a convolution operation and a depthwise cross-correlation operation on the fifth-layer outputs of the backbone network to likewise obtain a fifth similarity map $S_5 = \phi\left(\varphi_5(z)\right) \star \phi\left(\varphi_5(x)\right)$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{APN} = \lambda_1 \cdot S_4 + \lambda_2 \cdot \mathcal{F}\left(\mathrm{GAP}\left(\mathrm{Cat}(S_4, S_5)\right)\right)$, wherein $\lambda_1$ is the first learning weight, $\lambda_2$ is the second learning weight, $\mathcal{F}$ is the forward propagation network, $\mathrm{GAP}$ is the global average pooling operation, and $\mathrm{Cat}$ is concatenation along the channel direction;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, the fifth similarity map $S_5$ generates three feature maps through three different convolution layers, namely a q feature map $Q$, a k feature map $K$ and a v feature map $V$, with $Q, K, V \in \mathbb{R}^{C \times H \times W}$; then $Q$ and $K$ are reshaped to the scale $\mathbb{R}^{C \times N}$, where $N = H \times W$, a matrix multiplication is performed on the two reshaped matrices, and a spatial attention map $M = \mathrm{softmax}\left(Q^{\mathsf{T}} K\right) \in \mathbb{R}^{N \times N}$ is obtained through a softmax layer, softmax being the exponential normalization function; finally, a spatial attention feature map $F_s = \mu \cdot (V \otimes M) + S_5$ is obtained, wherein $\mu$ is the third learning weight and $\otimes$ denotes multiplication with the attention map after reshaping $V$ to $\mathbb{R}^{C \times N}$; here $C$ is the number of channels, $H$ is the height, and $W$ is the width;
from the spatial attention feature map $F_s$ and a pooling layer, the channel attention network feature map is calculated as $F_c = \nu \cdot \left(\sigma(W) \odot F_s\right)$, wherein $\nu$ is the fourth learning weight, $\sigma$ is a nonlinear activation function calculated as $\sigma(t) = \frac{1}{1+e^{-t}}$, and $W$ is an intermediate feature obtained from the pooling layers as $W = \mathrm{GAP}(F_s) + \mathrm{GMP}(F_s)$, $\mathrm{GMP}$ being the global maximum pooling operation;
D. based on the APN network feature map $F_{APN}$ and the channel attention network feature map $F_c$, the final classification feature map $R$ is calculated as $R = \omega_1 \cdot F_{APN} + \omega_2 \cdot F_c$, wherein $\omega_1$ is the fifth learning weight and $\omega_2$ is the sixth learning weight;
E. performing classification and regression operations on the final classification feature map $R$ to obtain the final bounding box prediction result.
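The depthwise cross-correlation of step B, in which each channel of the template feature acts as a filter sliding over the matching channel of the search feature, can be sketched in NumPy for illustration; the valid-mode correlation and the shapes are assumptions about the operation described, not code from the patent:

```python
import numpy as np

def depthwise_xcorr(z_feat, x_feat):
    """Per-channel 'valid' cross-correlation of a template feature map
    z_feat of shape (C, hz, wz) over a search feature map x_feat of
    shape (C, hx, wx). Returns a similarity map of shape
    (C, hx - hz + 1, wx - wz + 1)."""
    C, hz, wz = z_feat.shape
    _, hx, wx = x_feat.shape
    out = np.zeros((C, hx - hz + 1, wx - wz + 1))
    for c in range(C):                        # one filter per channel
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = x_feat[c, i:i + hz, j:j + wz]
                out[c, i, j] = np.sum(patch * z_feat[c])
    return out
```

In a real tracker this would be done on the GPU (e.g. as a grouped convolution), but the loop form makes the per-channel nature of the operation explicit.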
4. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 3, wherein the classification and regression operation adopts a three-branch structure; the first branch selects the bounding box with the largest intersection with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving a total loss function $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, wherein $L_1$ is the cross entropy loss, $L_2$ is the binary cross entropy loss, $L_3$ is the loss of the remaining branch, and $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the respective weights of the three branches.
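For illustration only, a sketch of such a weighted three-branch loss: cross entropy for the classification branch, binary cross entropy for a second branch, and the third branch's loss passed in as a precomputed scalar (treating it as a plain placeholder is an assumption, since the claim does not specify its form):

```python
import numpy as np

def cross_entropy(probs, labels):
    """Multi-class cross entropy over per-point class probabilities.
    probs: (N, num_classes); labels: (N,) integer class indices."""
    eps = 1e-9
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def binary_cross_entropy(p, y):
    """Element-wise binary cross entropy between predictions p and
    binary targets y, averaged over all points."""
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def total_loss(cls_probs, cls_labels, bin_pred, bin_target,
               third_loss, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of the three branch losses."""
    l1, l2, l3 = lambdas
    return (l1 * cross_entropy(cls_probs, cls_labels)
            + l2 * binary_cross_entropy(bin_pred, bin_target)
            + l3 * third_loss)
```

With perfect predictions on the first two branches and a zero third term, the total loss is (numerically) zero, which is a quick sanity check on the implementation.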
5. A simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm according to one of claims 1 to 4, which is characterized by comprising a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
6. The simulation system according to claim 5, wherein the unmanned aerial vehicle target tracking module comprises a target bounding box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target boundary frame prediction module, and the output end of the target boundary frame prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target boundary frame prediction module; the target boundary frame prediction module is used for calculating a target boundary frame and uploading the target boundary frame information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and simultaneously sends world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment according to the received data information and the binocular principle, and sends the information to the flight controller; and the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
7. The simulation system of claim 5 or 6, wherein the physical simulation module is a Gazebo simulation platform.
8. The simulation system of claim 6, wherein the flight control information transmission module is a Mavros communication module.
9. The simulation system of claim 6, wherein the flight controller and the flight control information transfer module communicate via a Mavlink protocol.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310604700.0A CN116309727A (en) | 2023-05-26 | 2023-05-26 | Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116309727A true CN116309727A (en) | 2023-06-23 |
Family
ID=86794627
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200117937A1 (en) * | 2018-10-16 | 2020-04-16 | Samsung Electronics Co., Ltd. | Convolutional neural network for object detection |
KR20210099450A (en) * | 2020-02-04 | 2021-08-12 | Korea Maritime & Ocean University Industry-Academic Cooperation Foundation | Far away small drone detection method Using Deep Learning |
CN113470073A (en) * | 2021-07-06 | 2021-10-01 | Zhejiang University | Animal center tracking method based on deep learning |
CN114266805A (en) * | 2021-12-31 | 2022-04-01 | Southwest Petroleum University | Twin region suggestion network model for unmanned aerial vehicle target tracking |
CN115147459A (en) * | 2022-07-31 | 2022-10-04 | Harbin University of Science and Technology | Unmanned aerial vehicle target tracking method based on Swin Transformer |
CN115578416A (en) * | 2022-10-12 | 2023-01-06 | Shandong University | Unmanned aerial vehicle target tracking method, system, medium and electronic equipment |
Non-Patent Citations (3)
Title |
---|
CHANGHONG FU et al.: "Siamese Anchor Proposal Network for High-Speed Aerial Tracking", arXiv, pages 1-7 *
ZIANG CAO et al.: "SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking", arXiv, pages 1-7 *
LIU Qisheng: "Design and Implementation of a Vision-Based Quadrotor UAV Target Tracking System", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 01, pages 12-22 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102306939B1 (en) | Method and device for short-term path planning of autonomous driving through information fusion by using v2x communication and image processing | |
CN106873585B (en) | A kind of navigation method for searching, robot and system | |
EP3405845B1 (en) | Object-focused active three-dimensional reconstruction | |
WO2022100107A1 (en) | Methods and systems for predicting dynamic object behavior | |
CN111340868B (en) | Unmanned underwater vehicle autonomous decision control method based on visual depth estimation | |
CN111709410B (en) | Behavior identification method for strong dynamic video | |
CN110908399B (en) | Unmanned aerial vehicle autonomous obstacle avoidance method and system based on lightweight neural network | |
CN111368755A (en) | Vision-based pedestrian autonomous following method for quadruped robot | |
CN114237235B (en) | Mobile robot obstacle avoidance method based on deep reinforcement learning | |
CN112651374B (en) | Future trajectory prediction method based on social information and automatic driving system | |
CN108320051B (en) | Mobile robot dynamic collision avoidance planning method based on GRU network model | |
CN106973221A (en) | Unmanned plane image capture method and system based on aesthetic evaluation | |
CN113189983A (en) | Open scene-oriented multi-robot cooperative multi-target sampling method | |
CN112114592B (en) | Method for realizing autonomous crossing of movable frame-shaped barrier by unmanned aerial vehicle | |
CN114719848A (en) | Unmanned aerial vehicle height estimation method based on neural network fused with visual and inertial navigation information | |
CN112863186A (en) | Vehicle-mounted unmanned aerial vehicle-based escaping vehicle rapid identification and tracking method | |
Zhang et al. | A convolutional neural network method for self-driving cars | |
CN116309727A (en) | Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm | |
CN114326821B (en) | Unmanned aerial vehicle autonomous obstacle avoidance system and method based on deep reinforcement learning | |
Khalil et al. | Integration of motion prediction with end-to-end latent RL for self-driving vehicles | |
CN116385909A (en) | Unmanned aerial vehicle target tracking method based on deep reinforcement learning | |
CN114779821B (en) | Unmanned aerial vehicle self-adaptive repulsive force coefficient path planning method based on deep learning | |
CN114518762B (en) | Robot obstacle avoidance device, obstacle avoidance control method and robot | |
CN112857373B (en) | Energy-saving unmanned vehicle path navigation method capable of minimizing useless actions | |
CN114326826B (en) | Multi-unmanned aerial vehicle formation transformation method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20230623 |