CN116309727A - Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm - Google Patents

Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm

Info

Publication number
CN116309727A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
target
module
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310604700.0A
Other languages
Chinese (zh)
Inventor
唐枫
戴明哲
张昕瑜
郑雅恬
李文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310604700.0A priority Critical patent/CN116309727A/en
Publication of CN116309727A publication Critical patent/CN116309727A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/17 Terrestrial scenes taken from planes or by drones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an unmanned aerial vehicle target tracking method based on a deep learning algorithm, which comprises: acquiring existing image data for unmanned aerial vehicle target tracking and constructing a training data set; constructing and training a bounding box prediction model for unmanned aerial vehicle target tracking to obtain a final bounding box prediction model; inputting a template image of the target and a search image acquired by the unmanned aerial vehicle in real time into the bounding box prediction model to obtain a target bounding box; tracking the target using the target bounding box; and repeating the above steps so that the unmanned aerial vehicle tracks the target in real time. The invention also discloses a simulation system implementing the unmanned aerial vehicle target tracking method based on the deep learning algorithm. Through the design of an innovative target tracking method, the invention realizes unmanned aerial vehicle tracking of a target with high reliability, good accuracy and high efficiency; meanwhile, the simulation system provided by the invention improves the adaptability of unmanned aerial vehicle target tracking to complex scenes.

Description

Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to an unmanned aerial vehicle target tracking method and a simulation system based on a deep learning algorithm.
Background
With the development of the economy and technology, unmanned aerial vehicles are widely applied in people's production and life, bringing great convenience. Unmanned aerial vehicle target tracking is a hot topic in the field of unmanned aerial vehicle applications; its aim is to control the unmanned aerial vehicle to continuously track a target so as to achieve a specific purpose, such as target monitoring, delineation of the target's motion trajectory, or follow-shooting of the target.
In the field of unmanned aerial vehicle target tracking, two main problems currently exist. First, during tracking, conditions such as rapid target movement, scale change, severe deformation and target occlusion greatly interfere with the tracking process, increasing the tracking difficulty and reducing the tracking accuracy. Second, testing (or trial-and-error testing of) tracking algorithms on a real unmanned aerial vehicle is expensive; the test cost, and especially the trial-and-error cost, is high.
At present, in the target tracking process of a traditional unmanned aerial vehicle, the features of the target must first be extracted manually. This process is time-consuming and labor-intensive, and in practice leads to a series of problems such as weak anti-interference capability, poor real-time performance and easy loss of the target, making the reliability and accuracy of existing unmanned aerial vehicle target tracking schemes poor. Moreover, the traditional unmanned aerial vehicle test or trial-and-error process lacks a corresponding reliable simulation system, which greatly increases the difficulty and cost of feasibility testing for unmanned aerial vehicle tracking schemes.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle target tracking method based on a deep learning algorithm, which is high in reliability, accuracy and efficiency.
The second object of the invention is to provide a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
The step S2 specifically comprises the following steps:
firstly, processing the input template image and search image through a backbone network;
constructing a fourth similarity map and a fifth similarity map from the output of the backbone network, and constructing an APN feature map based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network;
constructing a channel attention network feature map from the output of the backbone network and a pooling layer;
calculating a final classification feature map based on the APN feature map and the channel attention network feature map;
and performing classification and regression operations on the finally obtained classification feature map to obtain a final bounding box prediction result.
More specifically, the step S2 comprises the following steps:
A. The template image z and the search image x are respectively input into a backbone network; the backbone network is an AlexNet network;
B. A depth-wise cross-correlation operation and a convolution operation are performed on the fourth-layer outputs of the backbone network to obtain the fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network after the template image z is input to the backbone network, $\varphi_4(x)$ is the output of the fourth layer of the backbone network after the search image x is input to the backbone network, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation;
A convolution operation and a depth-wise cross-correlation operation are performed on the fifth-layer outputs of the backbone network to obtain the fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
Based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network, the APN feature map $F_{\mathrm{APN}}$ is constructed by weighting the two similarity maps with the first learning weight $\alpha_1$ and the second learning weight $\alpha_2$, applying the feed-forward (forward propagation) network $\mathrm{FFN}$ and the global average pooling operation $\mathrm{GAP}$, and concatenating along the channel direction $\oplus$;
C. The convolution kernel parameters of the fifth layer of the backbone network are changed to obtain the sixth similarity map $S_6$; at the same time, the fifth similarity map $S_5$ is passed through three different convolution layers to generate three feature maps, namely the q feature map $q$, the k feature map $k$ and the v feature map $v$, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then $q$ and $k$ are reshaped to $\mathbb{R}^{C \times N}$ (where $N = H \times W$), a matrix multiplication is performed on the two reshaped matrices, and a softmax layer is applied to obtain the spatial attention map $A \in \mathbb{R}^{N \times N}$, softmax being an exponential normalization function; finally, with $v$ likewise reshaped to $\mathbb{R}^{C \times N}$, the spatial attention feature map is obtained as $A_s = \gamma\,(vA) + S_6$, wherein $\gamma$ is the third learning weight, $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
from a spatial attention diagram
Figure SMS_36
And a pooling layer, calculating to obtain a channel attention network characteristic diagram +.>
Figure SMS_37
Is that
Figure SMS_38
Wherein->
Figure SMS_39
For the fourth learning weight->
Figure SMS_40
Is a nonlinear activation function and is calculated as +.>
Figure SMS_41
WIs an intermediate feature and->
Figure SMS_42
Figure SMS_35
Performing global maximum pooling operation;
D. Based on the APN feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{C}}$, the final classification feature map is calculated as $R = \lambda_1 F_{\mathrm{APN}} + \lambda_2 F_{\mathrm{C}}$, wherein $\lambda_1$ is the fifth learning weight and $\lambda_2$ is the sixth learning weight;
E. Classification and regression operations are performed on the finally obtained classification feature map $R$ to obtain the final bounding box prediction result.
The classification and regression operations adopt a three-branch structure: the first branch selects the bounding box with the largest intersection with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box. Finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein each branch loss $L_i$ is a cross-entropy ($\mathrm{CE}$) or binary cross-entropy ($\mathrm{BCE}$) term and $\omega_1$, $\omega_2$ and $\omega_3$ are the respective weights of the three branches.
The invention also discloses a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm, comprising a physical simulation module, a binocular camera, a gimbal and an unmanned aerial vehicle target tracking module. The output end of the physical simulation module is connected to both the input end of the binocular camera and the input end of the gimbal; the output end of the binocular camera and the output end of the gimbal are both connected to the unmanned aerial vehicle target tracking module. The physical simulation module is used for sending data information of the virtual world to the binocular camera and the gimbal; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the gimbal calculates the relative coordinates of the camera according to the received data information and sends them to the unmanned aerial vehicle target tracking module; and the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking module comprises a target bounding box prediction module, a flight controller, a flight control information transmission module and a tracker. The output end of the binocular camera is connected to the input end of the target bounding box prediction module, and the output end of the target bounding box prediction module is connected to the first input end of the tracker; the output end of the gimbal is connected to the second input end of the tracker; the output end of the flight control information transmission module is connected to the third input end of the tracker; the communication end of the flight control information transmission module is connected to the communication end of the flight controller; and the output end of the tracker is connected to the input end of the flight controller. After receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding box prediction module; the target bounding box prediction module calculates the target bounding box and uploads the target bounding box information to the tracker; the gimbal calculates the relative coordinates of the camera according to the received data information and sends them to the tracker; the flight control information transmission module communicates with the flight controller and, at the same time, sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates, from the received data information and according to the binocular principle, the linear velocity information and angular velocity information of the unmanned aerial vehicle in each direction at the next moment, and sends this information to the flight controller; and the flight controller controls the unmanned aerial vehicle to track according to the received information.
The physical simulation module is a Gazebo simulation platform.
The flight control information transmission module is a Mavros communication module.
The flight controller and the flight control information transmission module communicate through the Mavlink protocol.
Through the design of an innovative target tracking method, the unmanned aerial vehicle target tracking method and simulation system based on the deep learning algorithm provided by the invention realize unmanned aerial vehicle tracking of a target with high reliability, good accuracy and high efficiency; meanwhile, the simulation system provided by the invention improves the adaptability of unmanned aerial vehicle target tracking to complex scenes.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of functional modules of the system of the present invention.
FIG. 3 is a schematic diagram of a target template according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a simulation scenario and a unmanned aerial vehicle according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of adjusting a unmanned aerial vehicle to a proper position according to an embodiment of the system of the present invention.
FIG. 6 is a schematic diagram of a tracking process according to an embodiment of the present invention.
Detailed Description
A schematic process flow diagram of the method of the present invention is shown in fig. 1: the unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a bounding box prediction model for unmanned aerial vehicle target tracking based on an ANN (Artificial Neural Network), an APN (Anchor Proposal Network, a region proposal network) and an attention mechanism module; the method specifically comprises the following steps:
firstly, processing the input template image and search image through a backbone network;
constructing a fourth similarity map and a fifth similarity map from the output of the backbone network, and constructing an APN feature map based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network;
constructing a channel attention network feature map from the output of the backbone network and a pooling layer;
calculating a final classification feature map based on the APN feature map and the channel attention network feature map;
performing classification and regression operations on the finally obtained classification feature map to obtain a final bounding box prediction result;
the specific implementation is realized by the following steps:
A. The template image z and the search image x are respectively input into a backbone network; the backbone network is an AlexNet network;
B. A depth-wise cross-correlation operation and a convolution operation are performed on the fourth-layer outputs of the backbone network to obtain the fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network after the template image z is input to the backbone network, $\varphi_4(x)$ is the output of the fourth layer of the backbone network after the search image x is input to the backbone network, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation; the purpose of the convolution operation is to reduce the number of channels of the two feature maps;
A convolution operation and a depth-wise cross-correlation operation are performed on the fifth-layer outputs of the backbone network to obtain the fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
Based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network, the APN feature map $F_{\mathrm{APN}}$ is constructed by weighting the two similarity maps with the first learning weight $\alpha_1$ and the second learning weight $\alpha_2$, applying the feed-forward (forward propagation) network $\mathrm{FFN}$ and the global average pooling operation $\mathrm{GAP}$, and concatenating along the channel direction $\oplus$; the purpose of the APN feature map is to maintain the cross-interdependence similarity; a sketch of the cross-correlation step is given below;
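The depth-wise cross-correlation at the heart of the two similarity maps can be sketched in PyTorch as follows; this is a minimal illustration of the standard grouped-convolution implementation, not code from the patent, and the channel counts are assumptions.

```python
import torch
import torch.nn.functional as F

def depthwise_xcorr(z_feat: torch.Tensor, x_feat: torch.Tensor) -> torch.Tensor:
    """Depth-wise cross-correlation: each channel of the template features
    slides over the corresponding channel of the search features."""
    b, c, hz, wz = z_feat.shape
    # Fold the batch into the channel axis so one grouped conv handles all pairs.
    x = x_feat.reshape(1, b * c, x_feat.size(2), x_feat.size(3))
    kernel = z_feat.reshape(b * c, 1, hz, wz)
    out = F.conv2d(x, kernel, groups=b * c)
    return out.reshape(b, c, out.size(2), out.size(3))

# Fourth similarity map: cross-correlate first, then convolve to reduce channels.
reduce_channels = torch.nn.Conv2d(256, 128, kernel_size=1)  # channel counts assumed
def fourth_similarity(phi4_z, phi4_x):
    return reduce_channels(depthwise_xcorr(phi4_z, phi4_x))
```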
C. The convolution kernel parameters of the fifth layer of the backbone network are changed to obtain the sixth similarity map $S_6$; at the same time, the fifth similarity map $S_5$ is passed through three different convolution layers to generate three feature maps, namely the q feature map $q$, the k feature map $k$ and the v feature map $v$, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then $q$ and $k$ are reshaped to $\mathbb{R}^{C \times N}$ (where $N = H \times W$), a matrix multiplication is performed on the two reshaped matrices, and a softmax layer is applied to obtain the spatial attention map $A \in \mathbb{R}^{N \times N}$, softmax being an exponential normalization function; finally, with $v$ likewise reshaped to $\mathbb{R}^{C \times N}$, the spatial attention feature map is obtained as $A_s = \gamma\,(vA) + S_6$, wherein $\gamma$ is the third learning weight, $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
From the spatial attention feature map $A_s$ and the pooling layers, the channel attention network feature map is calculated as $F_{\mathrm{C}} = \beta\,\sigma(W)\,A_s$, wherein $\beta$ is the fourth learning weight, $\sigma$ is a nonlinear activation function calculated as $\sigma(W) = 1/(1 + e^{-W})$, $W$ is an intermediate feature with $W = \mathrm{GAP}(A_s) + \mathrm{GMP}(A_s)$, and $\mathrm{GMP}$ is the global maximum pooling operation; a sketch of the attention computation is given below;
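In this sketch, the non-local form of the spatial attention and the sigmoid gating of the channel attention are assumptions consistent with the symbols defined above, not a verbatim reproduction of the patented network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialChannelAttention(nn.Module):
    """Spatial attention over S5 (residually added to S6), then channel gating."""
    def __init__(self, c: int):
        super().__init__()
        self.q_conv = nn.Conv2d(c, c, 1)  # three different convolution layers
        self.k_conv = nn.Conv2d(c, c, 1)
        self.v_conv = nn.Conv2d(c, c, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # third learning weight
        self.beta = nn.Parameter(torch.ones(1))    # fourth learning weight

    def forward(self, s5: torch.Tensor, s6: torch.Tensor) -> torch.Tensor:
        b, c, h, w = s5.shape
        n = h * w
        q = self.q_conv(s5).reshape(b, c, n)  # reshaped to R^{C x N}
        k = self.k_conv(s5).reshape(b, c, n)
        v = self.v_conv(s5).reshape(b, c, n)
        attn = torch.softmax(torch.bmm(q.transpose(1, 2), k), dim=-1)  # R^{N x N}
        a_s = self.gamma * torch.bmm(v, attn).reshape(b, c, h, w) + s6

        # Channel attention: pooled descriptor W -> sigmoid gate over channels.
        w_desc = F.adaptive_avg_pool2d(a_s, 1) + F.adaptive_max_pool2d(a_s, 1)
        return self.beta * torch.sigmoid(w_desc) * a_s
```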
D. Based on the APN feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{C}}$, the final classification feature map is calculated as $R = \lambda_1 F_{\mathrm{APN}} + \lambda_2 F_{\mathrm{C}}$, wherein $\lambda_1$ is the fifth learning weight and $\lambda_2$ is the sixth learning weight;
E. Classification and regression operations are performed on the finally obtained classification feature map $R$ to obtain the final bounding box prediction result;
In specific implementation, the classification and regression operations adopt a three-branch structure: the first branch selects the bounding box with the largest intersection with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein each branch loss $L_i$ is a cross-entropy ($\mathrm{CE}$) or binary cross-entropy ($\mathrm{BCE}$) term and $\omega_1$, $\omega_2$ and $\omega_3$ are the respective weights of the three branches, as illustrated below;
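In the illustration, which loss attaches to which branch is an assumption; the patent only states that cross entropy and binary cross entropy are combined with three branch weights.

```python
import torch.nn.functional as F

def total_loss(cls_logits, cls_labels, inside_logits, inside_labels,
               center_logits, center_targets, w1=1.0, w2=1.0, w3=1.0):
    # Branch 1: classify the proposals that best overlap the actual box (CE).
    l1 = F.cross_entropy(cls_logits, cls_labels)
    # Branch 2: points on the feature map falling inside the actual box (BCE).
    l2 = F.binary_cross_entropy_with_logits(inside_logits, inside_labels)
    # Branch 3: distance of each point to the actual box center (BCE).
    l3 = F.binary_cross_entropy_with_logits(center_logits, center_targets)
    return w1 * l1 + w2 * l2 + w3 * l3
```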
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
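Steps S4 to S6 amount to a per-frame inference loop; a minimal sketch (the model and frame-source interfaces are assumptions) is:

```python
def track(model, template, frames):
    """Run the bounding box prediction model once per incoming frame (S4-S6)."""
    for frame in frames:              # search images acquired in real time
        box = model(template, frame)  # S4: predict the target bounding box
        yield box                     # S5: the box drives the tracking control
```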
FIG. 2 is a schematic diagram of the functional modules of the system of the present invention. The simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm comprises a physical simulation module, a binocular camera, a gimbal and an unmanned aerial vehicle target tracking module. The output end of the physical simulation module is connected to both the input end of the binocular camera and the input end of the gimbal; the output end of the binocular camera and the output end of the gimbal are both connected to the unmanned aerial vehicle target tracking module. The physical simulation module is used for sending data information of the virtual world to the binocular camera and the gimbal; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the gimbal calculates the relative coordinates of the camera according to the received data information and sends them to the unmanned aerial vehicle target tracking module; and the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
In specific implementation, the unmanned aerial vehicle target tracking module comprises a target bounding box prediction module, a flight controller, a flight control information transmission module and a tracker. The output end of the binocular camera is connected to the input end of the target bounding box prediction module, and the output end of the target bounding box prediction module is connected to the first input end of the tracker; the output end of the gimbal is connected to the second input end of the tracker; the output end of the flight control information transmission module is connected to the third input end of the tracker; the communication end of the flight control information transmission module is connected to the communication end of the flight controller; and the output end of the tracker is connected to the input end of the flight controller. After receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding box prediction module; the target bounding box prediction module calculates the target bounding box and uploads the target bounding box information to the tracker; the gimbal calculates the relative coordinates of the camera according to the received data information and sends them to the tracker; the flight control information transmission module communicates with the flight controller and, at the same time, sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates, from the received data information and according to the binocular principle, the linear velocity information and angular velocity information of the unmanned aerial vehicle in each direction at the next moment, and sends this information to the flight controller; and the flight controller controls the unmanned aerial vehicle to track according to the received information.
Meanwhile, the physical simulation module is a Gazebo simulation platform; the flight control information transmission module is a Mavros communication module; and the flight controller and the flight control information transmission module communicate through the Mavlink protocol.
The simulation system provided by the invention is further described below:
a target template frame needs to be provided to the algorithm before the system is started; in this example, the target template is shown in fig. 3;
the workflow of the simulation system is described in detail below:
firstly, the virtual scene simulation and the unmanned aerial vehicle model are started; the running scene is shown in fig. 4, with the unmanned aerial vehicle located in the center of a road and a pedestrian slowly moving forward on the left side of the unmanned aerial vehicle;
then, Mavros communication is established, so that the keyboard node can communicate with the unmanned aerial vehicle and the world coordinate information of the unmanned aerial vehicle is provided to the tracker;
the keyboard control node is started, and the unmanned aerial vehicle is flown to a suitable position from which it can observe the target for a long time; after the adjustment is completed, the unmanned aerial vehicle is set to the hover state and the keyboard control node is closed (if it is left open, it may conflict with the subsequent tracker node);
the gimbal control is started, providing the relative coordinate system information of the unmanned aerial vehicle to the tracker;
the tracker node is started; it receives the information transmitted by Mavros and the gimbal and waits for the target bounding box information transmitted by the target bounding box prediction module; the unmanned aerial vehicle remains hovering until the target bounding box information is received, as shown in fig. 5;
when the target appears completely within the observation range of the unmanned aerial vehicle camera, the target bounding box prediction module is started; after starting, it begins predicting the target bounding box and transmits the target bounding box information to the tracker; the tracker calculates the movement velocity information from the information published by Mavros, the gimbal and the target bounding box prediction module, and the flight controller adjusts the flight of the unmanned aerial vehicle according to this velocity information, as sketched below;
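For reference, the velocity command that the tracker hands to the flight controller can be published through Mavros as sketched below. The topic and message type follow the standard mavros setpoint_velocity plugin; the node and function names are assumptions.

```python
import rospy
from geometry_msgs.msg import Twist

rospy.init_node('uav_tracker_velocity')
cmd_pub = rospy.Publisher('/mavros/setpoint_velocity/cmd_vel_unstamped',
                          Twist, queue_size=1)

def send_velocity(vx, vy, vz, yaw_rate):
    """Publish the linear and angular velocities computed by the tracker."""
    cmd = Twist()
    cmd.linear.x, cmd.linear.y, cmd.linear.z = vx, vy, vz
    cmd.angular.z = yaw_rate
    cmd_pub.publish(cmd)
```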
after all nodes are started, the system begins to operate and loops through this process to continuously track the target; fig. 6 is a schematic diagram of the tracking process of the system of the present invention.

Claims (9)

1. The unmanned aerial vehicle target tracking method based on the deep learning algorithm is characterized by comprising the following steps of:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
2. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 1, wherein the step S2 specifically comprises the following steps:
firstly, processing the input template image and search image through a backbone network;
constructing a fourth similarity map and a fifth similarity map from the output of the backbone network, and constructing an APN feature map based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network;
constructing a channel attention network feature map from the output of the backbone network and a pooling layer;
calculating a final classification feature map based on the APN feature map and the channel attention network feature map;
and performing classification and regression operations on the finally obtained classification feature map to obtain a final bounding box prediction result.
3. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 2, wherein the step S2 specifically comprises the following steps:
A. The template image z and the search image x are respectively input into a backbone network; the backbone network is an AlexNet network;
B. A depth-wise cross-correlation operation and a convolution operation are performed on the fourth-layer outputs of the backbone network to obtain the fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network after the template image z is input to the backbone network, $\varphi_4(x)$ is the output of the fourth layer of the backbone network after the search image x is input to the backbone network, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation;
A convolution operation and a depth-wise cross-correlation operation are performed on the fifth-layer outputs of the backbone network to obtain the fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
Based on the fourth similarity map, the fifth similarity map, a pooling layer and a feed-forward network, the APN feature map $F_{\mathrm{APN}}$ is constructed by weighting the two similarity maps with the first learning weight $\alpha_1$ and the second learning weight $\alpha_2$, applying the feed-forward (forward propagation) network $\mathrm{FFN}$ and the global average pooling operation $\mathrm{GAP}$, and concatenating along the channel direction $\oplus$;
C. The convolution kernel parameters of the fifth layer of the backbone network are changed to obtain the sixth similarity map $S_6$; at the same time, the fifth similarity map $S_5$ is passed through three different convolution layers to generate three feature maps, namely the q feature map $q$, the k feature map $k$ and the v feature map $v$, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then $q$ and $k$ are reshaped to $\mathbb{R}^{C \times N}$ (where $N = H \times W$), a matrix multiplication is performed on the two reshaped matrices, and a softmax layer is applied to obtain the spatial attention map $A \in \mathbb{R}^{N \times N}$, softmax being an exponential normalization function; finally, with $v$ likewise reshaped to $\mathbb{R}^{C \times N}$, the spatial attention feature map is obtained as $A_s = \gamma\,(vA) + S_6$, wherein $\gamma$ is the third learning weight, $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
From the spatial attention feature map $A_s$ and the pooling layers, the channel attention network feature map is calculated as $F_{\mathrm{C}} = \beta\,\sigma(W)\,A_s$, wherein $\beta$ is the fourth learning weight, $\sigma$ is a nonlinear activation function calculated as $\sigma(W) = 1/(1 + e^{-W})$, $W$ is an intermediate feature with $W = \mathrm{GAP}(A_s) + \mathrm{GMP}(A_s)$, and $\mathrm{GMP}$ is the global maximum pooling operation;
D. Based on the APN feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{C}}$, the final classification feature map is calculated as $R = \lambda_1 F_{\mathrm{APN}} + \lambda_2 F_{\mathrm{C}}$, wherein $\lambda_1$ is the fifth learning weight and $\lambda_2$ is the sixth learning weight;
E. Classification and regression operations are performed on the finally obtained classification feature map $R$ to obtain the final bounding box prediction result.
4. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 3, wherein the classification and regression operations adopt a three-branch structure: the first branch selects the bounding box with the largest intersection with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein each branch loss $L_i$ is a cross-entropy ($\mathrm{CE}$) or binary cross-entropy ($\mathrm{BCE}$) term and $\omega_1$, $\omega_2$ and $\omega_3$ are the respective weights of the three branches.
5. A simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm according to one of claims 1 to 4, characterized by comprising a physical simulation module, a binocular camera, a gimbal and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is connected to both the input end of the binocular camera and the input end of the gimbal; the output end of the binocular camera and the output end of the gimbal are both connected to the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the gimbal; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the gimbal calculates the relative coordinates of the camera according to the received data information and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; and the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
6. The simulation system according to claim 5, wherein the unmanned aerial vehicle target tracking module comprises a target bounding box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected to the input end of the target bounding box prediction module, and the output end of the target bounding box prediction module is connected to the first input end of the tracker; the output end of the gimbal is connected to the second input end of the tracker; the output end of the flight control information transmission module is connected to the third input end of the tracker; the communication end of the flight control information transmission module is connected to the communication end of the flight controller; the output end of the tracker is connected to the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding box prediction module; the target bounding box prediction module is used for calculating the target bounding box and uploading the target bounding box information to the tracker; the gimbal calculates the relative coordinates of the camera according to the received data information and sends them to the tracker; the flight control information transmission module is used for communicating with the flight controller and, at the same time, sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates, from the received data information and according to the binocular principle, the linear velocity information and angular velocity information of the unmanned aerial vehicle in each direction at the next moment, and sends this information to the flight controller; and the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
7. The simulation system of claim 5 or 6, wherein the physical simulation module is a Gazebo simulation platform.
8. The simulation system of claim 6, wherein the flight control information transmission module is a Mavros communication module.
9. The simulation system of claim 6, wherein the flight controller and the flight control information transfer module communicate via a Mavlink protocol.
CN202310604700.0A 2023-05-26 2023-05-26 Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm Pending CN116309727A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310604700.0A CN116309727A (en) 2023-05-26 2023-05-26 Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310604700.0A CN116309727A (en) 2023-05-26 2023-05-26 Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm

Publications (1)

Publication Number Publication Date
CN116309727A true CN116309727A (en) 2023-06-23

Family

ID=86794627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310604700.0A Pending CN116309727A (en) 2023-05-26 2023-05-26 Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm

Country Status (1)

Country Link
CN (1) CN116309727A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200117937A1 (en) * 2018-10-16 2020-04-16 Samsung Electronics Co., Ltd. Convolutional neural network for object detection
KR20210099450A * 2020-02-04 2021-08-12 Korea Maritime University Industry-Academic Cooperation Foundation Far away small drone detection method using deep learning
CN113470073A * 2021-07-06 2021-10-01 Zhejiang University Animal center tracking method based on deep learning
CN114266805A * 2021-12-31 2022-04-01 Southwest Petroleum University Twin region suggestion network model for unmanned aerial vehicle target tracking
CN115147459A * 2022-07-31 2022-10-04 Harbin University of Science and Technology Unmanned aerial vehicle target tracking method based on Swin Transformer
CN115578416A * 2022-10-12 2023-01-06 Shandong University Unmanned aerial vehicle target tracking method, system, medium and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHANGHONG FU et al.: "Siamese Anchor Proposal Network for High-Speed Aerial Tracking", arXiv, pages 1-7 *
ZIANG CAO et al.: "SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking", arXiv, pages 1-7 *
LIU Qisheng: "Design and Implementation of a Vision-Based Quadrotor UAV Target Tracking System", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 01, pages 12-22 *

Similar Documents

Publication Publication Date Title
KR102306939B1 (en) Method and device for short-term path planning of autonomous driving through information fusion by using v2x communication and image processing
CN106873585B (en) A kind of navigation method for searching, robot and system
EP3405845B1 (en) Object-focused active three-dimensional reconstruction
WO2022100107A1 (en) Methods and systems for predicting dynamic object behavior
CN111340868B (en) Unmanned underwater vehicle autonomous decision control method based on visual depth estimation
CN111709410B (en) Behavior identification method for strong dynamic video
CN110908399B (en) Unmanned aerial vehicle autonomous obstacle avoidance method and system based on lightweight neural network
CN111368755A (en) Vision-based pedestrian autonomous following method for quadruped robot
CN114237235B (en) Mobile robot obstacle avoidance method based on deep reinforcement learning
CN112651374B (en) Future trajectory prediction method based on social information and automatic driving system
CN108320051B (en) Mobile robot dynamic collision avoidance planning method based on GRU network model
CN106973221A (en) Unmanned plane image capture method and system based on aesthetic evaluation
CN113189983A (en) Open scene-oriented multi-robot cooperative multi-target sampling method
CN112114592B (en) Method for realizing autonomous crossing of movable frame-shaped barrier by unmanned aerial vehicle
CN114719848A (en) Unmanned aerial vehicle height estimation method based on neural network fused with visual and inertial navigation information
CN112863186A (en) Vehicle-mounted unmanned aerial vehicle-based escaping vehicle rapid identification and tracking method
Zhang et al. A convolutional neural network method for self-driving cars
CN116309727A (en) Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm
CN114326821B (en) Unmanned aerial vehicle autonomous obstacle avoidance system and method based on deep reinforcement learning
Khalil et al. Integration of motion prediction with end-to-end latent RL for self-driving vehicles
CN116385909A (en) Unmanned aerial vehicle target tracking method based on deep reinforcement learning
CN114779821B (en) Unmanned aerial vehicle self-adaptive repulsive force coefficient path planning method based on deep learning
CN114518762B (en) Robot obstacle avoidance device, obstacle avoidance control method and robot
CN112857373B (en) Energy-saving unmanned vehicle path navigation method capable of minimizing useless actions
CN114326826B (en) Multi-unmanned aerial vehicle formation transformation method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230623

RJ01 Rejection of invention patent application after publication