CN116309727A - Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm - Google Patents
- Publication number
- CN116309727A (application number CN202310604700.0A)
- Authority
- CN
- China
- Prior art keywords
- unmanned aerial
- aerial vehicle
- target
- module
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 45
- 238000004088 simulation Methods 0.000 title claims abstract description 36
- 238000013135 deep learning Methods 0.000 title claims abstract description 21
- 238000012549 training Methods 0.000 claims abstract description 11
- 238000010586 diagram Methods 0.000 claims description 30
- 230000005540 biological transmission Effects 0.000 claims description 20
- 238000011176 pooling Methods 0.000 claims description 18
- 238000004891 communication Methods 0.000 claims description 10
- 230000006870 function Effects 0.000 claims description 9
- 230000004913 activation Effects 0.000 claims description 3
- 238000006243 chemical reaction Methods 0.000 claims description 3
- 239000011159 matrix material Substances 0.000 claims description 3
- 230000007246 mechanism Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000012545 processing Methods 0.000 claims description 3
- 230000009466 transformation Effects 0.000 claims description 3
- 238000012546 transfer Methods 0.000 claims 1
- 238000013461 design Methods 0.000 abstract description 2
- 230000008569 process Effects 0.000 description 13
- 238000012360 testing method Methods 0.000 description 5
- 230000033001 locomotion Effects 0.000 description 4
- 238000012935 Averaging Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Remote Sensing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an unmanned aerial vehicle target tracking method based on a deep learning algorithm, which comprises: obtaining existing image data for unmanned aerial vehicle target tracking and constructing a training data set; constructing a bounding-box prediction model for unmanned aerial vehicle target tracking and training it to obtain a final bounding-box prediction model; inputting a template image of the target and a search image acquired by the unmanned aerial vehicle in real time into the bounding-box prediction model to obtain a target bounding box; tracking the target with the target bounding box; and repeating these steps so that the unmanned aerial vehicle tracks the target in real time. The invention also discloses a simulation system implementing the unmanned aerial vehicle target tracking method based on the deep learning algorithm. Through the design of an innovative target tracking method, the invention realizes unmanned aerial vehicle tracking of a target with high reliability, good accuracy and high efficiency; meanwhile, the simulation system provided by the invention improves the adaptability of unmanned aerial vehicle target tracking to complex scenes.
Description
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to an unmanned aerial vehicle target tracking method and a simulation system based on a deep learning algorithm.
Background
With the development of the economy and of technology, unmanned aerial vehicles have become widely used in people's work and daily life, bringing great convenience. Unmanned aerial vehicle target tracking is a hotspot problem in the unmanned aerial vehicle application field; its aim is to control the unmanned aerial vehicle to track a target continuously so as to achieve a specific purpose, such as target tracking, demarcation of the target motion track, or follow-shooting of the target.
In the field of unmanned aerial vehicle target tracking, two main problems currently exist. Firstly, during the target tracking process of the unmanned aerial vehicle, conditions such as rapid movement, scale change, severe deformation and occlusion of the target greatly interfere with tracking, increasing the tracking difficulty and reducing the tracking accuracy. Secondly, in the testing (or trial-and-error) of tracking algorithms for unmanned aerial vehicle target tracking, the test cost, and especially the trial-and-error cost, is high.
At present, in the target tracking process of a traditional unmanned aerial vehicle, the features of the target must first be extracted manually; this process is time-consuming and labor-intensive, and during implementation it can cause a series of problems such as weak anti-interference capability, poor real-time performance and easy loss of the target. This makes the reliability and accuracy of existing unmanned aerial vehicle target tracking schemes poor. Moreover, the traditional unmanned aerial vehicle test or trial-and-error process currently lacks a corresponding, reliable simulation system, which greatly increases the feasibility-testing difficulty and test cost of unmanned aerial vehicle tracking schemes.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle target tracking method based on a deep learning algorithm that offers high reliability, good accuracy and high efficiency.
The second object of the invention is to provide a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
The step S2 specifically comprises the following steps:
firstly, processing an input template image and a search image through a backbone network;
constructing a fourth similarity graph and a fifth similarity graph according to an output image of the backbone network, and constructing an APN network feature graph based on the fourth similarity graph, the fifth similarity graph, the pooling layer and the forward propagation network;
constructing and obtaining a channel attention network characteristic diagram according to an output image of the backbone network and a pooling layer;
based on the APN network feature map and the channel attention network feature map, calculating to obtain a final classification feature map;
and carrying out classification and regression operation on the finally obtained classification feature map to obtain a final boundary frame prediction result.
The step S2 specifically comprises the following steps:
A. inputting the template image z and the search image x into the backbone network respectively; the backbone network is an AlexNet network;
B. performing a depth-wise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network for the template image z, $\varphi_4(x)$ is the output of the fourth layer of the backbone network for the search image x, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation;
performing a convolution operation and a depth-wise cross-correlation operation on the fifth-layer outputs of the backbone network to obtain a fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{\mathrm{APN}}$, which combines the two similarity maps through a first learning weight $\lambda_1$, a second learning weight $\lambda_2$, a forward propagation network $\mathrm{FFN}$, a global average pooling operation $\mathrm{GAP}$, and splicing in the channel direction;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, passing the fifth similarity map $S_5$ through three different convolution layers to generate three feature maps, namely a q feature map, a k feature map and a v feature map, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then reshaping $q$ and $k$ to the scale $\mathbb{R}^{C \times HW}$, performing a matrix multiplication on the two reshaped matrices, and obtaining a spatial attention map through a softmax layer (softmax being an exponential normalization function); finally, the spatial attention output is obtained from the attention-weighted $v$ feature map with a third learning weight $\lambda_3$; wherein $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
from the spatial attention map and a pooling layer, calculating the channel attention network feature map $F_{\mathrm{CA}}$, wherein $\lambda_4$ is a fourth learning weight, $\sigma$ is a nonlinear activation function, $W$ is an intermediate feature, and $\mathrm{GMP}$ is the global maximum pooling operation;
D. based on the APN network feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{CA}}$, calculating the final classification feature map $R = \lambda_5 F_{\mathrm{APN}} + \lambda_6 F_{\mathrm{CA}}$, wherein $\lambda_5$ is a fifth learning weight and $\lambda_6$ is a sixth learning weight;
E. performing classification and regression operations on the finally obtained classification feature map $R$ to obtain the final bounding-box prediction result.
The classification and regression operation adopts a three-branch structure. The first branch selects the bounding box with the largest intersection-over-union with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box. Finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein the branch losses $L_1$, $L_2$ and $L_3$ are computed with cross entropy and binary cross entropy, and $\omega_1$, $\omega_2$ and $\omega_3$ are the respective weights of the three branches.
The invention also discloses a simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm, which comprises a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
The unmanned aerial vehicle target tracking module comprises a target bounding-box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target bounding-box prediction module, and the output end of the target bounding-box prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding-box prediction module; the target bounding-box prediction module is used for calculating the target bounding box and uploading the target bounding-box information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and at the same time sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates the linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment from the received data information according to the binocular principle, and sends this information to the flight controller; the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
The physical simulation module is a Gazebo simulation platform.
The flight control information transmission module is a Mavros communication module.
And the flight controller and the flight control information transmission module are communicated through a Mavlink protocol.
According to the unmanned aerial vehicle target tracking method and the simulation system based on the deep learning algorithm, through the design of the innovative target tracking method, unmanned aerial vehicle tracking of the target is realized, and the unmanned aerial vehicle target tracking method and the simulation system based on the deep learning algorithm are high in reliability, good in accuracy and high in efficiency; meanwhile, based on the simulation system provided by the invention, the complex scene adaptability of unmanned aerial vehicle target tracking can be improved.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of functional modules of the system of the present invention.
FIG. 3 is a schematic diagram of a target template according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a simulation scenario and an unmanned aerial vehicle according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of adjusting an unmanned aerial vehicle to a proper position according to an embodiment of the system of the present invention.
FIG. 6 is a schematic diagram of a tracking process according to an embodiment of the present invention.
Detailed Description
A schematic process flow diagram of the method of the present invention is shown in fig. 1: the unmanned aerial vehicle target tracking method based on the deep learning algorithm provided by the invention comprises the following steps:
s1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
s2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network (Artificial Neural Network), an APN network (Anchor Proposal Network, i.e. an anchor/region proposal network) and an attention mechanism module; the method specifically comprises the following steps:
firstly, processing an input template image and a search image through a backbone network;
constructing a fourth similarity graph and a fifth similarity graph according to an output image of the backbone network, and constructing an APN network feature graph based on the fourth similarity graph, the fifth similarity graph, the pooling layer and the forward propagation network;
constructing and obtaining a channel attention network characteristic diagram according to an output image of the backbone network and a pooling layer;
based on the APN network feature map and the channel attention network feature map, calculating to obtain a final classification feature map;
classifying and regressing the finally obtained classification characteristic diagram to obtain a final boundary frame prediction result;
the specific implementation is realized by the following steps:
A. inputting the template image z and the search image x into the backbone network respectively; the backbone network is an AlexNet network;
B. performing a depth-wise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \mathrm{Conv}(\varphi_4(z) \star \varphi_4(x))$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network for the template image z, $\varphi_4(x)$ is the output of the fourth layer of the backbone network for the search image x, $\star$ is the depth-wise cross-correlation operation, and $\mathrm{Conv}$ is a convolution operation; the purpose of the convolution operation is to reduce the number of channels of the two feature maps;
performing a convolution operation and a depth-wise cross-correlation operation on the fifth-layer outputs of the backbone network to obtain a fifth similarity map $S_5 = \mathrm{Conv}(\varphi_5(z)) \star \mathrm{Conv}(\varphi_5(x))$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{\mathrm{APN}}$, which combines the two similarity maps through a first learning weight $\lambda_1$, a second learning weight $\lambda_2$, a forward propagation network $\mathrm{FFN}$, a global average pooling operation $\mathrm{GAP}$, and splicing in the channel direction; the purpose of the APN network feature map is to maintain the cross-interdependency similarity;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, passing the fifth similarity map $S_5$ through three different convolution layers to generate three feature maps, namely a q feature map, a k feature map and a v feature map, with $q, k, v \in \mathbb{R}^{C \times H \times W}$; then reshaping $q$ and $k$ to the scale $\mathbb{R}^{C \times HW}$, performing a matrix multiplication on the two reshaped matrices, and obtaining a spatial attention map through a softmax layer (softmax being an exponential normalization function); finally, the spatial attention output is obtained from the attention-weighted $v$ feature map with a third learning weight $\lambda_3$; wherein $\mathbb{R}^{C \times H \times W}$ denotes the $C \times H \times W$-dimensional vector space over the real numbers, $C$ is the number of channels, $H$ is the height, and $W$ is the width;
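A minimal numpy sketch of this spatial-attention step follows. The 1×1 convolutions are modelled as plain channel-mixing matrices, and the residual form `lam * out + s5` is an assumption about how the third learning weight is applied, since the patent text does not fully specify it:

```python
import numpy as np

def softmax(a, axis=-1):
    # numerically stable exponential normalization
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(s5, Wq, Wk, Wv, lam):
    """q/k/v come from three 1x1 convolutions (channel-mixing matrices);
    q and k are reshaped to (C, H*W), softmax(q^T k) gives an (HW, HW)
    attention map that re-weights v; lam is the learnable weight."""
    C, H, W = s5.shape
    flat = s5.reshape(C, H * W)
    q, k, v = Wq @ flat, Wk @ flat, Wv @ flat      # each (C, HW)
    attn = softmax(q.T @ k, axis=-1)               # (HW, HW), rows sum to 1
    out = (v @ attn.T).reshape(C, H, W)            # aggregate values
    return lam * out + s5                          # weighted residual

rng = np.random.default_rng(1)
C, H, W = 4, 5, 5
s5 = rng.standard_normal((C, H, W))
Wq, Wk, Wv = (rng.standard_normal((C, C)) for _ in range(3))
print(spatial_attention(s5, Wq, Wk, Wv, 0.1).shape)  # (4, 5, 5)
```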
from the spatial attention map and a pooling layer, calculating the channel attention network feature map $F_{\mathrm{CA}}$, wherein $\lambda_4$ is a fourth learning weight, $\sigma$ is a nonlinear activation function, $W$ is an intermediate feature, and $\mathrm{GMP}$ is the global maximum pooling operation;
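The channel-attention step can be sketched as follows. The squeeze-MLP shapes and the combination of global average and global max pooling are a common construction assumed here for illustration; the patent only names the operations, not their exact wiring:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, W1, W2):
    """Global average pooling and global max pooling squeeze each channel
    to a scalar; a small shared MLP (W1, W2) scores the channels, and a
    sigmoid turns the scores into gates that re-scale the feature map."""
    gap = feat.mean(axis=(1, 2))          # (C,) global average pooling
    gmp = feat.max(axis=(1, 2))           # (C,) global maximum pooling
    scores = W2 @ np.maximum(0.0, W1 @ gap) + W2 @ np.maximum(0.0, W1 @ gmp)
    gates = sigmoid(scores)               # (C,) one gate per channel
    return feat * gates[:, None, None]    # channel-wise re-weighting

rng = np.random.default_rng(2)
feat = rng.standard_normal((4, 5, 5))
W1 = rng.standard_normal((2, 4))          # squeeze 4 -> 2 channels
W2 = rng.standard_normal((4, 2))          # expand back 2 -> 4
print(channel_attention(feat, W1, W2).shape)  # (4, 5, 5)
```

Since each gate lies in (0, 1), the operation can only attenuate channels, never amplify them, which is the intended "attention as soft selection" behavior.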
D. based on the APN network feature map $F_{\mathrm{APN}}$ and the channel attention network feature map $F_{\mathrm{CA}}$, calculating the final classification feature map $R = \lambda_5 F_{\mathrm{APN}} + \lambda_6 F_{\mathrm{CA}}$, wherein $\lambda_5$ is a fifth learning weight and $\lambda_6$ is a sixth learning weight;
E. performing classification and regression operations on the finally obtained classification feature map $R$ to obtain the final bounding-box prediction result;
in a specific implementation, the classification and regression operation adopts a three-branch structure: the first branch selects the bounding box with the largest intersection-over-union with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving the total loss function $L = \omega_1 L_1 + \omega_2 L_2 + \omega_3 L_3$, wherein the branch losses $L_1$, $L_2$ and $L_3$ are computed with cross entropy and binary cross entropy, and $\omega_1$, $\omega_2$ and $\omega_3$ are the corresponding weights of the three branches;
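The selection rule of the first branch and the weighted combination of the three losses can be illustrated in a few lines. The box coordinates, weight values and helper names here are hypothetical examples, not figures from the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def total_loss(l1, l2, l3, w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the three branch losses, mirroring
    L = w1*L1 + w2*L2 + w3*L3 (weight values are placeholders)."""
    return w1 * l1 + w2 * l2 + w3 * l3

# The first branch prefers the candidate with the largest IoU against
# the ground-truth box:
gt = (10, 10, 50, 50)
candidates = [(12, 8, 48, 52), (30, 30, 90, 90)]
best = max(candidates, key=lambda box: iou(box, gt))
print(best)  # (12, 8, 48, 52)
```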
s3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
s4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
s5, tracking the target by adopting the target boundary box obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
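The repeat-until-done structure of steps S4-S6 amounts to a simple loop. `predict_bbox` stands in for the trained bounding-box prediction model and the frame list for the drone's camera feed; both names are illustrative, not taken from the patent:

```python
def track(template, frames, predict_bbox):
    """S6 loop: pair each new search image with the fixed target
    template (S4) and use the predicted bounding box to follow the
    target in that frame (S5)."""
    boxes = []
    for frame in frames:                        # repeat S4-S5 per frame
        box = predict_bbox(template, frame)     # S4: model inference
        boxes.append(box)                       # S5: track with the box
    return boxes

# toy stand-in model that "finds" the target at a fixed location
frames = ["frame0", "frame1", "frame2"]
boxes = track("template", frames, lambda t, f: (0, 0, 10, 10))
print(len(boxes))  # 3
```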
FIG. 2 is a schematic diagram of the functional modules of the system of the present invention: the simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm comprises a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
In a specific implementation, the unmanned aerial vehicle target tracking module comprises a target bounding-box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target bounding-box prediction module, and the output end of the target bounding-box prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target bounding-box prediction module; the target bounding-box prediction module is used for calculating the target bounding box and uploading the target bounding-box information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and at the same time sends the world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates the linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment from the received data information according to the binocular principle, and sends this information to the flight controller; the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
Meanwhile, the physical simulation module is a Gazebo simulation platform; the flight control information transmission module is a Mavros communication module; and the flight controller and the flight control information transmission module are communicated through a Mavlink protocol.
The simulation system provided by the invention is further described below:
a target template frame needs to be provided for the algorithm before the system is started, and in this example, the target template is shown in fig. 3;
the workflow of the simulation system is described in detail below:
firstly, the virtual scene simulation and the unmanned aerial vehicle model are started; the scene after running is shown in fig. 4, with the unmanned aerial vehicle positioned in the center of a road and a pedestrian slowly moving forward on the left side of the unmanned aerial vehicle;
then, Mavros communication is established; Mavros sets up communication between the keyboard and the unmanned aerial vehicle, and provides the world coordinate information of the unmanned aerial vehicle to the tracker;
a keyboard control node is then started to fly the unmanned aerial vehicle to a proper position where it can observe the target for a long time; after the adjustment is completed, the unmanned aerial vehicle is set to a hover state and the keyboard control node is closed (if left open, it may conflict with the subsequent tracker node);
starting a cradle head control, and providing relative coordinate system information of the unmanned aerial vehicle for a tracker;
starting the tracker node, which receives the information transmitted by Mavros and the cradle head and waits for the target boundary frame information transmitted by the target boundary frame prediction module; the unmanned aerial vehicle remains hovering until the target boundary frame information is received, as shown in fig. 5;
when the target completely appears within the observation range of the unmanned aerial vehicle camera, starting the target boundary frame prediction module; after starting, it predicts the target boundary frame and transmits the target boundary frame information to the tracker; the tracker calculates movement speed information from the information issued by Mavros, the cradle head and the target boundary frame prediction module, and the flight controller controls and adjusts the flight of the unmanned aerial vehicle according to the movement speed information;
after all nodes are started, the system begins to operate, and the above process cycles so that the target is tracked continuously; FIG. 6 is a schematic diagram of a tracking process of the system of the present invention.
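The start-up sequence above amounts to a small state machine: the unmanned aerial vehicle hovers until the first bounding box arrives from the prediction module, then switches to tracking. A sketch for illustration only (the class, mode names, and the placeholder control law are assumptions, not the actual node interfaces):

```python
from enum import Enum, auto

class Mode(Enum):
    HOVER = auto()
    TRACK = auto()

class TrackerNode:
    """Mirrors the described workflow: wait in hover, track once
    bounding-box information starts arriving."""

    def __init__(self):
        self.mode = Mode.HOVER
        self.last_bbox = None

    def on_bbox(self, bbox):
        # The first bounding box from the prediction module ends the hover.
        self.last_bbox = bbox
        self.mode = Mode.TRACK

    def step(self):
        # In hover, command zero velocity; in track, defer to the control law.
        if self.mode is Mode.HOVER:
            return (0.0, 0.0, 0.0)
        return self.control(self.last_bbox)

    def control(self, bbox):
        # Placeholder for the velocity computation; illustrative constant.
        return (1.0, 0.0, 0.0)
```

Keeping the hover mode explicit is what makes it safe to close the keyboard control node: until a box arrives, the tracker itself commands zero velocity.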
Claims (9)
1. The unmanned aerial vehicle target tracking method based on the deep learning algorithm is characterized by comprising the following steps of:
S1, acquiring existing image data for target tracking of an unmanned aerial vehicle, and constructing a training data set;
S2, constructing a boundary frame prediction model for unmanned aerial vehicle target tracking based on an ANN network, an APN network and an attention mechanism module;
S3, training the boundary frame prediction model constructed in the step S2 by adopting the training data set constructed in the step S1 to obtain a final boundary frame prediction model;
S4, inputting the template image of the target and the search image acquired by the unmanned aerial vehicle in real time into the final boundary frame prediction model obtained in the step S3 to obtain a target boundary frame;
S5, tracking the target by adopting the target boundary frame obtained in the step S4;
and S6, repeating the steps S4-S5, and completing real-time tracking of the target by adopting the unmanned aerial vehicle.
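The loop of steps S4-S6 can be sketched as follows, for illustration only; the `predict` and `follow` interfaces are hypothetical stand-ins for the bounding frame prediction model and the tracking controller, not names from the patent:

```python
def track_realtime(model, template, frames, tracker):
    """Run steps S4-S5 for every incoming frame; repeating the loop
    over the frame stream realises step S6 (continuous tracking)."""
    boxes = []
    for frame in frames:                       # real-time search images
        bbox = model.predict(template, frame)  # S4: bounding frame prediction
        tracker.follow(bbox)                   # S5: track using the box
        boxes.append(bbox)
    return boxes
```

Note that the template image is fixed once at start-up (the template frame of fig. 3), while the search image changes every iteration.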
2. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 1, wherein the step S2 specifically comprises the following steps:
firstly, processing the input template image and search image through a backbone network;
constructing a fourth similarity map and a fifth similarity map according to the outputs of the backbone network, and constructing an APN network feature map based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network;
constructing a channel attention network feature map according to the output of the backbone network and a pooling layer;
calculating a final classification feature map based on the APN network feature map and the channel attention network feature map;
and carrying out classification and regression operations on the final classification feature map to obtain the final bounding box prediction result.
3. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 2, wherein the step S2 specifically comprises the following steps:
A. inputting the template image $z$ and the search image $x$ respectively into a backbone network; the backbone network is an AlexNet network;
B. performing a depthwise cross-correlation operation and a convolution operation on the fourth-layer outputs of the backbone network to obtain a fourth similarity map $S_4 = \phi\left(\varphi_4(z) \star \varphi_4(x)\right)$, wherein $\varphi_4(z)$ is the output of the fourth layer of the backbone network after the template image $z$ is input, $\varphi_4(x)$ is the output of the fourth layer of the backbone network after the search image $x$ is input, $\star$ is the depthwise cross-correlation operation, and $\phi$ is the convolution operation;
performing a convolution operation and a depthwise cross-correlation operation on the fifth-layer outputs of the backbone network to likewise obtain a fifth similarity map $S_5 = \phi\left(\varphi_5(z)\right) \star \phi\left(\varphi_5(x)\right)$;
based on the fourth similarity map, the fifth similarity map, a pooling layer and a forward propagation network, constructing the APN network feature map $F_{APN} = \lambda_1 \cdot S_4 + \lambda_2 \cdot \mathcal{F}\left(\mathrm{GAP}\left(\mathrm{Cat}(S_4, S_5)\right)\right)$, wherein $\lambda_1$ is the first learning weight, $\lambda_2$ is the second learning weight, $\mathcal{F}$ is the forward propagation network, $\mathrm{GAP}$ is the global average pooling operation, and $\mathrm{Cat}$ is concatenation along the channel direction;
C. changing the convolution kernel parameters of the fifth layer in the backbone network to obtain a sixth similarity map $S_6$; at the same time, the fifth similarity map $S_5$ generates three feature maps through three different convolution layers, namely a q feature map $Q$, a k feature map $K$ and a v feature map $V$, with $Q, K, V \in \mathbb{R}^{C \times H \times W}$; then $Q$ and $K$ are reshaped to the scale $\mathbb{R}^{C \times N}$, where $N = H \times W$, a matrix multiplication is performed on the two reshaped matrices, and a spatial attention map $M = \mathrm{softmax}\left(Q^{\mathsf{T}} K\right) \in \mathbb{R}^{N \times N}$ is obtained through a softmax layer, softmax being the exponential normalization function; finally, a spatial attention feature map $F_s = \mu \cdot (V \otimes M) + S_5$ is obtained, wherein $\mu$ is the third learning weight and $\otimes$ denotes multiplication with the attention map after reshaping $V$ to $\mathbb{R}^{C \times N}$; here $C$ is the number of channels, $H$ is the height, and $W$ is the width;
from the spatial attention feature map $F_s$ and a pooling layer, the channel attention network feature map is calculated as $F_c = \nu \cdot \left(\sigma(W) \odot F_s\right)$, wherein $\nu$ is the fourth learning weight, $\sigma$ is a nonlinear activation function calculated as $\sigma(t) = \frac{1}{1+e^{-t}}$, and $W$ is an intermediate feature obtained from the pooling layers as $W = \mathrm{GAP}(F_s) + \mathrm{GMP}(F_s)$, $\mathrm{GMP}$ being the global maximum pooling operation;
D. based on the APN network feature map $F_{APN}$ and the channel attention network feature map $F_c$, the final classification feature map $R$ is calculated as $R = \omega_1 \cdot F_{APN} + \omega_2 \cdot F_c$, wherein $\omega_1$ is the fifth learning weight and $\omega_2$ is the sixth learning weight;
E. performing classification and regression operations on the final classification feature map $R$ to obtain the final bounding box prediction result.
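The depthwise cross-correlation of step B, in which each channel of the template feature acts as a filter sliding over the matching channel of the search feature, can be sketched in NumPy for illustration; the valid-mode correlation and the shapes are assumptions about the operation described, not code from the patent:

```python
import numpy as np

def depthwise_xcorr(z_feat, x_feat):
    """Per-channel 'valid' cross-correlation of a template feature map
    z_feat of shape (C, hz, wz) over a search feature map x_feat of
    shape (C, hx, wx). Returns a similarity map of shape
    (C, hx - hz + 1, wx - wz + 1)."""
    C, hz, wz = z_feat.shape
    _, hx, wx = x_feat.shape
    out = np.zeros((C, hx - hz + 1, wx - wz + 1))
    for c in range(C):                        # one filter per channel
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = x_feat[c, i:i + hz, j:j + wz]
                out[c, i, j] = np.sum(patch * z_feat[c])
    return out
```

In a real tracker this would be done on the GPU (e.g. as a grouped convolution), but the loop form makes the per-channel nature of the operation explicit.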
4. The unmanned aerial vehicle target tracking method based on the deep learning algorithm according to claim 3, wherein the classification and regression operation adopts a three-branch structure; the first branch selects the bounding box with the largest intersection with the actual bounding box; the second branch selects the points on the feature map that fall within the actual bounding box; the last branch considers the center distance between each point and the center point of the actual bounding box; finally, different weights are introduced to balance the branches, giving a total loss function $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, wherein $L_1$ is the cross entropy loss, $L_2$ is the binary cross entropy loss, $L_3$ is the loss of the remaining branch, and $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the respective weights of the three branches.
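For illustration only, a sketch of such a weighted three-branch loss: cross entropy for the classification branch, binary cross entropy for a second branch, and the third branch's loss passed in as a precomputed scalar (treating it as a plain placeholder is an assumption, since the claim does not specify its form):

```python
import numpy as np

def cross_entropy(probs, labels):
    """Multi-class cross entropy over per-point class probabilities.
    probs: (N, num_classes); labels: (N,) integer class indices."""
    eps = 1e-9
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def binary_cross_entropy(p, y):
    """Element-wise binary cross entropy between predictions p and
    binary targets y, averaged over all points."""
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def total_loss(cls_probs, cls_labels, bin_pred, bin_target,
               third_loss, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of the three branch losses."""
    l1, l2, l3 = lambdas
    return (l1 * cross_entropy(cls_probs, cls_labels)
            + l2 * binary_cross_entropy(bin_pred, bin_target)
            + l3 * third_loss)
```

With perfect predictions on the first two branches and a zero third term, the total loss is (numerically) zero, which is a quick sanity check on the implementation.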
5. A simulation system for realizing the unmanned aerial vehicle target tracking method based on the deep learning algorithm according to one of claims 1 to 4, which is characterized by comprising a physical simulation module, a binocular camera, a cradle head and an unmanned aerial vehicle target tracking module; the output end of the physical simulation module is simultaneously connected with the input end of the binocular camera and the input end of the cradle head; the output end of the binocular camera and the output end of the cradle head are simultaneously connected with the unmanned aerial vehicle target tracking module; the physical simulation module is used for sending data information of the virtual world to the binocular camera and the cradle head; after receiving the data information of the virtual world, the binocular camera forwards the image to the unmanned aerial vehicle target tracking module; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the unmanned aerial vehicle target tracking module; the unmanned aerial vehicle target tracking module is used for controlling the unmanned aerial vehicle to track the target according to the unmanned aerial vehicle target tracking method based on the deep learning algorithm.
6. The simulation system according to claim 5, wherein the unmanned aerial vehicle target tracking module comprises a target bounding box prediction module, a flight controller, a flight control information transmission module and a tracker; the output end of the binocular camera is connected with the input end of the target boundary frame prediction module, and the output end of the target boundary frame prediction module is connected with the first input end of the tracker; the output end of the cradle head is connected with the second input end of the tracker; the output end of the flight control information transmission module is connected with the third input end of the tracker; the communication end of the flight control information transmission module is connected with the communication end of the flight controller; the output end of the tracker is connected with the input end of the flight controller; after receiving the data information of the virtual world, the binocular camera forwards the image to the target boundary frame prediction module; the target boundary frame prediction module is used for calculating a target boundary frame and uploading the target boundary frame information to the tracker; the cradle head calculates the relative coordinates of the camera according to the received data information, and sends the relative coordinates of the camera to the tracker; the flight control information transmission module is used for communicating with the flight controller, and simultaneously sends world coordinate information of the unmanned aerial vehicle to the tracker; the tracker calculates linear speed information and angular speed information of the unmanned aerial vehicle in each direction at the next moment according to the received data information and the binocular principle, and sends the information to the flight controller; and the flight controller is used for controlling the unmanned aerial vehicle to track according to the received information.
7. The simulation system of claim 5 or 6, wherein the physical simulation module is a Gazebo simulation platform.
8. The simulation system of claim 6, wherein the flight control information transmission module is a Mavros communication module.
9. The simulation system of claim 6, wherein the flight controller and the flight control information transfer module communicate via a Mavlink protocol.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310604700.0A CN116309727A (en) | 2023-05-26 | 2023-05-26 | Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116309727A true CN116309727A (en) | 2023-06-23 |
Family
ID=86794627
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200117937A1 (en) * | 2018-10-16 | 2020-04-16 | Samsung Electronics Co., Ltd. | Convolutional neural network for object detection |
KR20210099450A (en) * | 2020-02-04 | 2021-08-12 | Korea Maritime & Ocean University Industry-Academic Cooperation Foundation | Far away small drone detection method Using Deep Learning |
CN113470073A (en) * | 2021-07-06 | 2021-10-01 | Zhejiang University | Animal center tracking method based on deep learning |
CN114266805A (en) * | 2021-12-31 | 2022-04-01 | Southwest Petroleum University | Twin region suggestion network model for unmanned aerial vehicle target tracking |
CN115147459A (en) * | 2022-07-31 | 2022-10-04 | Harbin University of Science and Technology | Unmanned aerial vehicle target tracking method based on Swin Transformer |
CN115578416A (en) * | 2022-10-12 | 2023-01-06 | Shandong University | Unmanned aerial vehicle target tracking method, system, medium and electronic equipment |
Non-Patent Citations (3)
Title |
---|
CHANGHONG FU et al.: "Siamese Anchor Proposal Network for High-Speed Aerial Tracking", arXiv, pages 1-7 *
ZIANG CAO et al.: "SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking", arXiv, pages 1-7 *
LIU Qisheng: "Design and Implementation of a Vision-Based Quadrotor UAV Target Tracking System", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 01, pages 12-22 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102306939B1 (en) | Method and device for short-term path planning of autonomous driving through information fusion by using v2x communication and image processing | |
CN106873585B (en) | A kind of navigation method for searching, robot and system | |
EP3405845B1 (en) | Object-focused active three-dimensional reconstruction | |
WO2022100107A1 (en) | Methods and systems for predicting dynamic object behavior | |
CN111340868B (en) | Unmanned underwater vehicle autonomous decision control method based on visual depth estimation | |
CN111709410B (en) | Behavior identification method for strong dynamic video | |
CN110908399B (en) | Unmanned aerial vehicle autonomous obstacle avoidance method and system based on lightweight neural network | |
CN111368755A (en) | Vision-based pedestrian autonomous following method for quadruped robot | |
CN114237235B (en) | Mobile robot obstacle avoidance method based on deep reinforcement learning | |
CN112651374B (en) | Future trajectory prediction method based on social information and automatic driving system | |
CN108320051B (en) | Mobile robot dynamic collision avoidance planning method based on GRU network model | |
CN106973221A (en) | Unmanned plane image capture method and system based on aesthetic evaluation | |
CN113189983A (en) | Open scene-oriented multi-robot cooperative multi-target sampling method | |
CN112114592B (en) | Method for realizing autonomous crossing of movable frame-shaped barrier by unmanned aerial vehicle | |
CN114719848A (en) | Unmanned aerial vehicle height estimation method based on neural network fused with visual and inertial navigation information | |
CN112863186A (en) | Vehicle-mounted unmanned aerial vehicle-based escaping vehicle rapid identification and tracking method | |
Zhang et al. | A convolutional neural network method for self-driving cars | |
CN116309727A (en) | Unmanned aerial vehicle target tracking method and simulation system based on deep learning algorithm | |
CN114326821B (en) | Unmanned aerial vehicle autonomous obstacle avoidance system and method based on deep reinforcement learning | |
Khalil et al. | Integration of motion prediction with end-to-end latent RL for self-driving vehicles | |
CN116385909A (en) | Unmanned aerial vehicle target tracking method based on deep reinforcement learning | |
CN114779821B (en) | Unmanned aerial vehicle self-adaptive repulsive force coefficient path planning method based on deep learning | |
CN114518762B (en) | Robot obstacle avoidance device, obstacle avoidance control method and robot | |
CN112857373B (en) | Energy-saving unmanned vehicle path navigation method capable of minimizing useless actions | |
CN114326826B (en) | Multi-unmanned aerial vehicle formation transformation method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20230623 |