CN116385498A - Target tracking method and system based on artificial intelligence - Google Patents


Info

Publication number
CN116385498A
Authority
CN
China
Prior art keywords
target
frame
representing
detection
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310653926.XA
Other languages
Chinese (zh)
Inventor
钟为金
崔雄文
王维
路航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dfine Technology Co Ltd
Original Assignee
Dfine Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dfine Technology Co Ltd filed Critical Dfine Technology Co Ltd
Priority to CN202310653926.XA priority Critical patent/CN116385498A/en
Publication of CN116385498A publication Critical patent/CN116385498A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention relates to the technical field of target tracking and discloses a target tracking method and system based on artificial intelligence. The method performs fused target tracking based on a YOLOv4 target detection algorithm and a KCF fast tracking algorithm: target detection is performed by the YOLOv4 algorithm; a predicted target position is acquired by the KCF algorithm while the current frame is taken as the input image of the YOLOv4 target detection model for target retrieval, so as to obtain accurate target detection position and scale information and realize target tracking. The invention solves the problems of low tracking success rate, low efficiency and the like in the prior art.

Description

Target tracking method and system based on artificial intelligence
Technical Field
The invention relates to the technical field of target tracking, in particular to a target tracking method and system based on artificial intelligence.
Background
Currently, common target tracking algorithms include the YOLOv4 algorithm (a deep-learning regression detection algorithm) and the KCF algorithm (a kernelized correlation filter tracking algorithm). The YOLOv4 algorithm detects and tracks complete targets well in simple scenes, is robust to scale change and deformation, and can cope with target occlusion and fast maneuvering, but it can only detect and track known target classes, and its detection and tracking performance still needs improvement for distant, small, or weakly characterized targets. The KCF algorithm, by contrast, does not need to know the target class, but when the target undergoes scale change, occlusion, or fast motion, a large amount of background information is introduced during sampling and errors accumulate as the model is updated, so the tracking box drifts and the target is lost. To address these shortcomings of the prior art, the invention fuses the YOLOv4 and KCF algorithms so that each compensates for the other's weaknesses while retaining its strengths, improving the success rate of the tracking algorithm and showing stronger robustness in complex scenes.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a target tracking method and a target tracking system based on artificial intelligence, which solve the problems of low tracking success rate, low efficiency and the like in the prior art.
The invention solves the problems by adopting the following technical scheme:
an artificial intelligence-based target tracking method performs fused target tracking based on a YOLOv4 target detection algorithm and a KCF fast tracking algorithm: target detection is performed by the YOLOv4 algorithm; a predicted target position is acquired by the KCF algorithm while the current frame is taken as the input image of the YOLOv4 target detection model for target retrieval, so as to obtain accurate target detection position and scale information and realize target tracking.
As a preferred technical scheme, the method comprises the following steps:
S1, generating a target detection model: collecting pictures online and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model by using the training data set to generate a YOLOv4 target detection model;
S2, KCF target tracker training: detecting a target by using the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
S3, target tracking: tracking the target by adopting the KCF target tracker to obtain a predicted position; meanwhile, taking the current frame as an input image of the YOLOv4 target detection model, and carrying out target retrieval to obtain the detected target position and scale information;
S4, confirming the target position: calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold value, using the detected target position as the target position of the current frame; if the intersection ratio is larger than the preset threshold value or no target is detected, using the predicted position as the target position of the current frame, and using the target position and scale information of the current frame to update the KCF target tracker;
S5, video target tracking: repeating steps S3 to S4 for the next frame of image to realize tracking of the target (an illustrative code sketch of this loop is given below).
In step S1, the YOLOv4 deep learning model includes a backbone network with a CSP structure (CSPDarknet53), a spatial pyramid pooling layer, a path aggregation network, and YOLO heads; the backbone network is used for extracting features from the original image and outputting feature maps at 3 scales; the spatial pyramid pooling layer and the path aggregation network are used for performing feature fusion on the feature maps of the 3 scales; the YOLO heads are used for making predictions on the fused feature maps.
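The following minimal Python sketch illustrates how steps S1 to S5 above can be arranged as a per-frame loop. It is only an illustration of the method as described, not an implementation from the patent: yolo_detect, kcf_init, kcf_predict, kcf_update and box_iou are hypothetical placeholder functions standing in for the YOLOv4 detector, the KCF tracker and the intersection-ratio computation, and the threshold logic mirrors step S4 as stated.

```python
# Illustrative sketch of the fused YOLOv4 + KCF loop (steps S2-S5).
# All helper functions passed in are hypothetical placeholders, not a real implementation.
def track_video(frames, yolo_detect, kcf_init, kcf_predict, kcf_update, box_iou,
                iou_thresh=0.5):
    first_box = yolo_detect(frames[0])            # S2: detect the target in the first frame
    tracker = kcf_init(frames[0], first_box)      # S2: initialise and train the KCF tracker
    boxes = [first_box]
    for frame in frames[1:]:
        pred_box = kcf_predict(tracker, frame)    # S3: KCF predicted position
        det_box = yolo_detect(frame)              # S3: YOLOv4 retrieval on the same frame
        # S4: confirm the target position using the stated intersection-ratio rule
        if det_box is not None and box_iou(det_box, pred_box) < iou_thresh:
            box = det_box                         # use the detected position
        else:
            box = pred_box                        # use the predicted position ...
            kcf_update(tracker, frame, box)       # ... and update the KCF tracker
        boxes.append(box)                         # S5: repeat for every frame
    return boxes
```

In this sketch the tracker is only re-trained in the branch where the predicted position is kept, following the wording of step S4.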
As a preferred technical solution, step S3 includes the following steps:
S31, establishing a KCF tracker model: establishing the objective function and optimization goal of the KCF tracker model;
S32, online matching: obtaining the frequency-domain representation of the response value from the sampled sample and the training sample;
S33, updating the KCF tracker template: updating the KCF tracker model parameters.
As a preferred technical solution, in step S31, the objective function is established by the ridge regression method:

$$f(x_i) = w^{T} x_i$$

the goal is to minimize the distance between the sampled data and the real target position of the next frame, and the expression is:

$$\min_{w} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2 + \lambda \lVert w \rVert^{2}$$

where $x$ denotes the sample variable, $n$ denotes the number of sample data, $x_i$ denotes the sample feature with index $i$, $(\cdot)^{H}$ denotes the conjugate transpose, $f(\cdot)$ denotes the objective function, $\lambda$ denotes the regularization parameter, $w$ denotes the column vector of weight coefficients, $\lVert w \rVert^{2}$ denotes the regularization term, and $y_i$ denotes the label value of the sample feature $x_i$;

differentiating with respect to $w$, setting the derivative to 0, and writing the minimizer of the loss function in complex-domain form gives:

$$w = \left( X^{H} X + \lambda I \right)^{-1} X^{H} y$$

where $X$ denotes the matrix whose rows are the sample vectors, $y$ denotes the column vector of labels, and $X^{H}$ denotes the transposed matrix of the complex conjugate of $X$;

using the diagonalization property of the circulant matrix, the representation of $w$ in the Fourier domain is:

$$\hat{w} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda}$$

where $\hat{x}$ denotes the representation of $x$ in the frequency domain, $\hat{y}$ denotes the representation of $y$ in the frequency domain, and $\odot$ denotes element-wise multiplication.
As a preferred embodiment, in step S31, a Gaussian kernel function $\kappa(x, x')$ is introduced to transform the solution of $w$ into the solution of the high-dimensional weight $\alpha$:

$$\alpha = \left( K + \lambda I \right)^{-1} y$$

where $K$ denotes the kernel matrix of the kernel space;

the representation of $\alpha$ in the frequency domain is:

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}$$

where $\hat{\alpha}$ denotes the representation of $\alpha$ in the frequency domain and $\hat{k}^{xx}$ denotes the Fourier transform of the first row of $K$.
As a preferred embodiment, in step S32, the frequency-domain representation of the response value is:

$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$$

where $\hat{k}^{xz}$ denotes the Fourier transform of the first row $k^{xz}$ of the kernel matrix representing the similarity between the sampled sample and the training sample; the inverse Fourier transform converts $\hat{f}(z)$ from the frequency domain to the time domain $f(z)$, and the position where $f(z)$ attains its maximum is the target position.
As a preferred technical solution, in step S33, the model parameters at past times are sampled and combined, and are added to the model parameter update process by linear interpolation; the update formulas are:

$$\hat{x}_{\text{new}} = (1 - \eta)\,\hat{x}_{\text{old}} + \eta\,\hat{x}$$

$$\hat{\alpha}_{\text{new}} = (1 - \eta)\,\hat{\alpha}_{\text{old}} + \eta\,\hat{\alpha}$$

where $\hat{x}_{\text{new}}$ denotes the new training sample set, $\eta$ denotes the update step size, $\hat{x}_{\text{old}}$ denotes the old sample set, $\alpha$ denotes the filter parameters, $x$ denotes the training sample set, $\hat{\alpha}_{\text{old}}$ denotes the old filter parameters, and $\hat{\cdot}$ denotes the representation in the frequency domain.
As a preferable technical solution, in step S4, the intersection ratio of the detected position and the predicted position of the target is calculated as:

$$x_5 = \max(x_1, x_3), \qquad y_5 = \max(y_1, y_3)$$

$$x_6 = \min(x_2, x_4), \qquad y_6 = \min(y_2, y_4)$$

$$S_T = l_T \times w_T, \qquad S_D = l_D \times w_D$$

$$S_I = \max(0,\, x_6 - x_5) \times \max(0,\, y_6 - y_5)$$

$$IoU = \frac{S_I}{S_T + S_D - S_I}$$

where $IoU$ denotes the intersection ratio of the detected position and the predicted position of the target, $S_T$ denotes the area of the target frame, $S_D$ denotes the area of the detection frame, $(x_1, y_1)$ denotes the vertex of the target frame lying on the rectangle where the target frame intersects the detection frame, $(x_2, y_2)$ denotes the vertex of the target frame on the same diagonal as that vertex, $(x_3, y_3)$ denotes the vertex of the detection frame lying on the rectangle where the target frame intersects the detection frame, $(x_4, y_4)$ denotes the vertex of the detection frame on the same diagonal as that vertex, $(x_5, y_5)$ and $(x_6, y_6)$ denote the two diagonal points of the rectangle where the target frame intersects the detection frame, $l_T$ and $w_T$ denote the length and width of the target frame, and $l_D$ and $w_D$ denote the length and width of the detection frame.
An artificial intelligence-based target tracking system for realizing the artificial intelligence-based target tracking method comprises the following modules connected in sequence:
the target detection model generation module: used for collecting pictures offline and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model by using the training data set to generate a YOLOv4 target detection model;
the KCF target tracker training module: used for detecting a target by using the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
the target tracking module: used for tracking the target by adopting the KCF target tracker to obtain a predicted position, and meanwhile taking the current frame as an input image of the YOLOv4 target detection model and carrying out target retrieval to obtain the detected target position and scale information;
the target position confirmation module: used for calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold value, the detected target position is used as the target position of the current frame; if the intersection ratio is larger than the preset threshold value or no target is detected, the predicted position is used as the target position of the current frame, and the target position and scale information of the current frame is used to update the KCF target tracker;
the video target tracking module: used for repeating the target tracking and the target position confirmation for the next frame of image, so as to realize tracking of the target.
Compared with the prior art, the invention has the following beneficial effects:
(1) The YOLOv4 adopted by the invention is an end-to-end real-time target detection algorithm based on deep learning; tested on the MS COCO data set with a Tesla V100 graphics card, it reaches an accuracy of 43.5% AP (65.7% AP50) at a speed of 65 FPS, improving detection accuracy (AP) and speed (FPS) by about 10% and 12% respectively compared with EfficientDet and SpineNet, a remarkable improvement; YOLOv4 extracts target features through a deep convolutional network, so weak and small targets in photoelectric images can be detected effectively; in addition, YOLOv4 performs multi-scale target detection, which overcomes the influence of target scale change during detection and improves the accuracy and robustness of target detection;
(2) The KCF algorithm adopted by the invention also makes extensive use of background information; its classifier distinguishes background from target with high accuracy and is robust in complex environments, performing excellently on current public data sets; meanwhile, the KCF algorithm adopts an online training strategy, so a large number of target samples do not need to be prepared in advance to train the model.
Drawings
FIG. 1 is a schematic diagram of steps of an artificial intelligence based target tracking method according to the present invention;
FIG. 2 is a schematic diagram showing the intersection ratio of the detected position and the predicted position of the target.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
As shown in fig. 1 to fig. 2, the invention is mainly applied to an anti-unmanned aerial vehicle system, in which the radar and the radio monitoring system are responsible for searching for and finding the target, and the photoelectric system controls the pan-tilt lens according to the target angle and distance data provided by the radar to complete the tasks of target detection, target locking and target tracking; the invention mainly provides the algorithm implementation for rapid target tracking and target locking.
In order to solve the problems in the prior art, the invention provides an artificial-intelligence-based unmanned aerial vehicle target tracking method, namely a fused target tracking algorithm based on the YOLOv4 target detection algorithm and the KCF fast tracking algorithm, which can effectively improve the success rate and efficiency of the tracking algorithm.
The target detection of the invention adopts YOLOv4 to assist target searching, locking and automatic tracking. YOLOv4 is an end-to-end deep learning regression detection algorithm and is so far the target detection algorithm with the best balance of speed and accuracy. By integrating a number of advanced methods, it makes up for the shortcomings of the YOLO series (for example, weaker detection of small objects) while keeping its high speed, achieving remarkable accuracy at a very high frame rate. Compared with the mainstream target detection algorithms EfficientDet and SpineNet, YOLOv4, an end-to-end real-time target detection algorithm based on deep learning, reaches an accuracy of 43.5% AP (65.7% AP50) at a speed of 65 FPS when tested on the MS COCO data set with a Tesla V100 graphics card, improving detection accuracy (AP) and speed (FPS) by about 10% and 12% respectively, a remarkable improvement. YOLOv4 extracts target features through a deep convolutional network, so weak and small targets in photoelectric images can be detected effectively; in addition, YOLOv4 performs multi-scale target detection, which overcomes the influence of target scale change during detection and improves the accuracy and robustness of target detection.
The target tracking of the invention predicts the size and position of the target in subsequent frames, given the target size and position in the initial frame of a video sequence. The target is detected with the trained YOLOv4 target detection model to acquire the target position and scale information, and the KCF target tracker is initialized and trained. The KCF algorithm adopts an online training strategy, so a large number of target samples do not need to be prepared in advance to train the model. During tracking, the target tracker is trained on the current frame of the video, the target position of the next frame is determined with the tracker, and the tracker is then updated with the new target position; continuous tracking of the target is thus realized through iteration. The fused target tracking algorithm based on the YOLOv4 and KCF algorithms comprises the following steps (see fig. 1 for details):
Step S1: collecting pictures online and performing manual target labeling to form a training data set, and training the YOLOv4 deep learning model with the training data set to obtain the target detection model. The YOLOv4 deep learning model mainly consists of three parts, namely CSPDarknet53 (CSP network structure), SPP (spatial pyramid pooling layer) + PANet (path aggregation network) and the YOLO Head: CSPDarknet53 serves as the backbone network of the YOLOv4 algorithm and is responsible for extracting features from the original image and outputting feature maps at 3 scales; SPP + PANet is responsible for fusing the 3-scale feature maps extracted by the backbone network; the YOLO Head makes predictions on the fused feature maps. The model is an end-to-end real-time target detection algorithm based on deep learning and provides both high detection accuracy and high speed;
Step S2: detecting the specific target by using the trained target detection model, acquiring the target position and scale information, initializing the KCF target tracker, and training the KCF target tracker;
Step S3: tracking the specific target by adopting the KCF algorithm to obtain a predicted position; meanwhile, taking the current frame as the input image of the YOLOv4 target detection model and carrying out target retrieval to obtain accurate target detection position and scale information. The KCF algorithm consists of 3 links: model establishment, online matching and template updating.
The first link: model establishment. The objective function is established by ridge regression:

$$f(x_i) = w^{T} x_i$$

the goal being to minimize the distance of the sampled data from the real target position of the next frame:

$$\min_{w} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2 + \lambda \lVert w \rVert^{2}$$

where $x$ denotes the sample variable, $n$ denotes the number of sample data, $x_i$ denotes the sample feature with index $i$, $f(\cdot)$ denotes the objective function, $\lambda$ denotes the regularization parameter, $w$ denotes the column vector of weight coefficients, $\lVert w \rVert^{2}$ denotes the regularization term, and $y_i$ denotes the label value of the sample feature $x_i$; the regularization term prevents the model from overfitting.

Differentiating with respect to $w$ and setting the derivative to 0, the minimizer of the loss function is obtained:

$$w = \left( X^{H} X + \lambda I \right)^{-1} X^{H} y$$

where $X$ denotes the matrix whose rows are the sample vectors, $y$ denotes the column vector of labels, and $X^{H}$ denotes the transposed matrix of the complex conjugate of $X$. Using the diagonalization property of the circulant matrix, the representation of $w$ in the Fourier domain is derived:

$$\hat{w} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda}$$

where $\hat{x}$ denotes the representation of $x$ in the frequency domain, $\hat{y}$ denotes the representation of $y$ in the frequency domain, and $\odot$ denotes element-wise multiplication.
For most cases the solution of $w$ is a nonlinear problem. By introducing a Gaussian kernel function $\kappa(x, x')$, the solution of $w$ is converted into the solution of the high-dimensional weight $\alpha$ in a high-dimensional space:

$$\alpha = \left( K + \lambda I \right)^{-1} y$$

where $K$ denotes the kernel matrix of the kernel space. The representation of $\alpha$ in the frequency domain is:

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}$$

where $\hat{k}^{xx}$ is the Fourier transform of the first row of the kernel matrix $K$.
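A minimal NumPy sketch of this training step is given below, assuming a single-channel image patch x (already windowed) and a Gaussian-shaped label map y. The gaussian_correlation helper follows the usual KCF form of the Gaussian kernel correlation and, like the function and parameter names, is an assumption of this sketch rather than text from the patent.

```python
import numpy as np

def gaussian_correlation(x1, x2, sigma=0.5):
    """Gaussian kernel correlation between two equally sized patches (standard KCF form)."""
    c = np.real(np.fft.ifft2(np.fft.fft2(x1) * np.conj(np.fft.fft2(x2))))  # circular cross-correlation
    d = np.sum(x1 ** 2) + np.sum(x2 ** 2) - 2.0 * c
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x1.size))

def kcf_train(x, y, lam=1e-4):
    """Filter parameters in the Fourier domain: alpha_hat = y_hat / (k_hat_xx + lambda)."""
    k_xx = gaussian_correlation(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k_xx) + lam)
```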
The second link: online matching. $k^{xz}$ is defined as the first row of the kernel matrix that represents the similarity between the sampled sample and the training sample in the kernel space. The correlation operation between the sampled sample and the training sample gives the frequency-domain representation of the response value:

$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$$

where $\hat{k}^{xz}$ is the Fourier transform of $k^{xz}$; the inverse Fourier transform converts $\hat{f}(z)$ from the frequency domain to the time domain $f(z)$, and the position where $f(z)$ attains its maximum is the sought position.
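A matching sketch of this step, reusing the gaussian_correlation helper and the alpha_hat produced by the training sketch above (both are assumptions of these sketches, not code from the patent):

```python
import numpy as np

def kcf_detect(alpha_hat, x_model, z, gaussian_correlation):
    """Response map f(z) = IFFT(k_hat_xz * alpha_hat); its peak gives the predicted position."""
    k_xz = gaussian_correlation(x_model, z)                       # similarity of new sample z to the model
    response = np.real(np.fft.ifft2(np.fft.fft2(k_xz) * alpha_hat))
    peak = np.unravel_index(np.argmax(response), response.shape)  # (row, col) of the maximum
    return response, peak
```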
The third link: template updating. The KCF tracker template updating process mainly updates the filter parameters $\hat{\alpha}$ and the training sample set $\hat{x}$. After the algorithm is executed, a new target prediction position is obtained, a new base sample is taken there, and a circulant matrix is generated from it to obtain a new sample set $\hat{x}$; new filter parameters $\hat{\alpha}$ are then obtained by training. Finally, the tracker is updated with the model parameters of the previous frame by linear interpolation with update step size $\eta$: the model parameters of past frames are sampled and combined and added to the model parameter update process:

$$\hat{x}_{\text{new}} = (1 - \eta)\,\hat{x}_{\text{old}} + \eta\,\hat{x}$$

$$\hat{\alpha}_{\text{new}} = (1 - \eta)\,\hat{\alpha}_{\text{old}} + \eta\,\hat{\alpha}$$

where $\hat{x}_{\text{new}}$ denotes the new training sample set, $\eta$ denotes the update step size, $\hat{x}_{\text{old}}$ denotes the old sample set, $\alpha$ denotes the filter parameters, $x$ denotes the training sample set, $\hat{\alpha}_{\text{old}}$ denotes the old filter parameters, and $\hat{\cdot}$ denotes the representation in the frequency domain.
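A one-function sketch of this linear-interpolation update, where eta stands for the update step size η (the variable names are assumptions of the sketch):

```python
def kcf_update_model(x_old, alpha_hat_old, x_new, alpha_hat_new, eta=0.02):
    """Blend old and new templates/filters: new_model = (1 - eta) * old + eta * current."""
    x_model = (1.0 - eta) * x_old + eta * x_new
    alpha_hat_model = (1.0 - eta) * alpha_hat_old + eta * alpha_hat_new
    return x_model, alpha_hat_model
```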
Step S4: calculating the intersection ratio of the detection position and the prediction position of the target in the current frame image
$IoU$. The intersection ratio refers to the ratio of the area of the intersection of the target frame and the detection frame to the area of their union. Defining the diagonal coordinates of rectangle A and rectangle B as $(x_{A1}, y_{A1})$, $(x_{A2}, y_{A2})$ and $(x_{B1}, y_{B1})$, $(x_{B2}, y_{B2})$, and defining the diagonal coordinates of the intersection rectangle as $(x_{I1}, y_{I1})$, $(x_{I2}, y_{I2})$, the diagonal coordinates of the intersection rectangle are calculated as follows:

$$x_{I1} = \max(x_{A1}, x_{B1}), \qquad y_{I1} = \max(y_{A1}, y_{B1})$$

$$x_{I2} = \min(x_{A2}, x_{B2}), \qquad y_{I2} = \min(y_{A2}, y_{B2})$$

The intersection and union are then calculated as follows:

$$S_I = \max(0,\, x_{I2} - x_{I1}) \times \max(0,\, y_{I2} - y_{I1})$$

$$S_A = (x_{A2} - x_{A1}) \times (y_{A2} - y_{A1}), \qquad S_B = (x_{B2} - x_{B1}) \times (y_{B2} - y_{B1})$$

$$S_U = S_A + S_B - S_I, \qquad IoU = \frac{S_I}{S_U}$$

The schematic diagram is shown in fig. 2.
The above calculation gives the intersection ratio of the detected position and the predicted position of the target. If the intersection ratio is smaller than the preset threshold, the detected target position is used as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, the predicted position is used as the target position of the current frame, and the KCF target tracker is updated with the target position and scale information of the current frame (a short code sketch of this decision rule follows);
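A short Python sketch of the intersection-ratio computation and the step-S4 decision rule described above; boxes are assumed to be given by their diagonal coordinates (x1, y1, x2, y2), and the threshold value is a placeholder:

```python
def box_iou(a, b):
    """Intersection ratio of rectangles a and b given by diagonal coordinates (x1, y1, x2, y2)."""
    xi1, yi1 = max(a[0], b[0]), max(a[1], b[1])      # upper-left corner of the intersection
    xi2, yi2 = min(a[2], b[2]), min(a[3], b[3])      # lower-right corner of the intersection
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def confirm_position(det_box, pred_box, thresh=0.5):
    """Step-S4 rule as stated: the detection wins when the intersection ratio is below the threshold."""
    if det_box is not None and box_iou(det_box, pred_box) < thresh:
        return det_box, False       # use the detected position; no tracker update
    return pred_box, True           # use the predicted position and update the KCF tracker
```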
step S5: and repeating the steps S3 to S4 for the next frame of image to realize the tracking of the video target.
By adopting the fused target tracking algorithm based on the YOLOv4 target detection algorithm and the KCF fast tracking algorithm, the invention obtains the best results in detection accuracy and tracking stability and has strong robustness and real-time performance, so the target can be tracked and locked quickly, efficiently and reliably, effectively improving the overall performance of the anti-unmanned aerial vehicle system. The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.
As described above, the present invention can be preferably implemented.
All of the features disclosed in all of the embodiments of this specification, or all of the steps in any method or process disclosed implicitly, except for the mutually exclusive features and/or steps, may be combined and/or expanded and substituted in any way.
The foregoing description of the preferred embodiment of the invention is not intended to limit the invention in any way, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the invention.

Claims (10)

1. A target tracking method based on artificial intelligence, characterized in that fused target tracking is performed based on a YOLOv4 target detection algorithm and a KCF fast tracking algorithm: target detection is performed by the YOLOv4 algorithm; a predicted target position is acquired by the KCF algorithm while the current frame is taken as the input image of the YOLOv4 target detection model for target retrieval, so as to obtain accurate target detection position and scale information and realize target tracking.
2. The artificial intelligence based object tracking method of claim 1, comprising the steps of:
S1, generating a target detection model: collecting pictures online and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model by using the training data set to generate a YOLOv4 target detection model;
S2, KCF target tracker training: detecting a target by using the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
S3, target tracking: tracking the target by adopting the KCF target tracker to obtain a predicted position; meanwhile, taking the current frame as an input image of the YOLOv4 target detection model, and carrying out target retrieval to obtain the detected target position and scale information;
S4, confirming the target position: calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold value, using the detected target position as the target position of the current frame; if the intersection ratio is larger than the preset threshold value or no target is detected, using the predicted position as the target position of the current frame, and using the target position and scale information of the current frame to update the KCF target tracker;
S5, video target tracking: repeating steps S3 to S4 for the next frame of image to realize tracking of the target.
3. The method of claim 2, wherein in step S1, the YOLOv4 deep learning model comprises a backbone network with a CSP structure, a spatial pyramid pooling layer and path aggregation network, and YOLO heads; the backbone network is used for extracting features from the original image and outputting feature maps at 3 scales; the spatial pyramid pooling layer and the path aggregation network are used for performing feature fusion on the feature maps of the 3 scales; the YOLO heads are used for making predictions on the fused feature maps.
4. An artificial intelligence based object tracking method according to claim 3, characterised in that step S3 comprises the steps of:
S31, establishing a KCF tracker model: establishing the objective function and optimization goal of the KCF tracker model;
S32, online matching: obtaining the frequency-domain representation of the response value from the sampled sample and the training sample;
S33, updating the KCF tracker template: updating the KCF tracker model parameters.
5. The artificial intelligence based object tracking method according to claim 4, wherein in step S31, the objective function is established by the ridge regression method:

$$f(x_i) = w^{T} x_i$$

the goal is to minimize the distance between the sampled data and the real target position of the next frame, and the expression is:

$$\min_{w} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2 + \lambda \lVert w \rVert^{2}$$

where $x$ denotes the sample variable, $n$ denotes the number of sample data, $x_i$ denotes the sample feature with index $i$, $(\cdot)^{H}$ denotes the conjugate transpose, $f(\cdot)$ denotes the objective function, $\lambda$ denotes the regularization parameter, $w$ denotes the column vector of weight coefficients, $\lVert w \rVert^{2}$ denotes the regularization term, and $y_i$ denotes the label value of the sample feature $x_i$;

differentiating with respect to $w$, setting the derivative to 0, and writing the minimizer of the loss function in complex-domain form gives:

$$w = \left( X^{H} X + \lambda I \right)^{-1} X^{H} y$$

where $X$ denotes the matrix whose rows are the sample vectors, $y$ denotes the column vector of labels, and $X^{H}$ denotes the transposed matrix of the complex conjugate of $X$;

using the diagonalization property of the circulant matrix, the representation of $w$ in the Fourier domain is:

$$\hat{w} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda}$$

where $\hat{x}$ denotes the representation of $x$ in the frequency domain, $\hat{y}$ denotes the representation of $y$ in the frequency domain, and $\odot$ denotes element-wise multiplication.
6. The artificial intelligence based object tracking method of claim 5, wherein in step S31, a Gaussian kernel function $\kappa(x, x')$ is introduced to transform the solution of $w$ into the solution of the high-dimensional weight $\alpha$:

$$\alpha = \left( K + \lambda I \right)^{-1} y$$

where $K$ denotes the kernel matrix of the kernel space;

the representation of $\alpha$ in the frequency domain is:

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}$$

where $\hat{\alpha}$ denotes the representation of $\alpha$ in the frequency domain and $\hat{k}^{xx}$ denotes the Fourier transform of the first row of $K$.
7. The artificial intelligence based object tracking method of claim 6, wherein in step S32, the frequency-domain representation of the response value is:

$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$$

where $\hat{k}^{xz}$ denotes the Fourier transform of the first row $k^{xz}$ of the kernel matrix representing the similarity between the sampled sample and the training sample; the inverse Fourier transform converts $\hat{f}(z)$ from the frequency domain to the time domain $f(z)$, and the position where $f(z)$ attains its maximum is the target position.
8. The method according to claim 7, wherein in step S33, the model parameters at past times are sampled and combined and are added to the model parameter update process by linear interpolation; the update formulas are:

$$\hat{x}_{\text{new}} = (1 - \eta)\,\hat{x}_{\text{old}} + \eta\,\hat{x}$$

$$\hat{\alpha}_{\text{new}} = (1 - \eta)\,\hat{\alpha}_{\text{old}} + \eta\,\hat{\alpha}$$

where $\hat{x}_{\text{new}}$ denotes the new training sample set, $\eta$ denotes the update step size, $\hat{x}_{\text{old}}$ denotes the old sample set, $\alpha$ denotes the filter parameters, $x$ denotes the training sample set, $\hat{\alpha}_{\text{old}}$ denotes the old filter parameters, and $\hat{\cdot}$ denotes the representation in the frequency domain.
9. The method according to any one of claims 2 to 8, wherein in step S4, the calculation formula of the intersection ratio of the detected position and the predicted position of the target is:

$$x_5 = \max(x_1, x_3), \qquad y_5 = \max(y_1, y_3)$$

$$x_6 = \min(x_2, x_4), \qquad y_6 = \min(y_2, y_4)$$

$$S_T = l_T \times w_T, \qquad S_D = l_D \times w_D$$

$$S_I = \max(0,\, x_6 - x_5) \times \max(0,\, y_6 - y_5)$$

$$IoU = \frac{S_I}{S_T + S_D - S_I}$$

where $IoU$ denotes the intersection ratio of the detected position and the predicted position of the target, $S_T$ denotes the area of the target frame, $S_D$ denotes the area of the detection frame, $(x_1, y_1)$ denotes the vertex of the target frame lying on the rectangle where the target frame intersects the detection frame, $(x_2, y_2)$ denotes the vertex of the target frame on the same diagonal as that vertex, $(x_3, y_3)$ denotes the vertex of the detection frame lying on the rectangle where the target frame intersects the detection frame, $(x_4, y_4)$ denotes the vertex of the detection frame on the same diagonal as that vertex, $(x_5, y_5)$ and $(x_6, y_6)$ denote the two diagonal points of the rectangle where the target frame intersects the detection frame, $l_T$ and $w_T$ denote the length and width of the target frame, and $l_D$ and $w_D$ denote the length and width of the detection frame.
10. An artificial intelligence based object tracking system for implementing the artificial intelligence based object tracking method according to any one of claims 1 to 9, comprising the following modules connected in sequence:
a target detection model generation module: used for collecting pictures offline and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model by using the training data set to generate a YOLOv4 target detection model;
a KCF target tracker training module: used for detecting a target by using the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
a target tracking module: used for tracking the target by adopting the KCF target tracker to obtain a predicted position, and meanwhile taking the current frame as an input image of the YOLOv4 target detection model and carrying out target retrieval to obtain the detected target position and scale information;
a target position confirmation module: used for calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold value, the detected target position is used as the target position of the current frame; if the intersection ratio is larger than the preset threshold value or no target is detected, the predicted position is used as the target position of the current frame, and the target position and scale information of the current frame is used to update the KCF target tracker;
a video target tracking module: used for repeating the target tracking and the target position confirmation for the next frame of image, so as to realize tracking of the target.
CN202310653926.XA 2023-06-05 2023-06-05 Target tracking method and system based on artificial intelligence Pending CN116385498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310653926.XA CN116385498A (en) 2023-06-05 2023-06-05 Target tracking method and system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310653926.XA CN116385498A (en) 2023-06-05 2023-06-05 Target tracking method and system based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN116385498A true CN116385498A (en) 2023-07-04

Family

ID=86977282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310653926.XA Pending CN116385498A (en) 2023-06-05 2023-06-05 Target tracking method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116385498A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111292355A (en) * 2020-02-12 2020-06-16 江南大学 Nuclear correlation filtering multi-target tracking method fusing motion information
CN111582349A (en) * 2020-04-30 2020-08-25 陕西师范大学 Improved target tracking algorithm based on YOLOv3 and kernel correlation filtering

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Wei: "Research on vision-based unmanned aerial vehicle recognition and tracking technology", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 1, pages 031-816 *
Xie Zikun et al.: "Analysis of a target tracking fusion algorithm based on YOLO V4-tiny and KCF", Electronic Technology, vol. 51, no. 10, pages 309-311 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230704