CN116385498A - Target tracking method and system based on artificial intelligence - Google Patents
Target tracking method and system based on artificial intelligence
- Publication number
- CN116385498A (application number CN202310653926.XA)
- Authority
- CN
- China
- Prior art keywords
- target
- frame
- representing
- detection
- tracking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/08—Learning methods
- G06T7/70—Determining position or orientation of objects or cameras
- G06T2207/10016—Video; Image sequence
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- Y02T10/40—Engine management systems
Abstract
The invention relates to the technical field of target tracking and discloses a target tracking method and system based on artificial intelligence. The method performs fusion target tracking based on the YOLOv4 target detection algorithm and the KCF rapid tracking algorithm: the YOLOv4 algorithm performs target detection; the KCF algorithm obtains the predicted target position, while the current frame is simultaneously used as the input image of the YOLOv4 target detection model for target retrieval, yielding accurate target detection position and scale information and thereby realizing target tracking. The invention solves the prior-art problems of low tracking success rate and low efficiency.
Description
Technical Field
The invention relates to the technical field of target tracking, in particular to a target tracking method and system based on artificial intelligence.
Background
Currently, common target tracking algorithms include the YOLOv4 algorithm (a deep-learning regression detection algorithm) and the KCF algorithm (a kernelized correlation filter tracking algorithm). The YOLOv4 algorithm detects and tracks complete targets well in simple scenes, is robust to scale change and deformation, and can cope with target occlusion and high-speed maneuvering, but it can only detect and track known target classes, and its detection and tracking performance degrades at long range, for small targets, and when target features are not distinct. The KCF algorithm, by contrast, does not need to know the target class, but when the target undergoes scale change, occlusion or fast motion, a large amount of background information is introduced during sampling and errors accumulate as the model is updated, so the tracking frame drifts and the target is lost. To address these shortcomings of the prior art, the invention fuses the YOLOv4 and KCF algorithms so that each compensates for the other's weaknesses while contributing its own strengths, improving the success rate of the tracking algorithm and showing stronger robustness in complex scenes.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a target tracking method and a target tracking system based on artificial intelligence, which solve the problems of low tracking success rate, low efficiency and the like in the prior art.
The invention solves the problems by adopting the following technical scheme:
an artificial intelligence-based target tracking method performs fusion target tracking based on a YOLOv4 target detection algorithm and a KCF rapid tracking algorithm: target detection is performed by using the YOLOv4 algorithm; the KCF algorithm is adopted to acquire the predicted target position, and simultaneously the current frame is used as the input image of the YOLOv4 target detection model for target retrieval, so that accurate target detection position and scale information are obtained and target tracking is realized.
As a preferred technical scheme, the method comprises the following steps:
s1, generating a target detection model: collecting pictures online and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model with the training data set to generate a YOLOv4 target detection model;
s2, KCF target tracker training: detecting a target by using a trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
s3, target tracking: tracking the target by adopting a KCF target tracker to obtain a predicted position; meanwhile, taking the current frame as an input image of the YOLOv4 target detection model, and carrying out target retrieval to obtain target detection position and scale information;
s4, confirming the target position: calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold, the target detection position is used as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, the predicted position is used as the target position of the current frame, and the target position and scale information of the current frame are used to update the KCF target tracker;
s5, video target tracking: and repeating the steps S3 to S4 for the next frame of image to realize the tracking of the target.
As a preferred technical scheme, in step S1, the YOLOv4 deep learning model includes a backbone CSP network structure, a spatial pyramid pooling layer, a path aggregation network, and a YOLO head; the backbone CSP network structure is used for extracting features from the original image and outputting feature maps at 3 scales; the spatial pyramid pooling layer and the path aggregation network are used for performing feature fusion on the feature maps of the 3 scales; the YOLO head is used for performing prediction on the fused feature maps.
As a preferred technical solution, step S3 includes the following steps:
s31, establishing a KCF tracker model: establishing an objective function and an objective of a KCF tracker model;
s32, online matching: obtaining a frequency domain representation of the response value by using the sampling sample and the training sample;
s33, updating a KCF tracker template: the KCF tracker model parameters are updated.
As a preferred technical solution, in step S31, the objective function is established by ridge regression, the goal being to minimize the distance between the sampled data and the true target position of the next frame, with the expression:

$$\min_{w}\sum_{i=1}^{n}\left(w^{H}x_{i}-y_{i}\right)^{2}+\lambda\left\|w\right\|^{2}$$

where $x$ denotes the sample variable, $n$ the number of sample data, $x_{i}$ the feature of sample $i$, $(\cdot)^{H}$ the conjugate transpose, $f(w)$ the objective function, $\lambda$ the regularization parameter, $w$ the column vector of weight coefficients, $\lambda\|w\|^{2}$ the regularization term, and $y_{i}$ the label value of sample $x_{i}$;
differentiating with respect to $w$ and setting the derivative to zero, the minimum of the loss function written in complex-domain form is:

$$w=\left(X^{H}X+\lambda I\right)^{-1}X^{H}y$$

where $X$ denotes the data matrix whose rows are the samples $x_{i}$, $y$ the column vector of label values, and $X^{H}$ the conjugate transpose of $X$; using the diagonalization property of the circulant matrix, the frequency-domain representation of $w$ is:

$$\hat{w}=\frac{\hat{x}^{*}\odot\hat{y}}{\hat{x}^{*}\odot\hat{x}+\lambda}$$

where $\hat{x}$ denotes the frequency-domain representation of $x$, $\hat{y}$ the frequency-domain representation of $y$, and $\odot$ element-wise multiplication.
As a preferred embodiment, in step S31, a Gaussian kernel function $\kappa$ is introduced to transform the solution for $w$ into a solution for the high-dimensional weights $\alpha$:

$$\hat{\alpha}=\frac{\hat{y}}{\hat{k}^{xx}+\lambda}$$

where $\hat{\alpha}$ denotes the frequency-domain representation of $\alpha$ and $\hat{k}^{xx}$ is the Fourier transform of the first row of the kernel matrix $K$.
As a preferred embodiment, in step S32, the frequency-domain representation of the response value is:

$$\hat{f}(z)=\hat{k}^{xz}\odot\hat{\alpha}$$

where $\hat{k}^{xz}$ denotes the first row of the kernel matrix $K^{xz}$, and $K^{xz}$ is the kernel matrix representing the similarity between the sampled sample and the training sample; $\hat{f}(z)$ is converted from the frequency domain to the time domain $f(z)$ by the inverse Fourier transform, and the position corresponding to the maximum of $f(z)$ is the target position.
As a preferred technical solution, in step S33, the model parameters of past frames are sampled and merged, and are incorporated into the parameter update by the bilinear interpolation method, with the update formulas:

$$\hat{\alpha}_{new}=(1-\eta)\,\hat{\alpha}_{old}+\eta\,\hat{\alpha},\qquad x_{new}=(1-\eta)\,x_{old}+\eta\,x$$

where $x_{new}$ denotes the new training sample set, $\eta$ the update step size, $x_{old}$ the old training sample set, $\alpha$ the filter parameters, $x$ the training sample set of the current frame, $\hat{\alpha}_{old}$ the old filter parameters, and $\hat{\alpha}$ the frequency-domain representation of $\alpha$.
As a preferable technical solution, in step S4, the intersection ratio of the detected position and the predicted position of the target is calculated as:

$$IoU=\frac{S_{A\cap B}}{S_{A}+S_{B}-S_{A\cap B}},\qquad S_{A}=w_{A}h_{A},\quad S_{B}=w_{B}h_{B}$$

where $IoU$ denotes the intersection ratio of the detected position and the predicted position of the target, $S_{A}$ the area of the target frame and $S_{B}$ the area of the detection frame; $(x_{A1},y_{A1})$ and $(x_{A2},y_{A2})$ denote the diagonal vertices of the target frame lying on the rectangle where the target frame intersects the detection frame, $(x_{B1},y_{B1})$ and $(x_{B2},y_{B2})$ the corresponding diagonal vertices of the detection frame, and $(x_{I1},y_{I1})$ and $(x_{I2},y_{I2})$ the diagonal vertices of the intersection rectangle; $w_{A}$ and $h_{A}$ denote the width and length of the target frame, and $w_{B}$ and $h_{B}$ the width and length of the detection frame.
An artificial intelligence-based target tracking system for realizing the artificial intelligence-based target tracking method comprises the following modules connected in sequence:
a target detection model generation module: used for collecting pictures offline and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model with the training data set to generate a YOLOv4 target detection model;
a KCF target tracker training module: used for detecting the target with the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
a target tracking module: used for tracking the target with the KCF target tracker to obtain a predicted position, and meanwhile taking the current frame as the input image of the YOLOv4 target detection model for target retrieval to obtain the target detection position and scale information;
a target position confirmation module: used for calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold, the target detection position is taken as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, the predicted position is taken as the target position of the current frame, and the target position and scale information of the current frame are used to update the KCF target tracker;
a video target tracking module: used for repeating the target tracking and the target position confirmation for the next frame of image, thereby realizing tracking of the target.
Compared with the prior art, the invention has the following beneficial effects:
(1) The YOLOv4 adopted by the invention is an end-to-end real-time target detection algorithm based on deep learning; tested on the MS COCO data set with a Tesla V100 graphics card, it reaches 43.5% AP (65.7% AP50) at a speed of 65 FPS, and compared with EfficientDet and SpineNet its detection precision (AP) and speed (FPS) are improved by 10% and 12% respectively, a remarkable improvement. YOLOv4 extracts target features through a deep convolutional network, so weak and small targets in photoelectric images can be effectively detected. In addition, YOLOv4 performs multi-scale target detection, which overcomes the influence of target scale change during detection and improves the accuracy and robustness of target detection;
(2) The KCF algorithm adopted by the invention also learns from a large amount of background information, so the classifier distinguishes the background from the target with high accuracy, is robust in complex environments, and performs excellently on current public data sets. Meanwhile, the KCF algorithm adopts an online training strategy, so a large number of target samples do not need to be prepared in advance to train the model.
Drawings
FIG. 1 is a schematic diagram of steps of an artificial intelligence based target tracking method according to the present invention;
FIG. 2 is a schematic diagram showing the intersection ratio of the detected position and the predicted position of the target.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
As shown in fig. 1 to 2, the invention is mainly applied to an anti-unmanned-aerial-vehicle system, in which a radar and a radio monitoring system are responsible for searching for and finding the target, and a photoelectric system controls the pan-tilt lens according to the target angle and distance data provided by the radar to complete the tasks of target detection, target locking and target tracking; the invention mainly provides the algorithm implementation for rapid target tracking and target locking.
In order to solve the problems in the prior art, the invention provides an artificial-intelligence-based unmanned aerial vehicle target tracking method, namely a fusion target tracking algorithm based on the YOLOv4 target detection algorithm and the KCF rapid tracking algorithm, which can effectively improve the success rate and efficiency of the tracking algorithm.
For target detection, the invention adopts YOLOv4 to assist target searching, locking and automatic tracking. YOLOv4 is an end-to-end deep-learning regression detection algorithm and is so far the target detection algorithm with the best balance between speed and precision. By integrating many advanced methods, it remedies the shortcomings of the earlier YOLO series (such as weak detection of small objects) and achieves impressive accuracy at outstanding speed. Compared with the mainstream detectors EfficientDet and SpineNet, YOLOv4, an end-to-end real-time target detection algorithm based on deep learning, reaches 43.5% AP (65.7% AP50) at 65 FPS when tested on the MS COCO data set with a Tesla V100 graphics card, improving detection precision (AP) and speed (FPS) by 10% and 12% respectively, a remarkable improvement. YOLOv4 extracts target features through a deep convolutional network, so weak and small targets in photoelectric images can be effectively detected; in addition, YOLOv4 performs multi-scale target detection, which overcomes the influence of target scale change during detection and improves the accuracy and robustness of target detection.
The object tracking of the present invention predicts the size and position of an object in a subsequent frame given the object size and position of an initial frame of a video sequence. Detecting a target by using a trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker. The KCF algorithm adopts an on-line training strategy, and a large number of target samples do not need to be prepared in advance for training the model. In the tracking process, a target tracker is trained based on the current frame of the video, the target position of the next frame is determined by using the tracker, and then the tracker is updated with the new target position, so that continuous tracking of the target is realized through iteration. The fusion target tracking algorithm based on the YOLOv4 and KCF algorithms comprises the following steps, see fig. 1 for details:
Step S1: pictures are collected online and targets are manually labeled to form a training data set, and the YOLOv4 deep learning model is trained with the training data set to obtain the target detection model. The YOLOv4 deep learning model mainly consists of three parts, namely CSPDarknet53 (CSP network structure), SPP (spatial pyramid pooling layer) + PANet (path aggregation network), and the YOLO Head: CSPDarknet53 serves as the backbone network of the YOLOv4 algorithm and is responsible for extracting features from the original image and outputting feature maps at 3 scales; SPP + PANet is responsible for fusing the feature maps of the 3 scales extracted by the backbone network; and the YOLO Head performs prediction on the fused feature maps. This end-to-end, deep-learning-based real-time detection model provides both high target detection precision and high speed.
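Once the YOLOv4 model of step S1 has been trained, it can be loaded and queried for detections, for example with OpenCV's DNN module. The following is only an illustrative sketch: the file names "yolov4.cfg"/"yolov4.weights", the 608x608 input size and the thresholds are assumptions for the example, not values fixed by this description.

```python
# Illustrative sketch: load a pretrained YOLOv4 (darknet) model with OpenCV's
# DNN module and run one detection pass on a frame. File names, input size
# and thresholds are assumptions, not values specified by the invention.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4.cfg", "yolov4.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(608, 608), scale=1.0 / 255, swapRB=True)

def detect_targets(frame, conf_thr=0.5, nms_thr=0.4):
    """Return a list of (class_id, score, box); box is (x, y, w, h) in pixels."""
    class_ids, scores, boxes = model.detect(frame, confThreshold=conf_thr,
                                            nmsThreshold=nms_thr)
    return list(zip(class_ids, scores, boxes))
```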
step S2: detecting a specific target by using a trained target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
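A minimal sketch of the tracker initialisation in step S2, assuming the opencv-contrib KCF implementation is available; the detection box (x, y, w, h) returned by the YOLOv4 detector for the first frame is used to initialise the tracker.

```python
# Minimal sketch of step S2, assuming opencv-contrib-python is installed:
# the KCF tracker is initialised with the (x, y, w, h) box from the detector.
import cv2

def create_kcf_tracker(frame, box):
    # OpenCV >= 4.5 exposes the contrib KCF tracker under cv2.legacy;
    # older builds expose cv2.TrackerKCF_create directly.
    try:
        tracker = cv2.legacy.TrackerKCF_create()
    except AttributeError:
        tracker = cv2.TrackerKCF_create()
    tracker.init(frame, tuple(int(v) for v in box))
    return tracker
```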
Step S3: a specific target is tracked with the KCF algorithm to obtain the predicted position; meanwhile, the current frame is used as the input image of the YOLOv4 target detection model for target retrieval, and accurate target detection position and scale information are obtained. The KCF algorithm consists of 3 links: model establishment, online matching and template updating.
The first link is model establishment: the objective function is established by ridge regression, the goal being to minimize the distance between the sampled data and the true target position of the next frame:

$$\min_{w}\sum_{i=1}^{n}\left(w^{H}x_{i}-y_{i}\right)^{2}+\lambda\left\|w\right\|^{2}$$

where $x$ denotes the sample variable, $n$ the number of sample data, $x_{i}$ the feature of sample $i$, $(\cdot)^{H}$ the conjugate transpose, $f(w)$ the objective function, $\lambda$ the regularization parameter, $w$ the column vector of weight coefficients, $\lambda\|w\|^{2}$ the regularization term, and $y_{i}$ the label value of sample $x_{i}$.
The regularization term $\lambda\|w\|^{2}$ prevents the model from over-fitting. Differentiating with respect to $w$ and setting the derivative to zero, the minimum of the loss function is:

$$w=\left(X^{H}X+\lambda I\right)^{-1}X^{H}y$$

where $X$ denotes the data matrix whose rows are the samples $x_{i}$, $y$ the column vector of label values, and $X^{H}$ the conjugate transpose of $X$.
The diagonalization property of the circulant matrix is used to derive the representation of $w$ in the Fourier domain:

$$\hat{w}=\frac{\hat{x}^{*}\odot\hat{y}}{\hat{x}^{*}\odot\hat{x}+\lambda}$$

where $\hat{x}$ denotes the frequency-domain representation of $x$, $\hat{y}$ the frequency-domain representation of $y$, and $\odot$ element-wise multiplication.
In most cases the solution of $w$ is a nonlinear problem; by introducing a Gaussian kernel function $\kappa$, the solution for $w$ is converted into a solution for the high-dimensional weights $\alpha$ in the high-dimensional space:

$$\hat{\alpha}=\frac{\hat{y}}{\hat{k}^{xx}+\lambda}$$

where $\hat{\alpha}$ is the frequency-domain representation of $\alpha$ and $\hat{k}^{xx}$ is the Fourier transform of the first row of the kernel matrix $K$.
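The following numpy sketch illustrates this first link for a single-channel feature patch x and a Gaussian-shaped label map y; the kernel width sigma and the regularization lam are illustrative hyper-parameters, not values prescribed by the invention.

```python
# Numpy sketch of the model-establishment link: Gaussian kernel correlation
# computed in the Fourier domain, then alpha_hat = y_hat / (k_hat + lambda).
import numpy as np

def gaussian_correlation(x1, x2, sigma=0.5):
    """Kernel correlation of x1 with all cyclic shifts of x2."""
    c = np.fft.ifft2(np.conj(np.fft.fft2(x1)) * np.fft.fft2(x2)).real
    d = (np.sum(x1 ** 2) + np.sum(x2 ** 2) - 2.0 * c) / x1.size
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2))

def train_kcf(x, y, lam=1e-4):
    """Dual weights in the frequency domain: alpha_hat = y_hat / (k^xx_hat + lam)."""
    k = gaussian_correlation(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)
```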
The second link is online matching. Let $K^{xz}$ be the kernel matrix representing the similarity between the sampled sample and the training sample in kernel space; correlating the sampled sample with the training sample gives the frequency-domain representation of the response value:

$$\hat{f}(z)=\hat{k}^{xz}\odot\hat{\alpha}$$

where $\hat{k}^{xz}$ is the first row of the kernel matrix $K^{xz}$; $\hat{f}(z)$ is converted from the frequency domain to the time domain $f(z)$ by the inverse Fourier transform, and the position corresponding to the maximum of $f(z)$ is the sought target position.
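A companion sketch of the online-matching link, using the gaussian_correlation helper sketched above: the response map is the inverse Fourier transform of k̂^{xz} ⊙ α̂ and its peak gives the predicted target position.

```python
# Sketch of the online-matching link: response = IFFT(k^xz_hat * alpha_hat);
# the location of the response maximum is the predicted target position.
# x is the learned template patch, z the patch sampled from the current frame.
import numpy as np

def kcf_detect(alpha_hat, x, z, sigma=0.5):
    k = gaussian_correlation(z, x, sigma)                # similarity of sample and template
    response = np.fft.ifft2(np.fft.fft2(k) * alpha_hat).real
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return response, (dx, dy)                            # peak location = predicted shift
```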
The third link is template updating. The KCF tracker template update mainly updates the filter parameters $\alpha$ and the training sample set $x$: after the algorithm is executed, a new predicted target position and hence a new base sample are obtained, a circulant matrix is generated to obtain a new sample set $x$, and new parameters $\hat{\alpha}$ are obtained by training; finally, an update step size $\eta$ is set and the tracker is updated from the model parameters of the previous frame by linear interpolation, i.e. the model parameters of past frames are sampled, merged and incorporated into the parameter update by the bilinear interpolation method:

$$\hat{\alpha}_{new}=(1-\eta)\,\hat{\alpha}_{old}+\eta\,\hat{\alpha},\qquad x_{new}=(1-\eta)\,x_{old}+\eta\,x$$

where $x_{new}$ denotes the new training sample set, $\eta$ the update step size, $x_{old}$ the old training sample set, $\alpha$ the filter parameters, $x$ the training sample set of the current frame, $\hat{\alpha}_{old}$ the old filter parameters, and $\hat{\alpha}$ the frequency-domain representation of $\alpha$.
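A one-function sketch of the template-update link under the same assumptions: both the dual weights and the base sample are blended with the previous frame's values using the update step size eta (0.02 here is an illustrative choice, not a value fixed by the invention).

```python
# Sketch of the template-update link: linear interpolation of the filter
# parameters and of the base sample with update step size eta (illustrative).
def update_kcf(alpha_hat_old, alpha_hat_new, x_old, x_new, eta=0.02):
    alpha_hat = (1.0 - eta) * alpha_hat_old + eta * alpha_hat_new
    x = (1.0 - eta) * x_old + eta * x_new
    return alpha_hat, x
```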
Step S4: calculating the intersection ratio of the detection position and the prediction position of the target in the current frame imageThe intersection ratio refers to the ratio of the intersection of the target frame and the detection frame to the area of the union; defining the diagonal coordinates of rectangle A and rectangle B as、/>At the same time, the diagonal coordinates of the intersection rectangle are defined as +.>Then the method of calculating the diagonal coordinates of the intersection rectangle is as follows:
the intersection and union are then calculated as follows:
the schematic diagram is shown in fig. 2.
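The intersection-ratio computation of step S4 can be sketched as follows for two boxes given by their diagonal corner coordinates (x1, y1, x2, y2); this is a plain restatement of the formulas above, not an additional method step.

```python
# Sketch of the step-S4 intersection ratio (IoU) for two rectangles given by
# their diagonal corners (x1, y1, x2, y2), following the formulas above.
def iou(box_a, box_b):
    xa1, ya1, xa2, ya2 = box_a
    xb1, yb1, xb2, yb2 = box_b
    # diagonal corners of the intersection rectangle
    xi1, yi1 = max(xa1, xb1), max(ya1, yb1)
    xi2, yi2 = min(xa2, xb2), min(ya2, yb2)
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    union = (xa2 - xa1) * (ya2 - ya1) + (xb2 - xb1) * (yb2 - yb1) - inter
    return inter / union if union > 0 else 0.0
```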
The above calculation gives the intersection ratio of the detected position and the predicted position of the target. If the intersection ratio is smaller than the preset threshold, the target detection position is used as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, the predicted position is used as the target position of the current frame; the KCF target tracker is then updated with the target position and scale information of the current frame.
step S5: and repeating the steps S3 to S4 for the next frame of image to realize the tracking of the video target.
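Tying the pieces together, the following sketch shows how steps S3 to S5 could be orchestrated per frame using the helper functions sketched above (detect_targets, create_kcf_tracker, iou); the 0.5 threshold, the "first detection" policy and re-initialising the tracker as the update mechanism are illustrative assumptions rather than requirements of the invention.

```python
# Illustrative per-frame loop for steps S3-S5, built on the sketches above.
import cv2

def xywh_to_xyxy(b):
    x, y, w, h = b
    return (x, y, x + w, y + h)

def track_video(path, iou_threshold=0.5):
    cap = cv2.VideoCapture(path)
    ok, frame = cap.read()
    detections = detect_targets(frame) if ok else []
    if not detections:
        return
    box = detections[0][2]                               # S2: initialise from a detection
    tracker = create_kcf_tracker(frame, box)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, pred = tracker.update(frame)              # S3: KCF predicted position
        dets = detect_targets(frame)                     # S3: YOLOv4 re-detection
        det = dets[0][2] if dets else None
        # S4: position confirmation by intersection ratio
        if det is not None and found and \
                iou(xywh_to_xyxy(pred), xywh_to_xyxy(det)) < iou_threshold:
            target = det                                 # low overlap: trust the detector
        else:
            target = pred                                # otherwise keep the KCF prediction
        tracker = create_kcf_tracker(frame, target)      # update the tracker template
        yield target                                     # S5: continue with the next frame
```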
By adopting the fusion target tracking algorithm based on the YOLOv4 target detection algorithm and the KCF rapid tracking algorithm, the invention obtains optimal results in detection precision and tracking stability and has strong robustness and real-time performance, so the target can be tracked and locked quickly, efficiently and reliably, effectively improving the overall performance of the anti-unmanned-aerial-vehicle system. The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.
As described above, the present invention can be preferably implemented.
All of the features disclosed in all of the embodiments of this specification, or all of the steps in any method or process disclosed implicitly, except for the mutually exclusive features and/or steps, may be combined and/or expanded and substituted in any way.
The foregoing description of the preferred embodiment of the invention is not intended to limit the invention in any way, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the invention.
Claims (10)
1. The target tracking method based on artificial intelligence is characterized in that fusion target tracking is performed based on a YOLOv4 target detection algorithm and a KCF rapid tracking algorithm: target detection is performed by using the YOLOv4 algorithm; the KCF algorithm is adopted to acquire the predicted target position, and simultaneously the current frame is used as the input image of the YOLOv4 target detection model for target retrieval, so that accurate target detection position and scale information are obtained and target tracking is realized.
2. The artificial intelligence based object tracking method of claim 1, comprising the steps of:
s1, generating a target detection model: collecting pictures online and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model with the training data set to generate a YOLOv4 target detection model;
s2, KCF target tracker training: detecting a target by using a trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
s3, target tracking: tracking the target by adopting a KCF target tracker to obtain a predicted position; meanwhile, taking the current frame as an input image of the YOLOv4 target detection model, and carrying out target retrieval to obtain target detection position and scale information;
s4, confirming the target position: calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image, and if the intersection ratio is smaller than a preset threshold, using the target detection position as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, using the predicted position as the target position of the current frame, and using the target position and scale information of the current frame to update the KCF target tracker;
s5, video target tracking: and repeating the steps S3 to S4 for the next frame of image to realize the tracking of the target.
3. The method of claim 2, wherein in step S1 the YOLOv4 deep learning model includes a backbone CSP network structure, a spatial pyramid pooling layer and path aggregation network, and a YOLO head; the backbone CSP network structure is used for extracting features from the original image and outputting feature maps at 3 scales; the spatial pyramid pooling layer and the path aggregation network are used for performing feature fusion on the feature maps of the 3 scales; the YOLO head is used for performing prediction on the fused feature maps.
4. An artificial intelligence based object tracking method according to claim 3, characterised in that step S3 comprises the steps of:
s31, establishing a KCF tracker model: establishing an objective function and an objective of a KCF tracker model;
s32, online matching: obtaining a frequency domain representation of the response value by using the sampling sample and the training sample;
s33, updating a KCF tracker template: the KCF tracker model parameters are updated.
5. The artificial intelligence based object tracking method according to claim 4, wherein in step S31 the objective function is established by ridge regression, the goal being to minimize the distance between the sampled data and the true target position of the next frame, with the expression:

$$\min_{w}\sum_{i=1}^{n}\left(w^{H}x_{i}-y_{i}\right)^{2}+\lambda\left\|w\right\|^{2}$$

where $x$ denotes the sample variable, $n$ the number of sample data, $x_{i}$ the feature of sample $i$, $(\cdot)^{H}$ the conjugate transpose, $f(w)$ the objective function, $\lambda$ the regularization parameter, $w$ the column vector of weight coefficients, $\lambda\|w\|^{2}$ the regularization term, and $y_{i}$ the label value of sample $x_{i}$;
differentiating with respect to $w$ and setting the derivative to zero, the minimum of the loss function written in complex-domain form is:

$$w=\left(X^{H}X+\lambda I\right)^{-1}X^{H}y$$

where $X$ denotes the data matrix whose rows are the samples $x_{i}$, $y$ the column vector of label values, and $X^{H}$ the conjugate transpose of $X$.
6. The artificial intelligence based object tracking method according to claim 5, wherein in step S31 a Gaussian kernel function $\kappa$ is introduced to transform the solution for $w$ into a solution for the high-dimensional weights $\alpha$:

$$\hat{\alpha}=\frac{\hat{y}}{\hat{k}^{xx}+\lambda}$$

where $\hat{\alpha}$ denotes the frequency-domain representation of $\alpha$ and $\hat{k}^{xx}$ is the Fourier transform of the first row of the kernel matrix $K$.
7. The artificial intelligence based object tracking method according to claim 6, wherein in step S32 the frequency-domain representation of the response value is:

$$\hat{f}(z)=\hat{k}^{xz}\odot\hat{\alpha}$$

where $\hat{k}^{xz}$ denotes the first row of the kernel matrix $K^{xz}$, and $K^{xz}$ is the kernel matrix representing the similarity between the sampled sample and the training sample; $\hat{f}(z)$ is converted from the frequency domain to the time domain $f(z)$ by the inverse Fourier transform, and the position corresponding to the maximum of $f(z)$ is the target position.
8. The method according to claim 7, wherein in step S33 the model parameters of past frames are sampled and merged, and are incorporated into the parameter update by the bilinear interpolation method, with the update formulas:

$$\hat{\alpha}_{new}=(1-\eta)\,\hat{\alpha}_{old}+\eta\,\hat{\alpha},\qquad x_{new}=(1-\eta)\,x_{old}+\eta\,x$$

where $x_{new}$ denotes the new training sample set, $\eta$ the update step size, $x_{old}$ the old training sample set, $\alpha$ the filter parameters, $x$ the training sample set of the current frame, $\hat{\alpha}_{old}$ the old filter parameters, and $\hat{\alpha}$ the frequency-domain representation of $\alpha$.
9. The method according to any one of claims 2 to 8, wherein in step S4 the intersection ratio of the detected position and the predicted position of the target is calculated as:

$$IoU=\frac{S_{A\cap B}}{S_{A}+S_{B}-S_{A\cap B}},\qquad S_{A}=w_{A}h_{A},\quad S_{B}=w_{B}h_{B}$$

where $IoU$ denotes the intersection ratio of the detected position and the predicted position of the target, $S_{A}$ the area of the target frame and $S_{B}$ the area of the detection frame; $(x_{A1},y_{A1})$ and $(x_{A2},y_{A2})$ denote the diagonal vertices of the target frame lying on the rectangle where the target frame intersects the detection frame, $(x_{B1},y_{B1})$ and $(x_{B2},y_{B2})$ the corresponding diagonal vertices of the detection frame, and $(x_{I1},y_{I1})$ and $(x_{I2},y_{I2})$ the diagonal vertices of the intersection rectangle; $w_{A}$ and $h_{A}$ denote the width and length of the target frame, and $w_{B}$ and $h_{B}$ the width and length of the detection frame.
10. An artificial intelligence based object tracking system for implementing an artificial intelligence based object tracking method according to any one of claims 1 to 9, comprising the following modules connected in sequence:
a target detection model generation module: used for collecting pictures offline and manually labeling targets to form a training data set, and training a YOLOv4 deep learning model with the training data set to generate a YOLOv4 target detection model;
a KCF target tracker training module: used for detecting the target with the trained YOLOv4 target detection model, acquiring target position and scale information, initializing a KCF target tracker, and training the KCF target tracker;
a target tracking module: used for tracking the target with the KCF target tracker to obtain a predicted position, and meanwhile taking the current frame as the input image of the YOLOv4 target detection model for target retrieval to obtain the target detection position and scale information;
a target position confirmation module: used for calculating the intersection ratio of the detected position and the predicted position of the target in the current frame image; if the intersection ratio is smaller than a preset threshold, the target detection position is taken as the target position of the current frame; if the intersection ratio is larger than the preset threshold or no target is detected, the predicted position is taken as the target position of the current frame, and the target position and scale information of the current frame are used to update the KCF target tracker;
a video target tracking module: used for repeating the target tracking and the target position confirmation for the next frame of image, thereby realizing tracking of the target.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310653926.XA CN116385498A (en) | 2023-06-05 | 2023-06-05 | Target tracking method and system based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116385498A true CN116385498A (en) | 2023-07-04 |
Family
ID=86977282
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310653926.XA Pending CN116385498A (en) | 2023-06-05 | 2023-06-05 | Target tracking method and system based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385498A (en) |
- 2023-06-05: CN application CN202310653926.XA filed, published as CN116385498A, status Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111292355A (en) * | 2020-02-12 | 2020-06-16 | 江南大学 | Nuclear correlation filtering multi-target tracking method fusing motion information |
CN111582349A (en) * | 2020-04-30 | 2020-08-25 | 陕西师范大学 | Improved target tracking algorithm based on YOLOv3 and kernel correlation filtering |
Non-Patent Citations (2)
Title |
---|
Liu Wei: "Research on vision-based UAV recognition and tracking technology", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 1, pages 031-816 *
Xie Zikun et al.: "Analysis of a target tracking fusion algorithm based on Yolo V4-tiny and KCF", Electronic Technology, vol. 51, no. 10, pages 309-311 *
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20230704