CN110751096B - Multi-target tracking method based on KCF track confidence - Google Patents


Info

Publication number
CN110751096B
CN110751096B (application CN201911002819.0A)
Authority
CN
China
Prior art keywords
track
target
confidence
frame
detection
Prior art date
Legal status
Active
Application number
CN201911002819.0A
Other languages
Chinese (zh)
Other versions
CN110751096A (en)
Inventor
杨红红
吴晓军
张玉梅
李菁菁
裴炤
Current Assignee
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date
Filing date
Publication date
Application filed by Shaanxi Normal University
Priority to CN201911002819.0A
Publication of CN110751096A
Application granted
Publication of CN110751096B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/40 — Scenes; Scene-specific elements in video content
    • G06V20/41 — Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/20 — Analysis of motion
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/10016 — Video; Image sequence
    • G06T2207/20056 — Discrete and fast Fourier transform [DFT, FFT]


Abstract

The multi-target tracking method based on KCF track confidence comprises the following steps. S100: establishing a KCF filter, and using it to calculate the appearance similarity, shape similarity and motion similarity between the detection responses in the current frame and the target tracking tracks, which serve as the association model for data association; S200: correcting the detection responses with the filter; S300: calculating the track confidence of each tracked target with a track-confidence calculation method based on APCE occlusion analysis, and dividing the target tracking tracks into high-confidence tracks and low-confidence tracks accordingly; S400: establishing a candidate target hypothesis set representing the targets lost in the previous frame and the target tracks not matched with any detection response; S500: performing data association between adjacent frames among the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set, according to the association model.

Description

Multi-target tracking method based on KCF track confidence
Technical Field
The disclosure belongs to the field of video information processing and analysis and computer vision, and particularly relates to a multi-target tracking method based on KCF track confidence.
Background
Multi-object tracking (MOT) is a research focus in computer vision and is widely applied in video surveillance, traffic safety, driver-assistance systems, robot navigation and positioning, and other industries.
The goal of multi-target tracking is to identify the relevant objects in a video surveillance scene and estimate their locations throughout the video sequence. Many target tracking methods already exist for video surveillance scenes, but because of occlusion, missed detections, false detections, camera shake and the like, multi-target tracking in complex scenes remains a difficult problem.
Currently, mainstream multi-target tracking methods mainly follow the tracking-by-detection (TBD) framework, which takes the detection responses provided by a target detector as input and links them by association across frames of the video sequence to obtain the final tracks. Under the TBD paradigm, the whole MOT process can thus be divided into two modules: a detection module and a tracking module. In TBD-based multi-target tracking, the detection responses are provided in advance by the target detector and suffer from false detections and missed detections. In addition, the data association model in the tracking module suffers from inaccurate modeling and is prone to tracking errors. TBD-based multi-target tracking algorithms therefore generally share two problems: the initial detection results strongly influence tracking performance, and the data association method introduces tracking errors.
However, most existing TBD-based MOT methods focus primarily on the tracking module, including data association and model optimization, such as multiple hypothesis tracking (MHT), joint probabilistic data association (JPDA) based multi-target tracking, network-flow-based multi-target tracking, and learning-based multi-target tracking methods. These algorithms usually use the appearance, shape and position information of the targets to improve multi-target tracking performance and enhance the robustness of the tracking module. Few of them, however, consider and compensate for the missed detections and false detections caused by the detector, which remains one of the major drawbacks of TBD-based MOT methods.
In recent years, single object tracking (SOT) methods have achieved remarkable results in appearance model learning, so applying SOT to MOT helps improve MOT tracking performance, and many tracking methods follow this idea by directly introducing SOT into MOT. However, the samples SOT uses to learn the target appearance are collected online from the tracking results and therefore contain many noisy samples. Moreover, in video surveillance scenes, mutual occlusion among multiple targets is severe. Further, since SOT requires every newly appearing target to be added to the MOT tracking system in real time, directly applying SOT to MOT causes the computational cost to increase exponentially as the number of tracked targets grows.
Disclosure of Invention
In view of this, the present disclosure provides a multi-target tracking method based on KCF trajectory confidence, including the following steps:
s100: establishing a KCF-based filter, and using it to calculate the appearance similarity, shape similarity and motion similarity between the N detection responses in the current frame and the M target tracking tracks, which serve as the association model for data association, where N and M are integers greater than 1;
s200: correcting the N detection responses with the filter;
s300: calculating the track confidence of each tracked target with a track-confidence calculation method based on APCE occlusion analysis, and dividing the M target tracking tracks into high-confidence tracks and low-confidence tracks according to the track confidence;
s400: establishing a candidate target hypothesis set representing the targets lost in the previous frame and the target tracks not matched with any of the N detection responses;
s500: performing data association between adjacent frames among the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set, according to the association model.
With this technical scheme, the method significantly reduces the computational complexity of using SOT within MOT; the correlation filter measures the similarity between detection responses and tracked targets and further corrects the detection errors caused by the detector; and the APCE-occlusion-based track-confidence calculation, together with the candidate target hypothesis set (CTH), further improves the performance of multi-target data association. The method effectively improves the robustness and precision of target tracking.
Drawings
FIG. 1 is a flow chart of a multi-target tracking method based on KCF trajectory confidence provided in an embodiment of the present disclosure;
fig. 2 is a graph illustrating the tracking performance of different components on MOT 2015 validation set in an embodiment of the disclosure.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
In one embodiment, referring to fig. 1, a multi-target tracking method based on KCF track confidence is disclosed, comprising the following steps:
s100: establishing a KCF-based filter, and using it to calculate the appearance similarity, shape similarity and motion similarity between the N detection responses in the current frame and the M target tracking tracks, which serve as the association model for data association, where N and M are integers greater than 1;
s200: correcting the N detection responses with the filter;
s300: calculating the track confidence of each tracked target with a track-confidence calculation method based on APCE occlusion analysis, and dividing the M target tracking tracks into high-confidence tracks and low-confidence tracks according to the track confidence;
s400: establishing a candidate target hypothesis set representing the targets lost in the previous frame and the target tracks not matched with any of the N detection responses;
s500: performing data association between adjacent frames among the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set, according to the association model.
For this embodiment, the method comprises: KCF-based association similarity calculation, KCF-based detection response correction, two-step data association based on APCE track confidence, and construction of a candidate target hypothesis set. The online MOT tracking task is first divided into two parts: establishment of the association model, and frame-by-frame data association based on that model. In constructing the association model, the method introduces SOT based on the kernelized correlation filter (KCF) algorithm into the MOT system, to capture context information during online MOT tracking and to handle the false detections and missed detections caused by the detector in a timely manner. Meanwhile, to establish a robust association model, the method not only measures the similarity between detection responses and tracked targets with the kernelized correlation filter, but also refines the detection responses provided by the target detector using the KCF tracking results. In the frame-by-frame data association, to improve association performance, a track-confidence calculation method based on APCE (average peak-to-correlation energy) is introduced as an indicator of tracking reliability, and the target tracking tracks are divided into high-confidence and low-confidence tracks according to the track confidence. The method also establishes a candidate target hypothesis set (CTH) containing the targets lost in the previous frame and the unmatched target tracks, to further improve data association performance.
Finally, data association between adjacent frames is performed among the high-confidence tracks, the low-confidence tracks, the detection responses provided by the detector and the candidate samples in the CTH, according to the association model.
In another embodiment, the S100 further includes:
s101: the established KCF-based filter is a KCF filter trained using only the target samples from frame t-1, where t denotes the current frame;
s102: calculating the appearance similarity S_app between the target x_l tracked in frame t-1 and the detection response z_l in frame t, specifically:

S_app = max( F⁻¹( k̂^{f_l z_l} ⊙ α̂_l ) )

where α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ ), ŷ_l is the discrete Fourier transform of the output vector y_l, y_l is the desired output for the training sample f_l, and f_l denotes the HOG and CN features of the candidate sample;
s103: calculating the shape similarity S_shape between the target x_l tracked in frame t-1 and the detection response z_l in frame t, specifically:

S_shape = IoU(x_l, z_l);

s104: calculating the motion similarity S_motion between the target x_l tracked in frame t-1 and the detection response z_l in frame t, specifically:

S_motion = G(T_pos − Z_pos, Σ)    (12)

where G(·) is a Gaussian function with zero mean, and T_pos and Z_pos are the positions of x_l and z_l, respectively.
For this embodiment, the method trains the KCF filter using only the training samples of the previous frame, which reduces the impact on the filter's performance of error accumulation from incorrect samples during SOT tracking and, at the same time, significantly lowers the computational cost of using SOT within MOT.
Constructing a robust association model is an important factor in multi-target tracking performance; the method therefore provides a KCF-based association model construction approach.
In the TBD-based online multi-target tracking framework, one key step is to associate the N detection responses in the current frame with the M tracks. Suppose that in frame t the N detection responses are denoted Z_t = {z_t^1, …, z_t^N} and the M tracks are denoted T = {T_1, …, T_M}, where T_j = {d_k^j | t_s ≤ k ≤ t_e}, d_k^j denotes the detection response associated with track T_j in frame k, and t_s and t_e are the start and end frames of track T_j. The j-th track T_j is thus formed by the detection responses associated with it in successive frames.
Track similarity is usually computed by measuring, for certain features such as appearance, position and shape, the similarity between a detection response and a track; the final association model is then obtained as the product of the similarities of the different features.
The idea behind the KCF-based filter is to compute the correlation-filter response at the position of the tracked target in the current frame. Correlation measures the similarity of two signals: the more similar they are, the higher the correlation. In tracking applications, the objective of KCF training is therefore to design a filter template whose response is maximized when applied to the tracked object.
Assume the target position in the current frame is x, and the training samples obtained from it by cyclic shifts are x_i(w, h), (w, h) ∈ {0, …, W-1} × {0, …, H-1}. For each sample x_i, the corresponding label y_i(w, h) ∈ [0, 1] is computed from a Gaussian function: when x_i is at the target center the response is maximal, y_i = 1; as x_i moves away from the target center, the response decreases. The objective of KCF training is thus to find a function f(z) = ω^T z that minimizes the error function

min_ω Σ_i ( f(x_i) − y_i )² + λ‖ω‖²    (1)

where λ is a regularization coefficient and ω is the model parameter solved in equation (1).
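As a concrete illustration of the objective in equation (1), regularized least squares has the familiar closed-form minimizer ω = (XᵀX + λI)⁻¹Xᵀy. The sketch below is an illustration only, not part of the patented method; the sample sizes and λ are arbitrary choices, and it recovers a known weight vector from noise-free data:

```python
import numpy as np

def ridge_solution(X, y, lam=0.1):
    """Closed-form minimizer of eq. (1): w = (X^T X + lam*I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Recover a known weight vector from noise-free samples.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
w_true = np.array([1.0, -2.0, 0.5])
w = ridge_solution(X, X @ w_true, lam=1e-6)
```

With a small λ and noise-free data, the recovered w is essentially w_true; larger λ trades fidelity for stability against noisy labels.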
To handle data that are not linearly separable in the low-dimensional space, the method maps them into a high-dimensional Hilbert space with a nonlinear mapping function φ(·). Equation (1) can then be written as

min_ω Σ_i ( ω^T φ(x_i) − y_i )² + λ‖ω‖²    (2)

where κ(x, x') = φ(x)^T φ(x') is the kernel function induced by the nonlinear mapping φ(·).
According to dual-space theory, ω can be expressed as a weighted sum of the nonlinear mappings of the input samples:

ω = Σ_i α_i φ(x_i)    (3)

so the unknown changes from the vector ω to the dual vector α = (α_1, …, α_n). Substituting equation (3) into equation (2), and using the fact that the kernel matrix of cyclically shifted samples is diagonalized by the discrete Fourier transform, the solution for α in the Fourier domain is

α̂ = ŷ / ( k̂^{xx} + λ )    (4)

where ^ denotes the Fourier transform, y = {y_i(w, h) | (w, h) ∈ {0, …, W-1} × {0, …, H-1}} is the sample label, and k^{xx}, with k(x, x') the kernel function, is the kernel auto-correlation of the vector x; the method adopts a Gaussian kernel function.
In the target tracking stage, for a candidate image patch z in frame t+1, the correlation between z and the previously trained model parameters (the dual vector α) is computed, yielding the Fourier transform of the target response:

f̂(z) = k̂^{xz} ⊙ α̂    (5)

where ⊙ denotes the element-wise product between vectors and k̂^{xz} is the kernel correlation between z and the target appearance x. Finally, the target position in the current frame is determined by the maximum of the correlation response map:

R = max( F⁻¹( f̂(z) ) )    (6)

where R is the maximum response value and F⁻¹ denotes the inverse Fourier transform.
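The training and detection steps above can be exercised end to end on a toy single-channel patch. The sketch below is a minimal, hedged illustration of the Fourier-domain training of equation (4) and the detection response of equation (5) with a Gaussian kernel; the patch size, σ and λ are arbitrary choices, and a real tracker would add cosine windowing and multi-channel HOG/CN features:

```python
import numpy as np

def gaussian_correlation(x1, x2, sigma=0.5):
    """Gaussian kernel correlation of two equally sized patches, via the FFT."""
    c = np.fft.ifft2(np.fft.fft2(x1) * np.conj(np.fft.fft2(x2))).real
    d = (x1 ** 2).sum() + (x2 ** 2).sum() - 2.0 * c
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x1.size))

def kcf_train(x, y, lam=1e-4):
    """Eq. (4): dual solution alpha_hat = y_hat / (k_hat^{xx} + lambda)."""
    return np.fft.fft2(y) / (np.fft.fft2(gaussian_correlation(x, x)) + lam)

def kcf_detect(alpha_hat, x, z):
    """Eq. (5): response map of candidate patch z; its maximum is R."""
    return np.fft.ifft2(np.fft.fft2(gaussian_correlation(z, x)) * alpha_hat).real

# Toy example: train on a patch, then evaluate on the same patch.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y = np.zeros((32, 32)); y[0, 0] = 1.0       # desired peak at zero shift
alpha_hat = kcf_train(x, y)
R = kcf_detect(alpha_hat, x, x)
peak = np.unravel_index(np.argmax(R), R.shape)
```

Because the candidate patch equals the training patch, the response map peaks at zero shift with a value close to the label peak of 1.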
Calculating the appearance similarity based on KCF: to avoid error accumulation and the influence of noise samples on model training during tracking, the KCF filter model is trained using only the target samples from frame t-1.
Accordingly, in multi-target tracking, for an arbitrary tracked target x_l in frame t-1, candidate samples of size W×H are drawn around it by the cyclic-shift principle, where W and H denote the width and height of the candidate sample x_l, and the corresponding HOG (histogram of oriented gradients) and CN (color names) features are denoted f_l. The KCF filter model applied in the MOT tracking method is then

min_{ω_l} Σ_i ( ω_l^T φ(f_l^i) − y_l^i )² + λ‖ω_l‖²    (7)

where y_l is the desired output for the training sample f_l; equation (7) is solved as in equations (2)-(5).
In the multi-target tracking stage, for any detection response z_l in frame t, the corresponding response map is computed with the KCF filter of equation (7):

R(z_l) = F⁻¹( k̂^{f_l z_l} ⊙ α̂_l ),  α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ )    (8)

where ŷ_l is the discrete Fourier transform of the output vector y_l. The maximum response value of equation (8) is then used as the appearance similarity S_app, during MOT tracking, between the target x_l tracked in frame t-1 and the detection response z_l in frame t:

S_app = max( R(z_l) )    (9)
Calculating the shape similarity based on KCF:
For any target x_l tracked in frame t-1, its predicted position in frame t is determined by the maximum response value in equation (8). The method therefore defines the position p of target x_l in frame t as

p = p_{t-1} + Δp    (10)

where p_{t-1} is the position of target x_l in frame t-1 and Δp is the displacement given by the location of the maximum response.
The method defines the shape similarity S_shape between the target x_l tracked in frame t-1 and the detection response z_l in frame t as the IoU (intersection-over-union) value between the target bounding box predicted by the KCF filter and the detection response:

S_shape = IoU(x_l, z_l)    (11)
and (3) calculating motion similarity: for arbitrary tracking target xlAnd the detection response zlThe motion similarity is defined as:
Smotion=G(Tpos-Zpos,∑)
(12)
in formula (12), G (-) is a Gaussian function with a mean value of 0, TposAnd ZposAre respectively the target xlAnd the detection response zlThe position of (a).
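The appearance, shape and motion similarities defined above multiply into a single association score, matching the product-of-similarities form of the association model. A minimal sketch, assuming axis-aligned (x, y, w, h) boxes and an isotropic covariance Σ = σ²I for the Gaussian motion term; S_app is taken as a given number here, since in the method it comes from the KCF response map:

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    yb = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def motion_similarity(t_pos, z_pos, sigma=10.0):
    """Zero-mean Gaussian of the position offset (isotropic covariance assumed)."""
    d = np.asarray(t_pos, float) - np.asarray(z_pos, float)
    return float(np.exp(-0.5 * np.dot(d, d) / sigma ** 2))

def association_similarity(s_app, track_box, det_box, t_pos, z_pos):
    """Product of the appearance, shape and motion terms."""
    return s_app * iou(track_box, det_box) * motion_similarity(t_pos, z_pos)
```

Because the three terms all lie in [0, 1], any one weak cue (e.g. zero box overlap) drives the combined score toward zero, which is the intended gating behavior.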
In another embodiment, the S200 further includes:
s201: predicting, with the KCF-based filter, the state in frame t of any target T_{t-1}^j tracked in frame t-1:

x̂_t^j = ( p̂_t^j, ŵ_t^j, ĥ_t^j )    (13)

where p̂_t^j is the position information of the target predicted with the KCF-based filter from the position p_{t-1}^j of the j-th track in frame t-1, ŵ_t^j and ĥ_t^j are the width and height of the predicted target bounding box, η is a predefined threshold, and CTH is the candidate target hypothesis set;
s202: when the maximum response value of a predicted target is larger than the predefined threshold η, the target is added to the predicted target set X̂_t; otherwise, it is added to the CTH candidate target hypothesis set;
s203: assuming that all predicted targets in frame t satisfying the above condition are X̂_t = {x̂_t^1, …, x̂_t^M}, where M denotes the number of predicted targets, the detection responses of frame t are obtained by IoU-based redundancy elimination between the detection states z_t^i provided by the target detector and the predicted target states x̂_t^j; the criterion is whether the IoU value of z_t^i and x̂_t^j is greater than a predefined threshold α;
s204: when the IoU value of z_t^i and x̂_t^j is larger than the predefined threshold α, the two represent the same target and only the detection response is kept; otherwise, the detection response and the prediction represent different targets and both are kept, finally yielding the KCF-corrected detection responses of frame t.
In this embodiment, during MOT tracking the KCF filter is used to capture context information in online MOT tracking, measure the similarity between the detection responses and the target tracks, and correct the detection responses provided by the detector, handling the detector's false detections and missed detections in a timely manner.
No detection-response correction is needed for the first frame image; by default, the detection responses in the first frame are taken as true. From the second frame on, the KCF filter is trained on the previous frame's tracking result, the correction is then performed, and the association model is established after the correction is completed.
Intersection over union (IoU) is a standard metric of how accurately a corresponding object is detected in a particular dataset; it is a simple criterion that can be used to evaluate any task whose output is a predicted range (bounding box).
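Steps s203-s204 amount to an IoU-based de-duplication between detector outputs and KCF predictions: an overlapping pair is treated as the same target (the detection is kept), while a non-overlapping prediction is kept as a recovered missed detection. A minimal sketch with (x, y, w, h) boxes; the threshold follows the α = 0.6 value mentioned above:

```python
def iou(a, b):
    """IoU of two (x, y, w, h) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def correct_detections(detections, predictions, alpha=0.6):
    """Keep every detector response; add a KCF prediction only when it does
    not overlap any detection by more than alpha (the same-target case)."""
    corrected = list(detections)
    for p in predictions:
        if all(iou(p, d) <= alpha for d in detections):
            corrected.append(p)  # prediction covers a likely missed detection
    return corrected

detections = [(0, 0, 10, 10)]
predictions = [(0, 0, 10, 10), (50, 50, 10, 10)]  # first overlaps a detection
corrected = correct_detections(detections, predictions)
```

Here the first prediction duplicates an existing detection and is dropped, while the second survives as a recovered target.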
In another embodiment, the predefined threshold η is 0.7 and the predefined threshold α is 0.6.
For this embodiment, the predefined threshold is a numerical value between 0 and 1, the specific value being determined according to experimental effects.
In another embodiment: occlusion and overlap between targets are common phenomena in MOT tracking, and occlusion tends to make the response map of the KCF filter fluctuate. The method therefore introduces the APCE (average peak-to-correlation energy) value to quantify the fluctuation of the response map, and uses it as an occlusion indicator measuring the occlusion degree in MOT tracking. If the tracked object is not occluded, the APCE value is large and the corresponding response map is unimodal; when the tracked object is occluded, its response map fluctuates dramatically and the APCE value drops significantly. APCE is defined as

APCE = | R_max − R_min |² / mean_{w,h}( ( R_{w,h} − R_min )² )    (14)

where R_{w,h} denotes the response value of the (w, h)-th sample, and R_max and R_min are the maximum and minimum response values of the filter in equation (7), respectively.
The APCE is normalized by the historical APCE values of the tracked target, giving the index O_APCE that measures the target's occlusion degree:

O_APCE = APCE_t / ( (1 / (t − t_s + 1)) Σ_{k = t_s}^{t} APCE_k )    (15)

where t_s is the start-frame index of the tracked target's track and t is the video frame index. This calculation yields the occlusion degree index O_APCE of any track in frame t.
In another embodiment, the S300 further includes:
the trajectory confidence is defined as:
Figure BDA0002241452320000161
l in the formulajIs TjLength of track of C (T)j,zi) Is a track TjAnd the detection response ziCorrelation similarity value of OAPCEIn order to be able to block the degree of shading,
Figure BDA0002241452320000162
respectively represent the track TjA start frame and an end frame.
In another embodiment, when conf(T_j) > 0.5, the corresponding track is a high-confidence track; otherwise, it is a low-confidence track.
For this embodiment, target occlusion is measured based on the KCF filter: the APCE index is introduced to measure the target's occlusion degree and to compute the track confidence, which serves as an indicator of tracking reliability. The tracking tracks are divided into high-confidence and low-confidence tracks for data association, reducing the influence of tracking errors on data association.
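Under the reading of the track-confidence definition above (occlusion index times the mean association similarity over the track's detections; the exact combination is an assumption, since the original formula survives only as an image), the confidence computation and the 0.5 split can be sketched as:

```python
def track_confidence(similarities, o_apce):
    """Assumed confidence form: O_APCE times the mean of the association
    similarities C(T_j, z_i) over the track's detections."""
    return o_apce * sum(similarities) / len(similarities)

def split_tracks(tracks, threshold=0.5):
    """conf > 0.5 -> high-confidence track, otherwise low-confidence."""
    high = [t for t in tracks if t["conf"] > threshold]
    low = [t for t in tracks if t["conf"] <= threshold]
    return high, low

tracks = [{"id": 1, "conf": track_confidence([0.9, 0.8], o_apce=1.0)},
          {"id": 2, "conf": track_confidence([0.3, 0.2], o_apce=0.5)}]
high, low = split_tracks(tracks)
```

A long, well-matched, unoccluded track keeps a score near its mean similarity, while occlusion (small O_APCE) pulls it into the low-confidence pool.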
Suppose that after detection-response correction there are N candidate detection responses Z_t = {z_t^1, …, z_t^N} and M tracks T = {T_1, …, T_M} in frame t, where z_t^i = (p_t^i, s_t^i), p_t^i is the location of the i-th detection response and s_t^i the size of its corresponding detection bounding box; likewise, p_t^j and conf(T_j) denote the location and confidence of the j-th track, and s_t^j the bounding-box size corresponding to the j-th track. The main task of online multi-target tracking is to associate the detection responses in frame t with the tracks generated in frame t-1, thereby generating the current target tracks in frame t.
Due to occlusion and detection-response errors, trackers sometimes fail to establish a correct tracking track; a longer track with high confidence is usually more reliable. The method therefore computes the track confidence from the occlusion degree, the track length and the association similarity, and then divides the tracks into high-confidence tracks T_high and low-confidence tracks T_low for data association.
In another embodiment, the S500 further includes:
the data association between adjacent frames performed for the high-confidence tracks is specifically as follows:
the association matrix is constructed as

S = [ S_ij ],  S_ij = S_app · S_shape · S_motion    (17)

where S_ij is the association similarity value between track T_j and detection response z_i, based on the appearance similarity S_app, the shape similarity S_shape and the motion similarity S_motion, and S is the association matrix formed by the S_ij. Suppose there are k high-confidence tracks T_high in frame t, the candidate detection set contains n detection responses, and the candidate target hypothesis set contains m detection responses.
According to the constructed association matrix, the matching between high-confidence tracks and detection responses is solved with the Hungarian algorithm, and the track confidence values and track states are then updated according to the association result.
For this embodiment, the high-confidence tracks are data-associated first; this involves T_high and the candidate detection responses. The association matrix is constructed from the appearance, shape and motion similarities between the tracks and the detection responses.
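The Hungarian algorithm finds the one-to-one matching that maximizes the total association similarity. The brute-force sketch below is an illustration only, feasible for tiny square matrices (production code would use a real Hungarian implementation such as SciPy's linear_sum_assignment); it shows the matching selected for a 3×3 similarity matrix:

```python
from itertools import permutations
import numpy as np

def best_assignment(S):
    """Maximum-similarity one-to-one assignment of tracks (rows) to
    detections (columns); brute force over permutations stands in for the
    Hungarian algorithm here, so S must be square and small."""
    n = S.shape[0]
    best, best_perm = -np.inf, None
    for perm in permutations(range(n)):
        score = sum(S[i, perm[i]] for i in range(n))
        if score > best:
            best, best_perm = score, perm
    return list(enumerate(best_perm)), best

S = np.array([[0.9, 0.1, 0.2],
              [0.2, 0.8, 0.3],
              [0.1, 0.3, 0.7]])
pairs, total = best_assignment(S)
```

On this matrix, the diagonal matching (track i to detection i) wins with a total score of 2.4.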
In another embodiment, the S500 further includes:
the data association between the adjacent frames executed by the low-confidence track is specifically as follows:
after the high-confidence tracks are associated with the detection responses, assume that the corrected detection response set of the t-th frame contains n' unassociated detection responses, that the candidate target hypothesis set contains m' unassociated detection responses and q' unassociated tracks, and that k' high-confidence tracks T_high remain unassociated; assume further that l low-confidence tracks exist in the t-th frame, giving h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'; the association matrix X in the low-confidence track data association process is then formed from the following blocks:
wherein A = [a_ij], a_ij represents the association similarity value between the i-th low-confidence track T_i^low and the j-th unmatched high-confidence track T_j^high; B = [b_ij], b_ij represents the association similarity value between the i-th low-confidence track T_i^low and the j-th detection response c_j; D = diag[d_1, …, d_l], d_i is the termination probability of the i-th low-confidence track T_i^low; τ is a predefined threshold, and X is the association relation matrix formed by A, B, D and τ in the low-confidence track association process;
and, according to the constructed association relation matrix, the association problem of the low-confidence tracks is solved by the Hungarian algorithm, and the confidence value and the state of each track are then updated according to the association result.
For this embodiment, a low-confidence track has only three possible association outcomes: it is associated with a high-confidence track, associated with a detection response, or terminated. The method therefore further checks which detection responses and tracks are associated with each low-confidence track.
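Assembling X from the blocks A, B and D can be sketched as follows. The patent gives the exact block layout only as a formula image; the row layout [A | B | termination] below, with τ filling the off-diagonal termination entries, is one plausible reconstruction, not the patent's verbatim definition.

```python
def build_low_conf_matrix(A, B, d, tau):
    """Assemble the association matrix X for the low-confidence tracks.

    A: l x h similarities to the h unmatched high-confidence tracks.
    B: l x q similarities to the q unmatched detection responses.
    d: termination probabilities d_1..d_l (the diagonal of D).
    tau: predefined threshold used here to fill off-diagonal
         termination entries (an assumed placement).
    """
    l = len(A)
    X = []
    for i in range(l):
        termination = [d[i] if j == i else tau for j in range(l)]
        X.append(list(A[i]) + list(B[i]) + termination)
    return X

A = [[0.6], [0.2]]            # l = 2 low-confidence tracks, h = 1
B = [[0.1, 0.7], [0.3, 0.4]]  # q = 2 unmatched detection responses
X = build_low_conf_matrix(A, B, d=[0.5, 0.9], tau=0.0)
print(len(X), len(X[0]))  # 2 5  (l rows, h + q + l columns)
```

Solving the assignment on X then decides, per low-confidence track, whether it continues through a high-confidence track, a detection, or is terminated.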
In another embodiment, the candidate target hypothesis set is updated. In multi-target tracking, to overcome the influence of false detections and missed detections, the method merges the detection responses of the t-th frame that remain unmatched after data association into the candidate target hypothesis set CTH, where they are kept as potential tracks. Meanwhile, to avoid erroneous tracking, unmatched high-confidence tracks and low-confidence tracks are also added to the CTH set. In addition, to save computation time and space, a candidate target in the CTH set that is not associated with any detection response or track over consecutive frames (set to 6 frames) is discarded. After the tracks and detection responses of the set have been added and deleted in this way, the CTH set of the t-th frame is obtained.
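The add/age/discard bookkeeping of the CTH set can be sketched as follows. The 6-frame discard rule is from the patent; the class name and the dict-based counters are illustrative.

```python
class CandidateTargetHypotheses:
    """Minimal sketch of the candidate target hypothesis (CTH) set.

    Unmatched detections and unmatched tracks are added as potential
    targets; a candidate that stays unassociated for `max_misses`
    consecutive frames (6 in the patent) is discarded.
    """

    def __init__(self, max_misses=6):
        self.max_misses = max_misses
        self.misses = {}  # candidate id -> consecutive unassociated frames

    def add(self, cand_id):
        self.misses.setdefault(cand_id, 0)

    def end_of_frame(self, associated_ids):
        # Reset counters of associated candidates, age the rest,
        # and discard any candidate unassociated for too long.
        for cid in list(self.misses):
            if cid in associated_ids:
                self.misses[cid] = 0
            else:
                self.misses[cid] += 1
                if self.misses[cid] >= self.max_misses:
                    del self.misses[cid]

    def ids(self):
        return set(self.misses)

cth = CandidateTargetHypotheses(max_misses=6)
cth.add("det_7")
for _ in range(6):            # six consecutive frames with no association
    cth.end_of_frame(associated_ids=set())
print(cth.ids())  # set() -- the stale candidate was dropped
```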
In another embodiment, as shown in fig. 2, P1 is the MOT tracking method of the present method with the KCF-based detection response correction part removed; P2 is the MOT tracking method with the KCF-based association similarity calculation replaced, IoU (intersection-over-union) values between the tracked targets and the detection responses taking the place of the KCF-based correlation calculation; P3 is the MOT tracking method with the APCE-based occlusion analysis part removed; P4 is the MOT tracking method with the candidate target hypothesis set part removed; and Ours is the tracking method of the present method including all steps. Table 1 shows the evaluation results of P1, P2, P3, P4 and the present method under the multi-target tracking evaluation indexes on the MOT 2015 validation set, where the evaluation indexes include: MOTP (Multiple Object Tracking Precision), MOTA (Multiple Object Tracking Accuracy), MT (Mostly-Tracked), ML (Mostly-Lost), FP (False Positive), FN (False Negative) and IDs (ID-Switch). In Table 1, for an index marked (↑), the higher the value, the better the performance; for an index marked (↓), the lower the value, the better the performance.
Tracker  MOTA(%)↑  MOTP(%)↑  MT(%)↑  ML(%)↓  FP↓   FN↓    IDs↓
P1       19.8      73.9      8.5     68.6    3309  15017  265
P2       24.7      73.8      10.3    53.4    3197  13367  134
P3       23.0      73.6      9.8     56.0    3548  13651  147
P4       22.1      73.5      9.8     53.8    3727  14958  138
Ours     28.6      73.9      13.3    52.4    2691  13491  123
Table 1
As can be seen from the evaluation results of FIG. 2 and Table 1, all the components of the method help to improve the tracking accuracy of the multi-target tracking method, and the MOTA indexes of the P1-P4 tracking methods are all lower than that of the present method. In the P1 tracking method, the lack of the KCF-based detection response correction part causes the MOTA index to drop noticeably and the IDs index to rise noticeably. In the P2 tracking method, replacing the KCF-based association relation calculation lowers the MOTA index and raises the IDs index, which further shows that the KCF-based similarity calculation can better learn the appearance information of the tracked target, and that the proposed KCF-based detection response correction strategy can, to a certain extent, eliminate the missed detections and false detections caused by the detector. The tracking performance of the P3 tracking method shows that the APCE-based track confidence calculation is an important component of the method; its absence clearly reduces the tracking accuracy MOTA, so the proposed APCE-based track confidence calculation helps to improve the data association performance. In the P4 tracking method, the absence of the candidate target hypothesis set causes the MOTA to drop significantly and the IDs to rise significantly, which further shows that the candidate target hypothesis set can, to a certain extent, handle tracking errors and detection response errors and helps to improve the tracking accuracy.
Therefore, each part proposed by the method, namely the KCF-based association similarity calculation, the KCF-based detection response correction, the two-step data association based on the APCE track confidence, and the candidate target hypothesis set maintained through data association and KCF, contributes to improving the tracking accuracy of multi-target tracking.
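The occlusion degree O_APCE used by the track confidence builds on the average peak-to-correlation energy of the KCF response map. The function below implements the standard APCE definition from the correlation-filter literature, which the patent's occlusion analysis is based on; the exact way the patent maps APCE to O_APCE is given only as a formula image, so this is a supporting sketch rather than the patent's formula.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map:
    APCE = (F_max - F_min)^2 / mean((F_wh - F_min)^2).
    A low APCE indicates occlusion or an unreliable response."""
    flat = [v for row in response for v in row]
    f_max, f_min = max(flat), min(flat)
    energy = sum((v - f_min) ** 2 for v in flat) / len(flat)
    return (f_max - f_min) ** 2 / energy if energy else 0.0

sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]       # clean peak
flat_map = [[0.4, 0.5, 0.45], [0.5, 0.55, 0.5], [0.45, 0.5, 0.4]]  # occluded
print(apce(sharp) > apce(flat_map))  # True
```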
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (7)

1. A multi-target tracking method based on KCF track confidence, comprising the following steps:
S100: establishing a KCF-based filter, and calculating, with the KCF-based filter, the appearance similarity, shape similarity and motion similarity between N detection responses in the current frame and M target tracking tracks as the association relation model for data association, wherein N and M are integers greater than 1;
S200: correcting the N detection responses with the filter;
S300: calculating the track confidence of each tracked target with a track confidence calculation method based on APCE occlusion analysis, and dividing the M target tracking tracks into high-confidence tracks and low-confidence tracks according to the track confidence;
S400: establishing a candidate target hypothesis set for representing the targets lost in the previous frame and the target tracks not matched with the N detection responses;
S500: performing data association between adjacent frames on the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set according to the association relation model;
the S500 further includes:
the data association between adjacent frames performed for the high-confidence tracks is specifically as follows:
the association relation matrix is constructed as follows: each entry S_ij of the matrix is the association similarity value between the track T_j^(t-1) and the detection response z_i, computed from the appearance similarity S_app, the shape similarity S_shape and the motion similarity S_motion; the values S_ij form the association relation matrix; assume that k high-confidence tracks T_high exist in the t-th frame, that the corrected detection response set of the t-th frame contains n detection responses, and that the candidate target hypothesis set contains m detection responses;
according to the constructed association relation matrix, solving the association matching between the high-confidence tracks and the detection responses by the Hungarian algorithm, and then updating the confidence value and the state of each track according to the association result;
the data association between adjacent frames performed for the low-confidence tracks is specifically as follows:
after the high-confidence tracks are associated with the detection responses, assume that the detection response set of the t-th frame after KCF correction contains n' unassociated detection responses, that the candidate target hypothesis set contains m' unassociated detection responses and q' unassociated tracks, and that k' high-confidence tracks T_high remain unassociated;
assume further that l low-confidence tracks exist in the t-th frame, giving h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'; the association matrix X in the low-confidence track data association process is then formed from the following blocks:
wherein A = [a_ij], a_ij represents the association similarity value between the i-th low-confidence track T_i^low and the j-th unmatched high-confidence track T_j^high; B = [b_ij], b_ij represents the association similarity value between the i-th low-confidence track T_i^low and the j-th detection response c_j; D = diag[d_1, …, d_l], d_i is the termination probability of the i-th low-confidence track T_i^low; τ is a predefined threshold, and X is the association relation matrix formed by A, B, D and τ in the low-confidence track association process;
and solving the association problem of the low-confidence tracks by the Hungarian algorithm according to the constructed association relation matrix, and then updating the confidence value and the state of each track according to the association result.
2. The method of claim 1, the S100 further comprising:
S101: the established KCF-based filter is a KCF filter trained using only the target samples in the (t-1)-th frame, wherein t denotes the current frame;
S102: calculating, with the KCF-based filter, the appearance similarity S_app between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame, wherein the filter response is computed from the discrete Fourier transform of the output vector y_l, y_l is the desired output of the training sample f_l, and f_l is the HOG and CN features corresponding to the candidate sample;
S103: calculating the shape similarity S_shape between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame, specifically:
S_shape = IoU(x_l, z_l);
S104: calculating the motion similarity S_motion between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame, specifically:
S_motion = G(T_pos - z_pos, Σ)
wherein G(·) is a Gaussian function with mean 0, and T_pos and z_pos are the positions of x_l and z_l, respectively.
3. The method of claim 1, the S200 further comprising:
S201: predicting, with the KCF-based filter, the state in the t-th frame of an arbitrary target T_j^(t-1) of the (t-1)-th frame, wherein the predicted state comprises the position information of the target predicted with the KCF-based filter and the width and height of the predicted target bounding box, the prediction being made from the position of the j-th track in the (t-1)-th frame; η is a predefined threshold and CTH is the candidate target hypothesis set;
S202: when the maximum response value of a predicted target is greater than the predefined threshold η, adding the target to the predicted target set; otherwise, adding the target to the candidate target hypothesis set CTH;
S203: assuming that all predicted targets of the t-th frame satisfying the above condition form the predicted target set, the detection response of the t-th frame is obtained by combining this set with the detection responses provided by the target detector, wherein M represents the number of predicted targets; redundancy between a detection response and a predicted target state is eliminated based on IoU, the judgment being whether the IoU value between the detection response and the predicted target state is greater than a predefined threshold α;
S204: when the IoU value between a detection response and a predicted target state is greater than the predefined threshold α, the two are regarded as the same target and only the detection response is retained; otherwise, the two represent different targets and both are retained; the KCF-corrected detection response of the t-th frame is finally obtained.
4. The method of claim 3, wherein the predefined threshold η is 0.7.
5. The method of claim 3, wherein the predefined threshold α is 0.6.
6. The method of claim 1, the S300 further comprising:
the track confidence conf(T_j) is defined in terms of: L_j, the track length of T_j; C(T_j, z_i), the association similarity value between the track T_j and the detection response z_i; O_APCE, the occlusion degree; and the start frame and end frame of the track T_j.
7. The method of claim 6, wherein when conf(T_j) &gt; 0.5, the corresponding track is a high-confidence track; otherwise, it is a low-confidence track.
CN201911002819.0A 2019-10-21 2019-10-21 Multi-target tracking method based on KCF track confidence Active CN110751096B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911002819.0A CN110751096B (en) 2019-10-21 2019-10-21 Multi-target tracking method based on KCF track confidence


Publications (2)

Publication Number Publication Date
CN110751096A CN110751096A (en) 2020-02-04
CN110751096B true CN110751096B (en) 2022-02-22

Family

ID=69279244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911002819.0A Active CN110751096B (en) 2019-10-21 2019-10-21 Multi-target tracking method based on KCF track confidence

Country Status (1)

Country Link
CN (1) CN110751096B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111292355B (en) * 2020-02-12 2023-06-16 江南大学 Nuclear correlation filtering multi-target tracking method integrating motion information
CN111242985B (en) * 2020-02-14 2022-05-10 电子科技大学 Video multi-pedestrian tracking method based on Markov model
CN112639872B (en) * 2020-04-24 2022-02-11 华为技术有限公司 Method and device for difficult mining in target detection
CN111652150B (en) * 2020-06-04 2024-03-19 北京环境特性研究所 Infrared anti-interference tracking method
CN111914625B (en) * 2020-06-18 2023-09-19 西安交通大学 Multi-target vehicle tracking device based on detector and tracker data association
CN111968153A (en) * 2020-07-16 2020-11-20 新疆大学 Long-time target tracking method and system based on correlation filtering and particle filtering
CN112561963A (en) * 2020-12-18 2021-03-26 北京百度网讯科技有限公司 Target tracking method and device, road side equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105809714A (en) * 2016-03-07 2016-07-27 广东顺德中山大学卡内基梅隆大学国际联合研究院 Track confidence coefficient based multi-object tracking method
CN107341820A (en) * 2017-07-03 2017-11-10 郑州轻工业学院 A kind of fusion Cuckoo search and KCF mutation movement method for tracking target
CN107527356A (en) * 2017-07-21 2017-12-29 华南农业大学 A kind of video tracing method based on lazy interactive mode
CN108921873A (en) * 2018-05-29 2018-11-30 福州大学 The online multi-object tracking method of Markovian decision of filtering optimization is closed based on nuclear phase
CN109934849A (en) * 2019-03-08 2019-06-25 西北工业大学 Online multi-object tracking method based on track metric learning


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Learning Spatio-Temporal Information for Multi-Object Tracking";Jian Wei et al.;《IEEE》;20170323;全文 *
"Online Multi-Object Tracking with Instance-Aware Tracker and Dynamic Model Refreshment";Peng Chu et al.;《arXiv》;20190228;全文 *
"基于核相关滤波器的多目标跟踪算法";周海英 等;《激光与光电子学进展》;20181231;全文 *
"基于视频的行人检测与跟踪算法研究";罗招材;《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》;20170215(第02期);全文 *

Also Published As

Publication number Publication date
CN110751096A (en) 2020-02-04

Similar Documents

Publication Publication Date Title
CN110751096B (en) Multi-target tracking method based on KCF track confidence
CN108921873B (en) Markov decision-making online multi-target tracking method based on kernel correlation filtering optimization
CN109360226B (en) Multi-target tracking method based on time series multi-feature fusion
CN109636829B (en) Multi-target tracking method based on semantic information and scene information
CN107516321B (en) Video multi-target tracking method and device
CN114972418B (en) Maneuvering multi-target tracking method based on combination of kernel adaptive filtering and YOLOX detection
CN109118473B (en) Angular point detection method based on neural network, storage medium and image processing system
CN112883819A (en) Multi-target tracking method, device, system and computer readable storage medium
CN113674328A (en) Multi-target vehicle tracking method
CN106934817B (en) Multi-attribute-based multi-target tracking method and device
CN110363165B (en) Multi-target tracking method and device based on TSK fuzzy system and storage medium
CN111582349B (en) Improved target tracking algorithm based on YOLOv3 and kernel correlation filtering
CN110782483A (en) Multi-view multi-target tracking method and system based on distributed camera network
CN112052802B (en) Machine vision-based front vehicle behavior recognition method
CN113192105B (en) Method and device for indoor multi-person tracking and attitude measurement
CN114879695A (en) Track matching method, device, equipment and medium
CN110349188B (en) Multi-target tracking method, device and storage medium based on TSK fuzzy model
CN113255611A (en) Twin network target tracking method based on dynamic label distribution and mobile equipment
CN111429485B (en) Cross-modal filtering tracking method based on self-adaptive regularization and high-reliability updating
CN115345905A (en) Target object tracking method, device, terminal and storage medium
CN117036397A (en) Multi-target tracking method based on fusion information association and camera motion compensation
CN117630860A (en) Gesture recognition method of millimeter wave radar
Lin et al. A novel robust algorithm for position and orientation detection based on cascaded deep neural network
CN112581502A (en) Target tracking method based on twin network
CN112818771A (en) Multi-target tracking algorithm based on feature aggregation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant