CN110751096A - Multi-target tracking method based on KCF track confidence (Google Patents)

Publication number: CN110751096A (application CN201911002819.0A); granted as CN110751096B
Original language: Chinese (zh)
Legal status: Granted; Active
Inventors: 杨红红, 吴晓军, 张玉梅, 李菁菁, 裴炤
Applicant and assignee: Shaanxi Normal University
Classifications

    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/10016 - Video; image sequence
    • G06T2207/20056 - Discrete and fast Fourier transform [DFT, FFT]


Abstract

The multi-target tracking method based on KCF track confidence comprises the following steps. S100: establish a KCF filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the detection responses in the current frame and the target tracking trajectories, forming the association model for data association. S200: correct the detection responses with the filter. S300: compute the track confidence of each tracked target using a track confidence measure based on APCE occlusion analysis, and divide the target trajectories into high-confidence and low-confidence tracks accordingly. S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any detection response. S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence tracks, the low-confidence tracks, and the candidate samples in the candidate target hypothesis set, according to the association model.

Description

Multi-target tracking method based on KCF track confidence
Technical Field
The disclosure belongs to the field of video information processing and analysis and computer vision, and particularly relates to a multi-target tracking method based on KCF track confidence.
Background
Multi-object tracking (MOT) is a research focus in the field of computer vision and is widely applied in video surveillance, traffic safety, driver-assistance systems, robot navigation and positioning, and other industries.
The goal of multi-target tracking is to identify the relevant objects in a video surveillance scene and estimate their locations throughout the video sequence. Many target tracking methods exist for video surveillance scenes, but because of occlusion, missed detections, false detections, camera shake, and similar factors, multi-target tracking in complex scenes remains a difficult problem.
Currently, mainstream multi-target tracking methods mainly follow the tracking-by-detection (TBD) framework, which takes the detection responses provided by a target detector as input and links them through association across different frames of the video sequence to obtain the final trajectories. Under the TBD paradigm, the whole MOT process can thus be divided into two modules: a detection module and a tracking module. In TBD-based multi-target tracking, the detection responses are provided in advance by a target detector and suffer from false and missed detections. In addition, the data association model in the tracking module suffers from inaccurate modeling and easily produces tracking errors. TBD-based multi-target tracking algorithms therefore generally have two problems: the initial detection results strongly influence tracking performance, and the data association method can cause tracking errors.
However, most existing MOT methods based on the TBD paradigm focus primarily on the tracking module, including data association and model optimization; examples include multiple hypothesis tracking (MHT), multi-target tracking based on joint probabilistic data association (JPDA), network-flow-based multi-target tracking, and learning-based multi-target tracking methods. These algorithms usually use the appearance, shape, and position information of the target to improve tracking performance and enhance the robustness of the tracking module. However, few of them consider and compensate for the missed and false detections caused by the detector. This remains one of the major drawbacks of TBD-based MOT methods.
In recent years, single object tracking (SOT) methods have achieved remarkable results in appearance model learning, so applying SOT to MOT can help improve MOT tracking performance. Many tracking methods follow this idea and introduce SOT directly into MOT. However, the samples used by SOT to learn the target appearance are collected online from the tracking results and therefore contain many noisy samples. Moreover, in video surveillance scenes, mutual occlusion among multiple targets is severe. Finally, because SOT requires each newly appearing target to be added to the MOT tracking system in real time, directly applying SOT to MOT causes the computational cost to grow exponentially as the number of tracked targets increases.
Disclosure of Invention
In view of this, the present disclosure provides a multi-target tracking method based on KCF trajectory confidence, comprising the following steps:
S100: establish a KCF-based filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the N detection responses in the current frame and the M target tracking trajectories, forming the association model for data association, where N and M are integers greater than 1;
S200: correct the N detection responses with the filter;
S300: compute the trajectory confidence of each tracked target using a trajectory confidence measure based on APCE occlusion analysis, and divide the M target trajectories into high-confidence and low-confidence trajectories accordingly;
S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any of the N detection responses;
S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence trajectories, the low-confidence trajectories, and the candidate samples in the candidate target hypothesis set, according to the association model.
With this technical scheme, the method significantly reduces the computational complexity of using SOT in MOT; the correlation filter measures the similarity between detection responses and tracked targets and further corrects the detection errors caused by the detector; and the APCE-based trajectory confidence measure, together with the candidate target hypothesis (CTH) set, further improves the performance of multi-target data association. The method effectively improves the robustness and accuracy of target tracking.
Drawings
FIG. 1 is a flow chart of a multi-target tracking method based on KCF trajectory confidence provided in an embodiment of the present disclosure;
FIG. 2 is a graph illustrating the tracking performance of different components on the MOT 2015 validation set in an embodiment of the disclosure.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
In one embodiment, referring to FIG. 1, a multi-target tracking method based on KCF track confidence is disclosed, comprising the following steps:
S100: establish a KCF-based filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the N detection responses in the current frame and the M target tracking trajectories, forming the association model for data association, where N and M are integers greater than 1;
S200: correct the N detection responses with the filter;
S300: compute the trajectory confidence of each tracked target using a trajectory confidence measure based on APCE occlusion analysis, and divide the M target trajectories into high-confidence and low-confidence trajectories accordingly;
S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any of the N detection responses;
S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence trajectories, the low-confidence trajectories, and the candidate samples in the candidate target hypothesis set, according to the association model.
In this embodiment the method comprises: KCF-based association similarity computation, KCF-based detection response correction, two-step data association based on APCE trajectory confidence, and construction of a candidate target hypothesis set. The online MOT task is first divided into two parts: establishment of the association model, and frame-by-frame data association based on that model. In constructing the association model, the method introduces SOT based on the kernelized correlation filter (KCF) algorithm into the MOT system to capture context information during online MOT tracking and to handle in time the false and missed detections caused by the detector. Meanwhile, to build a robust association model, the method not only measures the similarity between detection responses and tracked targets with the kernelized correlation filter, but also refines the detection responses provided by the target detector using the KCF tracking results. Furthermore, to improve frame-by-frame data association, a trajectory confidence measure based on APCE (average peak-to-correlation energy) is introduced as an index of tracking reliability, and the target trajectories are divided into high-confidence and low-confidence trajectories according to their confidence. The method also builds a candidate target hypothesis set (CTH) containing the targets missed and the trajectories unmatched in the previous frame, to improve data association performance.
Finally, data association between adjacent frames is performed among the high-confidence trajectories, the low-confidence trajectories, the detection responses provided by the detector, and the candidate samples in the CTH, according to the association model.
In another embodiment, S100 further comprises:
S101: the established KCF-based filter is a KCF filter trained using only the target samples in frame t-1, where t denotes the current frame;
S102: compute the appearance similarity S_app between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_app = max F^{-1}( k̂^{f_l z_l} ⊙ α̂_l ),  α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ )
where ŷ_l is the discrete Fourier transform of the output vector y_l, y_l is the desired output for the training sample f_l, and f_l denotes the HOG and CN features of the candidate sample;
S103: compute the shape similarity S_shape between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_shape = IoU( x_l, z_l );
S104: compute the motion similarity S_motion between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_motion = G( T_pos − Z_pos, Σ )    (12)
where G(·) is a Gaussian function with zero mean, and T_pos and Z_pos are the positions of x_l and z_l, respectively.
In this embodiment, the method trains the KCF filter using only the training samples of the previous frame, which reduces the impact of error accumulation from erroneous samples during SOT tracking on the KCF filter, and at the same time markedly reduces the computational cost of using SOT in MOT.
Constructing a robust association model is an important factor in multi-target tracking performance. The method therefore provides a KCF-based association model construction approach.
In the TBD-based online multi-target tracking framework, one key step is to associate the N detection responses in the current frame with the M trajectories. Suppose that in frame t the N detection responses are denoted Z_t = { z_t^1, …, z_t^N } and the M trajectories are denoted T = { T_1, …, T_M }, where T_j = { d_k^j | t_s ≤ k ≤ t_e }. Here d_k^j denotes the detection response associated with the j-th trajectory T_j in frame k, and t_s and t_e denote the start and end frames of trajectory T_j. Thus the j-th trajectory T_j is formed by its associated detection responses in successive frames.
Similarity between a detection response and a trajectory is usually computed with respect to certain features, such as appearance, position, and shape; the final association model is then obtained as the product of the similarities of the different features.
The idea behind the KCF-based filter is as follows: compute the response of a correlation filter at the position of the tracked target in the current frame. Correlation measures the similarity of two signals: the more similar the signals, the higher the correlation. In tracking, the objective of KCF training is therefore to design a filter template whose response is maximal when applied to the tracked target.
Assume that the position of the target in the current frame is x and that the training samples acquired around this position by cyclic shifts are x_i(w, h), (w, h) ∈ {0, …, W−1} × {0, …, H−1}. For each sample x_i, the corresponding label y_i(w, h) ∈ [0, 1] is computed from a Gaussian function: when x_i is centered on the target, the label attains its maximum y_i = 1; as x_i moves away from the target center, the label decreases. The objective of KCF training is thus to find the function f(z) = ω^T z that minimizes the error function:
min_ω Σ_i ( f(x_i) − y_i )² + λ‖ω‖²    (1)
In equation (1), λ is a regularization coefficient and ω is the model parameter to be solved.
To divide points that are linearly inseparable in the low-dimensional space, the method maps them into a high-dimensional Hilbert space through a nonlinear mapping function φ(·). Equation (1) can then be written as:
min_ω Σ_i ( ω^T φ(x_i) − y_i )² + λ‖ω‖²    (2)
where κ(x, x') = φ(x)^T φ(x') is the kernel function induced by the nonlinear mapping φ(·).
According to dual-space theory, ω can be expressed as a weighted sum of the nonlinear mappings of the input samples:
ω = Σ_i α_i φ(x_i)    (3)
The unknown in equation (2) thus changes from the vector ω to the dual vector α = (α_1, …, α_n). Substituting equation (3) into equation (2) and using the property that the inner products of the mapping function diagonalize under the discrete Fourier transform, the solution for α in the Fourier domain is:
α̂ = ŷ / ( k̂^{xx} + λ )    (4)
In equation (4), ˆ· denotes the Fourier transform F, y = { y_i(w, h) | (w, h) ∈ {0, …, W−1} × {0, …, H−1} } is the sample label, and k^{xx} = κ(x, x) is the self-kernel correlation of the vector x; the method adopts a Gaussian kernel function.
In the target tracking stage, for a candidate image patch z in frame t+1, the correlation between z and the trained model (the dual vector α) is computed, and the Fourier transform of the target response is obtained as:
R̂(z) = k̂^{xz} ⊙ α̂    (5)
In equation (5), ⊙ denotes the element-wise product between vectors and k̂^{xz} is the kernel correlation between z and the target appearance x.
Finally, the position of the target in the current frame is determined by the location of the maximum of the correlation response map:
p = arg max_{(w,h)} F^{-1}( R̂(z) )    (6)
where R = max F^{-1}( R̂(z) ) is the maximum response value.
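The training and detection steps of equations (1)-(6) can be sketched in NumPy as follows. This is a minimal sketch on raw grayscale patches: the kernel bandwidth sigma and regularizer lam are assumed values, and the HOG/CN features used later in the method are omitted.

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma=0.5):
    # k^{xz}: Gaussian kernel correlation of patches x and z, computed
    # efficiently via cross-correlation in the Fourier domain.
    xf, zf = np.fft.fft2(x), np.fft.fft2(z)
    cross = np.real(np.fft.ifft2(xf * np.conj(zf)))
    d2 = np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * cross
    return np.exp(-np.maximum(d2, 0.0) / (sigma ** 2 * x.size))

def kcf_train(x, y, lam=1e-4):
    # Equation (4): alpha_hat = y_hat / (k_hat^{xx} + lambda).
    k = gaussian_kernel_correlation(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def kcf_detect(alpha_hat, x, z):
    # Equations (5)-(6): response map R = F^{-1}(k_hat^{xz} ⊙ alpha_hat);
    # the target is located at the maximum of R.
    k = gaussian_kernel_correlation(x, z)
    resp = np.real(np.fft.ifft2(np.fft.fft2(k) * alpha_hat))
    return resp, np.unravel_index(np.argmax(resp), resp.shape)
```

Training uses a Gaussian label y peaked at the target position; detection returns the response map and the location of its maximum, as in equation (6).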
Computing appearance similarity based on KCF: to avoid error accumulation and the influence of noisy samples on model training during tracking, the method trains the KCF filter model using only the target samples in frame t-1.
Therefore, in multi-target tracking, for an arbitrary tracked target x_l in frame t-1, candidate samples of size W×H are generated according to the cyclic-shift principle, where W and H denote the width and height of the candidate sample x_l, and their corresponding HOG (histogram of oriented gradients) and CN (color names) features are denoted f_l. The corresponding KCF filter model applied in the MOT tracking method is then:
min_{ω_l} Σ_i ( ω_l^T φ(f_l) − y_l )² + λ‖ω_l‖²    (7)
In equation (7), y_l is the desired output for the training sample f_l; equation (7) is solved according to equations (2)-(5).
In the multi-target tracking stage, for any detection response z_l in frame t, the corresponding response map is computed with the KCF filter of equation (7):
R_l(z_l) = F^{-1}( k̂^{f_l z_l} ⊙ α̂_l )    (8)
where α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ ) and ŷ_l is the discrete Fourier transform of the output vector y_l.
During MOT tracking, the maximum response value of equation (8) is then used as the appearance similarity S_app between the tracked target x_l in frame t-1 and the detection response z_l in frame t:
S_app = max R_l(z_l)    (9)
Computing shape similarity based on KCF:
For any target x_l tracked in frame t-1, its predicted position in frame t is determined by the location of the maximum response value in equation (8). The method thus defines the position p of target x_l in frame t as:
p = p_{t−1}^l + Δp    (10)
In equation (10), p_{t−1}^l is the position of target x_l in frame t-1 and Δp is the displacement of the maximum of the response map in equation (8).
The method defines the shape similarity S_shape between the tracked target x_l in frame t-1 and the detection response z_l in frame t as the IoU (intersection-over-union) value between the target bounding box predicted by the KCF filter and the detection response:
S_shape = IoU( x_l, z_l )    (11)
and (3) calculating motion similarity: for arbitrary tracking target xlAnd the detection response zlThe motion similarity is defined as:
Smotion=G(Tpos-Zpos,∑)
(12)
in formula (12), G (-) is a Gaussian function with a mean value of 0, TposAnd ZposAre respectively the target xlAnd the detection response zlThe position of (a).
In another embodiment, S200 further comprises:
S201: use the KCF-based filter to predict the state in frame t of an arbitrary target tracked in frame t-1:
x̂_t^j = ( p̂_t^j, ŵ_t^j, ĥ_t^j ),  if max R_j > η    (13)
where p̂_t^j is the target position predicted by the KCF-based filter from the position p_{t−1}^j of the j-th trajectory in frame t-1, ŵ_t^j and ĥ_t^j are the width and height of the predicted target bounding box, η is a predefined threshold, and CTH is the candidate target hypothesis set;
S202: when the maximum response value of a predicted target is greater than the predefined threshold η, the predicted target is added to the predicted target set X̂_t; otherwise, it is added to the CTH candidate target hypothesis set;
s203: assume that all the predicted targets in the t-th frame that satisfy the above equation are
Figure BDA0002241452320000139
The detection response of the t-th frame is:
Figure BDA00022414523200001310
wherein M represents the number of predicted targets,
Figure BDA00022414523200001311
representing the detection response provided by the target detector,
Figure BDA00022414523200001312
indicating a detection response condition
Figure BDA00022414523200001313
And target predicted state
Figure BDA00022414523200001314
Redundancy elimination based on IoU, the judgment is based on
Figure BDA00022414523200001315
And
Figure BDA00022414523200001316
whether the value of IoU is greater than a predefined threshold α;
s204: when in use
Figure BDA0002241452320000141
And
Figure BDA0002241452320000142
when the value of IoU is greater than the predefined threshold α, it represents the same target and only retains the detection response, otherwise, the two represent different target detection responses and retain the same, and finally the detection response of the t frame after being corrected by KCF is obtained
Figure BDA0002241452320000143
In this embodiment, during MOT tracking the KCF filter is used to capture context information in online MOT tracking, to measure the similarity between detection responses and target trajectories, to correct the detection responses provided by the detector, and to handle in time the false and missed detections caused by the detector.
For the first frame, no detection-response correction is needed; by default the detection responses in the first frame are taken as true. From the second frame onwards, a KCF filter is trained on the tracking result of the previous frame, the detection responses are then corrected, and the association model is established after the correction is completed.
Intersection over union (IoU) is a standard for measuring the accuracy of detecting corresponding objects in a particular dataset. IoU is a simple measurement criterion and can be used to evaluate any task whose output is a predicted region.
In another embodiment, the predefined threshold η is 0.7 and the predefined threshold α is 0.6.
For this embodiment, each predefined threshold is a value between 0 and 1, with the specific value determined empirically.
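Steps S203-S204 with these thresholds can be sketched as follows. The box format (x, y, w, h) and the helper names are assumptions, not taken from the patent:

```python
def iou(a, b):
    # Boxes as (x, y, w, h).
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def correct_detections(pred_targets, detections, alpha=0.6):
    """S203-S204 sketch: merge KCF-predicted targets with detector output.
    A detection overlapping a prediction with IoU > alpha is treated as the
    same target, so only the detection is kept; a prediction that matches
    nothing fills in a missed detection."""
    corrected = list(detections)
    for p in pred_targets:
        if all(iou(p, d) <= alpha for d in detections):
            corrected.append(p)  # missed detection recovered via tracking
    return corrected
```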
In another embodiment: occlusion and overlap between targets are common in MOT tracking, and occlusion tends to make the response map of the KCF filter fluctuate. The method therefore introduces the APCE (average peak-to-correlation energy) value to measure the fluctuation of the response map, using it as an index of the degree of occlusion in MOT tracking. If the tracked object is not occluded, the APCE value is large and the corresponding response map is unimodal; conversely, when the tracked object is occluded, its response map fluctuates dramatically and the APCE value decreases significantly. APCE is defined as:
APCE = | R_max − R_min |² / mean_{w,h}( ( R_{w,h} − R_min )² )    (15)
where R_{w,h} denotes the response value of each sample, and R_max and R_min are, respectively, the maximum and minimum values of the response map in equation (8).
The APCE value is normalized by the historical APCE values of the tracked target and used as the index O_APCE measuring the degree of occlusion of the target:
O_APCE = APCE_t / mean_{k = t_s, …, t}( APCE_k )    (16)
In the formula, t_s is the start-frame index of the tracked target's trajectory and t is the video frame index. This computation yields the occlusion index O_APCE of any trajectory in frame t.
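The APCE computation and its historical normalization can be sketched as follows; clipping the normalized index to [0, 1] is an assumption made here so it can serve as a confidence weight:

```python
import numpy as np

def apce(resp):
    # Equation (15): APCE = |R_max - R_min|^2 / mean((R_i - R_min)^2).
    r_max, r_min = resp.max(), resp.min()
    denom = np.mean((resp - r_min) ** 2)
    return float((r_max - r_min) ** 2 / denom) if denom > 0 else 0.0

def occlusion_index(apce_history):
    # Equation (16), as assumed here: current APCE normalized by the
    # track's historical mean; small values suggest occlusion.
    hist = np.asarray(apce_history, float)
    mean = hist.mean()
    return float(min(hist[-1] / mean, 1.0)) if mean > 0 else 0.0
```

A sharp unimodal response map yields a much larger APCE than a flat, fluctuating one, which is exactly the occlusion cue described above.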
In another embodiment, S300 further comprises:
The trajectory confidence is defined as:
conf(T_j) = O_APCE · ( 1 / L_j ) Σ_{k = t_s^j}^{t_e^j} C( T_j, z_k )    (17)
In the formula, L_j is the length of trajectory T_j, C(T_j, z_i) is the association similarity value between trajectory T_j and detection response z_i, O_APCE is the degree of occlusion, and t_s^j and t_e^j denote, respectively, the start and end frames of trajectory T_j.
In another embodiment, when conf(T_j) > 0.5, the corresponding trajectory is a high-confidence trajectory; otherwise it is a low-confidence trajectory.
In this embodiment, target occlusion is measured based on KCF filtering: the APCE index is introduced to measure the degree of occlusion of the target and to compute the trajectory confidence, which serves as the index of tracking reliability. The tracked trajectories are divided into high-confidence and low-confidence trajectories for data association, reducing the influence of tracking errors on data association.
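The confidence computation and the high/low split can be sketched as follows; the exact weighting in the patent's equation may differ, so this is a sketch under the assumption that confidence is the occlusion-weighted mean association similarity:

```python
import numpy as np

def track_confidence(assoc_sims, occlusion, length):
    # Sketch of equation (17): mean association similarity over the track's
    # matched detections, weighted by the occlusion index O_APCE.
    if length == 0:
        return 0.0
    return occlusion * float(np.sum(assoc_sims)) / length

def split_tracks(tracks, thresh=0.5):
    # Tracks with conf > 0.5 are associated first (high confidence).
    high = [t for t in tracks if t["conf"] > thresh]
    low = [t for t in tracks if t["conf"] <= thresh]
    return high, low
```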
Suppose that after detection-response correction there are N candidate detection responses Z_t = { z_t^1, …, z_t^N } and M trajectories T = { T_1, …, T_M } in frame t, where z_t^i = ( p_t^i, s_t^i ), with p_t^i the position of the i-th detection response and s_t^i the size of its corresponding detection bounding box; likewise, p_t^j and conf(T_j) denote the position and confidence of the j-th trajectory, and s_t^j the size of the bounding box corresponding to the j-th trajectory. The main task of online multi-target tracking is to associate the detection responses in frame t with the trajectories generated in frame t-1, thereby producing the current target trajectories in frame t.
Due to occlusion and detection-response errors, trackers sometimes fail to establish a correct tracking trajectory. A longer trajectory with high confidence is usually more reliable. The method therefore computes the trajectory confidence from the degree of occlusion, the trajectory length, and the association similarity, and then divides the trajectories into high-confidence trajectories T_high and low-confidence trajectories T_low for data association.
In another embodiment, S500 further comprises:
Data association between adjacent frames for the high-confidence trajectories, specifically:
The association matrix is constructed as:
S = [ S_ij ],  S_ij = S_app · S_shape · S_motion    (18)
In the formula, S_ij is the association similarity value between trajectory T_i and detection response z_j, computed from the appearance similarity S_app, the shape similarity S_shape, and the motion similarity S_motion; the values S_ij form the association matrix S. Suppose there are k high-confidence trajectories T_high in frame t, that Z_t contains n detection responses, and that the candidate target hypothesis set CTH contains m detection responses.
According to the constructed association matrix, the association matching between the high-confidence trajectories and the detection responses is solved with the Hungarian algorithm, and the confidence values and states of the trajectories are then updated according to the association result.
In this embodiment, the high-confidence trajectories are data-associated first; this involves T_high and the detection responses Z_t. The association matrix is constructed based on the appearance similarity, shape similarity, and motion similarity between the trajectories and the detection responses.
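Given the similarity matrix S of equation (18), the matching step can be sketched as follows. Exhaustive search is used here for clarity; in practice the Hungarian algorithm (e.g. scipy.optimize.linear_sum_assignment on -S) solves the same problem in polynomial time. The gating threshold min_sim is an assumption, not from the patent:

```python
import numpy as np
from itertools import permutations

def hungarian_match(S, min_sim=0.3):
    """Optimal one-to-one track/detection matching maximizing total
    similarity; pairs below min_sim are rejected."""
    S = np.asarray(S, float)
    n_t, n_d = S.shape
    if n_t > n_d:  # reduce to the wide case by transposing
        return [(r, c) for c, r in hungarian_match(S.T, min_sim)]
    best, best_pairs = -1.0, []
    for cols in permutations(range(n_d), n_t):
        pairs = [(r, c) for r, c in zip(range(n_t), cols) if S[r, c] >= min_sim]
        score = sum(S[r, c] for r, c in pairs)
        if score > best:
            best, best_pairs = score, pairs
    return best_pairs
```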
In another embodiment, the S500 further includes:
the data association between adjacent frames performed for the low-confidence tracks is specifically as follows:

after the high-confidence tracks have been associated with the detection responses, assume that n' detection responses in the detection set remain unassociated, that m' detection responses and q' tracks in the candidate target hypothesis set remain unassociated, and that k' high-confidence tracks T_high remain unassociated, the unassociated detection responses forming the candidate detection responses of the t-th frame. Assume further that l low-confidence tracks exist in the t-th frame, so that there are h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'. The incidence matrix X used in the low-confidence track data association is then formed from A, B, D and the threshold τ, where A = [a_ij] and a_ij represents the association similarity value between the i-th low-confidence track and the j-th unmatched high-confidence track; B = [b_ij] and b_ij represents the association similarity value between the i-th low-confidence track and the j-th detection response c_j; D = diag[d_1, ..., d_l] and d_i is the termination probability of the i-th low-confidence track; and τ is a predefined threshold.

According to the constructed incidence relation matrix, the association problem of the low-confidence tracks is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.

In this embodiment, a low-confidence track has only three possible association states: associated with a high-confidence track, associated with a detection response, or terminated. The method therefore further checks, for each low-confidence track, whether any detection response or track is associated with it.
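Under this three-state view, the low-confidence association reduces to one rectangular assignment over the blocks A, B and a termination diagonal. The block layout [A | B | diag(d)] and the masking of entries below τ are assumptions (the text only says X is formed from A, B, D and τ); a minimal sketch:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def low_conf_associate(A, B, d, tau=0.4):
    """Three-way assignment for low-confidence tracks.

    A: (l, h) similarity to the unmatched high-confidence tracks
    B: (l, q) similarity to the unmatched detection responses
    d: (l,)  termination probability of each low-confidence track

    Track i either joins a high-confidence track, claims a detection,
    or terminates via its own slot in a diagonal block; entries below
    the threshold tau are masked out before the Hungarian solve.
    """
    l, h = A.shape
    q = B.shape[1]
    D = np.full((l, l), -np.inf)
    np.fill_diagonal(D, d)                  # track i can only end itself
    X = np.hstack([A, B, D])
    cost = -np.where(X >= tau, X, -1e6)     # mask sub-threshold entries
    rows, cols = linear_sum_assignment(cost)
    out = []
    for i, j in zip(rows, cols):
        if X[i, j] < tau:
            out.append((i, "unresolved"))   # nothing above threshold
        elif j < h:
            out.append((i, ("high_track", j)))
        elif j < h + q:
            out.append((i, ("detection", j - h)))
        else:
            out.append((i, "terminated"))
    return out
```

A track whose best available similarity and termination probability both fall below τ stays unresolved and would be handed to the candidate target hypothesis set.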
In another embodiment, the candidate target hypothesis set is updated. In multi-target tracking, to overcome the influence of false detections and missed detections, the method merges the detection responses left unmatched by the data association in the t-th frame into the candidate target hypothesis set CTH, where they are kept as potential tracks. Meanwhile, to avoid erroneous tracking, unmatched high-confidence tracks and low-confidence tracks are also added to the CTH set. In addition, to save computation time and space, a candidate target in the CTH set is discarded if it is not associated with any detection response or track over consecutive frames (set to 6 frames). After these additions and deletions of tracks and detection responses, the CTH set of the t-th frame is obtained.
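The add/discard bookkeeping of the CTH set can be sketched as a small container; only the 6-frame discard rule comes from the text, while the id scheme, the payloads and the per-frame interface are assumptions:

```python
class CandidateTargetHypotheses:
    """Candidate target hypothesis (CTH) set: unmatched detections and
    unmatched tracks are kept as potential targets, and a candidate is
    discarded after `max_miss` consecutive unassociated frames (6 in
    the text). The id scheme and payloads are illustrative."""

    def __init__(self, max_miss=6):
        self.max_miss = max_miss
        self.candidates = {}   # id -> payload (box, features, ...)
        self.misses = {}       # id -> consecutive unassociated frames
        self._next_id = 0

    def add(self, payload):
        """Add an unmatched detection response or track to the set."""
        cid = self._next_id
        self._next_id += 1
        self.candidates[cid] = payload
        self.misses[cid] = 0
        return cid

    def end_frame(self, associated_ids):
        """Call once per frame with the ids that were associated; prunes
        candidates that went max_miss frames in a row without a match."""
        for cid in list(self.candidates):
            if cid in associated_ids:
                self.misses[cid] = 0
            else:
                self.misses[cid] += 1
                if self.misses[cid] >= self.max_miss:
                    del self.candidates[cid]
                    del self.misses[cid]
```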
In another embodiment, as shown in fig. 2, P1 is the MOT tracking method of the present method with the KCF-based detection response correction part removed; P2 is the MOT tracking method with the KCF-based association calculation replaced, using the IoU (intersection-over-union) value between the tracking target and the detection response in place of the KCF-based correlation calculation; P3 is the MOT tracking method with the APCE-based occlusion analysis (track confidence) part removed; P4 is the MOT tracking method with the candidate target hypothesis set part removed; and Ours is the tracking method including all steps. Table 1 shows the evaluation results of P1, P2, P3, P4 and the present method on the MOT2015 validation set under the multi-target tracking evaluation indexes: MOTP (Multiple Object Tracking Precision), MOTA (Multiple Object Tracking Accuracy), MT (Mostly-Tracked), ML (Mostly-Lost), FP (False Positive), FN (False Negative) and IDs (ID-Switch). In Table 1, for indexes marked (↑) higher values indicate better performance; for indexes marked (↓) lower values indicate better performance.
Tracker MOTA(%)↑ MOTP(%)↑ MT(%)↑ ML(%)↓ FP↓ FN↓ IDs↓
P1 19.8 73.9 8.5 68.6 3309 15017 265
P2 24.7 73.8 10.3 53.4 3197 13367 134
P3 23.0 73.6 9.8 56.0 3548 13651 147
P4 22.1 73.5 9.8 53.8 3727 14958 138
Ours 28.6 73.9 13.3 52.4 2691 13491 123
TABLE 1
As can be seen from the evaluation results of FIG. 2 and Table 1, every component of the method helps improve the tracking accuracy of the multi-target tracking method: the MOTA of each of the P1 to P4 variants is lower than that of the full method. In P1, which lacks the KCF-based detection response correction, the MOTA index drops markedly and the IDs index rises markedly. In P2, where the KCF-filter-based association calculation is replaced, the MOTA falls and the IDs rise; this further shows that the KCF-based similarity calculation learns the appearance of the tracked target better, and that the proposed KCF-based detection response correction strategy can, to a certain extent, eliminate the missed detections and false detections produced by the detector. The performance of P3 shows that the APCE-based track confidence calculation is an important component of the method: removing it clearly lowers the tracking accuracy MOTA, so the proposed APCE-based track confidence calculation helps improve data association performance. In P4, which lacks the candidate target hypothesis set, the MOTA drops significantly and the IDs rise significantly, further indicating that the candidate target hypothesis set can handle tracking errors and detection response errors to a certain extent and helps improve tracking accuracy.
Therefore, each part proposed by the method, namely the KCF-based association similarity calculation, the KCF-based detection response correction, the two-step data association based on the APCE track confidence, and the candidate target hypothesis set maintained through data association and KCF, helps improve the tracking accuracy of multi-target tracking.
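The KCF-based detection response correction evaluated through P1 merges the detector output with KCF-predicted boxes by IoU-based redundancy elimination; a minimal sketch, assuming (x, y, w, h) boxes and the α = 0.6 threshold stated in the claims (everything else is an assumption):

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, w, h)."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def merge_detections(detector_boxes, kcf_boxes, alpha=0.6):
    """IoU-based redundancy elimination: a KCF prediction overlapping a
    detector response with IoU > alpha is treated as the same target and
    dropped (the detector response is kept); otherwise it survives as an
    extra response, recovering targets the detector missed."""
    merged = list(detector_boxes)
    for kb in kcf_boxes:
        if all(iou(kb, db) <= alpha for db in detector_boxes):
            merged.append(kb)
    return merged
```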
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (9)

1. A multi-target tracking method based on KCF track confidence, comprising the following steps:
s100: establishing a KCF-based filter, and using the KCF-based filter as the incidence relation model for data association to calculate the appearance similarity, shape similarity and motion similarity between N detection responses in the current frame and M target tracking tracks, wherein N and M are integers greater than 1;
s200: correcting the N detection responses with the filter;
s300: calculating the track confidence of each tracked target with a track confidence calculation method based on APCE occlusion analysis, and dividing the M target tracking tracks into high-confidence tracks and low-confidence tracks according to the track confidence;
s400: establishing a candidate target hypothesis set for representing targets lost in the previous frame and target tracks not matched with the N detection responses;
s500: performing data association between adjacent frames on the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set according to the incidence relation model.
2. The method according to claim 1, wherein the S100 further comprises:
s101: the established KCF-based filter is a KCF filter trained using only the target samples in the (t-1)-th frame, where t denotes the current frame;
s102: calculating the appearance similarity S_app between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame by means of the KCF-based filter, wherein y_l is the desired output for the training sample f_l, whose discrete Fourier transform is used by the filter, and f_l is the HOG and CN features of the candidate sample;
s103: calculating the shape similarity S_shape between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame as:
S_shape = IoU(x_l, z_l);
s104: calculating the motion similarity S_motion between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame as:
S_motion = G(T_pos - Z_pos, Σ)
where G(·) is a Gaussian function with mean 0, and T_pos and Z_pos are the positions of x_l and z_l, respectively.
3. The method of claim 1, wherein the S200 further comprises:
s201: predicting, with the KCF-based filter, the state in the t-th frame of an arbitrary target in the (t-1)-th frame, wherein the predicted state comprises the position information of the target predicted by the KCF-based filter together with the width and height of the predicted target bounding box, the prediction is made from the position of the j-th track in the (t-1)-th frame, η is a predefined threshold, and CTH is the candidate target hypothesis set;
s202: when the maximum response value of a predicted target is greater than the predefined threshold η, adding the predicted target to the predicted target set; otherwise, adding the target to the CTH candidate target hypothesis set;
s203: assuming that all predicted targets in the t-th frame satisfying the above condition form the predicted target set, obtaining the detection response of the t-th frame by combining this set with the detection responses provided by the target detector, where M represents the number of predicted targets, and IoU-based redundancy elimination is performed between each detector response and each predicted target state by judging whether their IoU value is greater than a predefined threshold α;
s204: when the IoU value between a detector response and a predicted target state is greater than the predefined threshold α, the two represent the same target and only the detection response is retained; otherwise, the two represent detection responses of different targets and both are retained; finally, the KCF-corrected detection response of the t-th frame is obtained.
4. The method of claim 3, wherein the predefined threshold η is 0.7.
5. The method of claim 3, wherein the predefined threshold α is 0.6.
6. The method of claim 1, wherein the S300 further comprises:
the track confidence Conf(T_j) is defined as a function of the track length, the association similarity and the occlusion degree, wherein L_j is the track length of T_j, C(T_j, z_l) is the association similarity value between track T_j and detection response z_l, O_APCE is the occlusion degree, and the start frame and the end frame of track T_j are used in the definition.
7. The method of claim 6, wherein when Conf(T_j) > 0.5, the corresponding track is a high-confidence track; otherwise, it is a low-confidence track.
8. The method of claim 1, wherein the S500 further comprises:
the data association between adjacent frames performed for the high-confidence tracks is specifically as follows:
the incidence relation matrix is constructed as:
S = [s_ij]
where s_ij is the association similarity value between track T_j and detection response z_i, computed from the appearance similarity S_app, the shape similarity S_shape and the motion similarity S_motion, and the values s_ij form the incidence relation matrix S; assume that in the t-th frame there are k high-confidence tracks T_high, that the detection response set contains n detection responses, and that the candidate target hypothesis set contains m detection responses;
according to the constructed incidence relation matrix, the association matching between the high-confidence tracks and the detection responses is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.
9. The method of claim 1, wherein the S500 further comprises:
the data association between adjacent frames performed for the low-confidence tracks is specifically as follows:
after the high-confidence tracks have been associated with the detection responses, assume that n' detection responses in the detection set remain unassociated, that m' detection responses and q' tracks in the candidate target hypothesis set remain unassociated, and that k' high-confidence tracks T_high remain unassociated, the unassociated detection responses forming the candidate detection responses of the t-th frame; assume further that l low-confidence tracks exist in the t-th frame, so that there are h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'; the incidence matrix X used in the low-confidence track data association is then formed from A, B, D and the threshold τ, where A = [a_ij] and a_ij represents the association similarity value between the i-th low-confidence track and the j-th unmatched high-confidence track; B = [b_ij] and b_ij represents the association similarity value between the i-th low-confidence track and the j-th detection response c_j; D = diag[d_1, ..., d_l] and d_i is the termination probability of the i-th low-confidence track; and τ is a predefined threshold;
according to the constructed incidence relation matrix, the association problem of the low-confidence tracks is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.
CN201911002819.0A 2019-10-21 2019-10-21 Multi-target tracking method based on KCF track confidence Active CN110751096B (en)

Publications (2)

Publication Number Publication Date
CN110751096A true CN110751096A (en) 2020-02-04
CN110751096B CN110751096B (en) 2022-02-22




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant