CN110751096A - Multi-target tracking method based on KCF track confidence (Google Patents)

Publication number: CN110751096A (application CN201911002819.0A); granted as CN110751096B
Original language: Chinese (zh)
Legal status: Granted; Active
Inventors: 杨红红, 吴晓军, 张玉梅, 李菁菁, 裴炤
Applicant and assignee: Shaanxi Normal University
Classifications

    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/10016 - Video; image sequence
    • G06T2207/20056 - Discrete and fast Fourier transform [DFT, FFT]


Abstract

The multi-target tracking method based on KCF track confidence comprises the following steps. S100: establish a KCF filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the detection responses in the current frame and the target tracking trajectories, forming the association model for data association. S200: correct the detection responses with the filter. S300: compute the track confidence of each tracked target using a track confidence measure based on APCE occlusion analysis, and divide the target trajectories into high-confidence and low-confidence tracks accordingly. S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any detection response. S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence tracks, the low-confidence tracks, and the candidate samples in the candidate target hypothesis set, according to the association model.

Description

Multi-target tracking method based on KCF track confidence
Technical Field
The disclosure belongs to the field of video information processing and analysis and computer vision, and particularly relates to a multi-target tracking method based on KCF track confidence.
Background
Multi-object tracking (MOT) is a research focus in the field of computer vision and is widely applied in video surveillance, traffic safety, driver-assistance systems, robot navigation and positioning, and other industries.
The goal of multi-target tracking is to identify the relevant objects in a video surveillance scene and estimate their locations throughout the video sequence. Many target tracking methods exist for video surveillance scenes, but because of occlusion, missed detections, false detections, camera shake, and similar factors, multi-target tracking in complex scenes remains a difficult problem.
Currently, mainstream multi-target tracking methods mainly follow the tracking-by-detection (TBD) framework, which takes the detection responses provided by a target detector as input and links them through association across different frames of the video sequence to obtain the final trajectories. Under the TBD paradigm, the whole MOT process can thus be divided into two modules: a detection module and a tracking module. In TBD-based multi-target tracking, the detection responses are provided in advance by a target detector and suffer from false and missed detections. In addition, the data association model in the tracking module suffers from inaccurate modeling and easily produces tracking errors. TBD-based multi-target tracking algorithms therefore generally have two problems: the initial detection results strongly influence tracking performance, and the data association method can cause tracking errors.
However, most existing MOT methods based on the TBD paradigm focus primarily on the tracking module, including data association and model optimization; examples include multiple hypothesis tracking (MHT), multi-target tracking based on joint probabilistic data association (JPDA), network-flow-based multi-target tracking, and learning-based multi-target tracking methods. These algorithms usually use the appearance, shape, and position information of the target to improve tracking performance and enhance the robustness of the tracking module. However, few of them consider and compensate for the missed and false detections caused by the detector. This remains one of the major drawbacks of TBD-based MOT methods.
In recent years, single object tracking (SOT) methods have achieved remarkable results in appearance model learning, so applying SOT to MOT can help improve MOT tracking performance. Many tracking methods follow this idea and introduce SOT directly into MOT. However, the samples used by SOT to learn the target appearance are collected online from the tracking results and therefore contain many noisy samples. Moreover, in video surveillance scenes, mutual occlusion among multiple targets is severe. Finally, because SOT requires each newly appearing target to be added to the MOT tracking system in real time, directly applying SOT to MOT causes the computational cost to grow exponentially as the number of tracked targets increases.
Disclosure of Invention
In view of this, the present disclosure provides a multi-target tracking method based on KCF trajectory confidence, comprising the following steps:
S100: establish a KCF-based filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the N detection responses in the current frame and the M target tracking trajectories, forming the association model for data association, where N and M are integers greater than 1;
S200: correct the N detection responses with the filter;
S300: compute the trajectory confidence of each tracked target using a trajectory confidence measure based on APCE occlusion analysis, and divide the M target trajectories into high-confidence and low-confidence trajectories accordingly;
S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any of the N detection responses;
S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence trajectories, the low-confidence trajectories, and the candidate samples in the candidate target hypothesis set, according to the association model.
With this technical scheme, the method significantly reduces the computational complexity of using SOT in MOT; the correlation filter measures the similarity between detection responses and tracked targets and further corrects the detection errors caused by the detector; and the APCE-based trajectory confidence measure, together with the candidate target hypothesis (CTH) set, further improves the performance of multi-target data association. The method effectively improves the robustness and accuracy of target tracking.
Drawings
FIG. 1 is a flow chart of a multi-target tracking method based on KCF trajectory confidence provided in an embodiment of the present disclosure;
FIG. 2 is a graph illustrating the tracking performance of different components on the MOT 2015 validation set in an embodiment of the disclosure.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
In one embodiment, referring to FIG. 1, a multi-target tracking method based on KCF track confidence is disclosed, comprising the following steps:
S100: establish a KCF-based filter and use it to compute the appearance similarity, shape similarity, and motion similarity between the N detection responses in the current frame and the M target tracking trajectories, forming the association model for data association, where N and M are integers greater than 1;
S200: correct the N detection responses with the filter;
S300: compute the trajectory confidence of each tracked target using a trajectory confidence measure based on APCE occlusion analysis, and divide the M target trajectories into high-confidence and low-confidence trajectories accordingly;
S400: establish a candidate target hypothesis set representing the targets lost in the previous frame and the target trajectories not matched to any of the N detection responses;
S500: perform data association between adjacent frames among the corrected detection responses, the high-confidence trajectories, the low-confidence trajectories, and the candidate samples in the candidate target hypothesis set, according to the association model.
In this embodiment the method comprises: KCF-based association similarity computation, KCF-based detection response correction, two-step data association based on APCE trajectory confidence, and construction of a candidate target hypothesis set. The online MOT task is first divided into two parts: establishment of the association model, and frame-by-frame data association based on that model. In constructing the association model, the method introduces SOT based on the kernelized correlation filter (KCF) algorithm into the MOT system to capture context information during online MOT tracking and to handle in time the false and missed detections caused by the detector. Meanwhile, to build a robust association model, the method not only measures the similarity between detection responses and tracked targets with the kernelized correlation filter, but also refines the detection responses provided by the target detector using the KCF tracking results. Furthermore, to improve frame-by-frame data association, a trajectory confidence measure based on APCE (average peak-to-correlation energy) is introduced as an index of tracking reliability, and the target trajectories are divided into high-confidence and low-confidence trajectories according to their confidence. The method also builds a candidate target hypothesis set (CTH) containing the targets missed and the trajectories unmatched in the previous frame, to improve data association performance.
Finally, data association between adjacent frames is performed among the high-confidence trajectories, the low-confidence trajectories, the detection responses provided by the detector, and the candidate samples in the CTH, according to the association model.
In another embodiment, S100 further comprises:
S101: the established KCF-based filter is a KCF filter trained using only the target samples in frame t-1, where t denotes the current frame;
S102: compute the appearance similarity S_app between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_app = max F^{-1}( k̂^{f_l z_l} ⊙ α̂_l ),  α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ )
where ŷ_l is the discrete Fourier transform of the output vector y_l, y_l is the desired output for the training sample f_l, and f_l denotes the HOG and CN features of the candidate sample;
S103: compute the shape similarity S_shape between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_shape = IoU( x_l, z_l );
S104: compute the motion similarity S_motion between the tracked target x_l in frame t-1 and the detection response z_l in frame t, specifically:
S_motion = G( T_pos − Z_pos, Σ )    (12)
where G(·) is a Gaussian function with zero mean, and T_pos and Z_pos are the positions of x_l and z_l, respectively.
In this embodiment, the method trains the KCF filter using only the training samples of the previous frame, which reduces the impact of error accumulation from erroneous samples during SOT tracking on the KCF filter, and at the same time markedly reduces the computational cost of using SOT in MOT.
Constructing a robust association model is an important factor in multi-target tracking performance. The method therefore provides a KCF-based association model construction approach.
In the TBD-based online multi-target tracking framework, one key step is to associate the N detection responses in the current frame with the M trajectories. Suppose that in frame t the N detection responses are denoted Z_t = { z_t^1, …, z_t^N } and the M trajectories are denoted T = { T_1, …, T_M }, where T_j = { d_k^j | t_s ≤ k ≤ t_e }. Here d_k^j denotes the detection response associated with the j-th trajectory T_j in frame k, and t_s and t_e denote the start and end frames of trajectory T_j. Thus the j-th trajectory T_j is formed by its associated detection responses in successive frames.
Similarity between a detection response and a trajectory is usually computed with respect to certain features, such as appearance, position, and shape; the final association model is then obtained as the product of the similarities of the different features.
The idea behind the KCF-based filter is as follows: compute the response of a correlation filter at the position of the tracked target in the current frame. Correlation measures the similarity of two signals: the more similar the signals, the higher the correlation. In tracking, the objective of KCF training is therefore to design a filter template whose response is maximal when applied to the tracked target.
Assume that the position of the target in the current frame is x and that the training samples acquired around this position by cyclic shifts are x_i(w, h), (w, h) ∈ {0, …, W−1} × {0, …, H−1}. For each sample x_i, the corresponding label y_i(w, h) ∈ [0, 1] is computed from a Gaussian function: when x_i is centered on the target, the label attains its maximum y_i = 1; as x_i moves away from the target center, the label decreases. The objective of KCF training is thus to find the function f(z) = ω^T z that minimizes the error function:
min_ω Σ_i ( f(x_i) − y_i )² + λ‖ω‖²    (1)
In equation (1), λ is a regularization coefficient and ω is the model parameter to be solved.
To divide points that are linearly inseparable in the low-dimensional space, the method maps them into a high-dimensional Hilbert space through a nonlinear mapping function φ(·). Equation (1) can then be written as:
min_ω Σ_i ( ω^T φ(x_i) − y_i )² + λ‖ω‖²    (2)
where κ(x, x') = φ(x)^T φ(x') is the kernel function induced by the nonlinear mapping φ(·).
According to dual-space theory, ω can be expressed as a weighted sum of the nonlinear mappings of the input samples:
ω = Σ_i α_i φ(x_i)    (3)
The unknown in equation (2) thus changes from the vector ω to the dual vector α = (α_1, …, α_n). Substituting equation (3) into equation (2) and using the property that the inner products of the mapping function diagonalize under the discrete Fourier transform, the solution for α in the Fourier domain is:
α̂ = ŷ / ( k̂^{xx} + λ )    (4)
In equation (4), ˆ· denotes the Fourier transform F, y = { y_i(w, h) | (w, h) ∈ {0, …, W−1} × {0, …, H−1} } is the sample label, and k^{xx} = κ(x, x) is the self-kernel correlation of the vector x; the method adopts a Gaussian kernel function.
In the target tracking stage, for a candidate image patch z in frame t+1, the correlation between z and the trained model (the dual vector α) is computed, and the Fourier transform of the target response is obtained as:
R̂(z) = k̂^{xz} ⊙ α̂    (5)
In equation (5), ⊙ denotes the element-wise product between vectors and k̂^{xz} is the kernel correlation between z and the target appearance x.
Finally, the position of the target in the current frame is determined by the location of the maximum of the correlation response map:
p = arg max_{(w,h)} F^{-1}( R̂(z) )    (6)
where R = max F^{-1}( R̂(z) ) is the maximum response value.
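The training and detection steps of equations (1)-(6) can be sketched in NumPy as follows. This is a minimal sketch on raw grayscale patches: the kernel bandwidth sigma and regularizer lam are assumed values, and the HOG/CN features used later in the method are omitted.

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma=0.5):
    # k^{xz}: Gaussian kernel correlation of patches x and z, computed
    # efficiently via cross-correlation in the Fourier domain.
    xf, zf = np.fft.fft2(x), np.fft.fft2(z)
    cross = np.real(np.fft.ifft2(xf * np.conj(zf)))
    d2 = np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * cross
    return np.exp(-np.maximum(d2, 0.0) / (sigma ** 2 * x.size))

def kcf_train(x, y, lam=1e-4):
    # Equation (4): alpha_hat = y_hat / (k_hat^{xx} + lambda).
    k = gaussian_kernel_correlation(x, x)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)

def kcf_detect(alpha_hat, x, z):
    # Equations (5)-(6): response map R = F^{-1}(k_hat^{xz} ⊙ alpha_hat);
    # the target is located at the maximum of R.
    k = gaussian_kernel_correlation(x, z)
    resp = np.real(np.fft.ifft2(np.fft.fft2(k) * alpha_hat))
    return resp, np.unravel_index(np.argmax(resp), resp.shape)
```

Training uses a Gaussian label y peaked at the target position; detection returns the response map and the location of its maximum, as in equation (6).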
Computing appearance similarity based on KCF: to avoid error accumulation and the influence of noisy samples on model training during tracking, the method trains the KCF filter model using only the target samples in frame t-1.
Therefore, in multi-target tracking, for an arbitrary tracked target x_l in frame t-1, candidate samples of size W×H are generated according to the cyclic-shift principle, where W and H denote the width and height of the candidate sample x_l, and their corresponding HOG (histogram of oriented gradients) and CN (color names) features are denoted f_l. The corresponding KCF filter model applied in the MOT tracking method is then:
min_{ω_l} Σ_i ( ω_l^T φ(f_l) − y_l )² + λ‖ω_l‖²    (7)
In equation (7), y_l is the desired output for the training sample f_l; equation (7) is solved according to equations (2)-(5).
In the multi-target tracking stage, for any detection response z_l in frame t, the corresponding response map is computed with the KCF filter of equation (7):
R_l(z_l) = F^{-1}( k̂^{f_l z_l} ⊙ α̂_l )    (8)
where α̂_l = ŷ_l / ( k̂^{f_l f_l} + λ ) and ŷ_l is the discrete Fourier transform of the output vector y_l.
During MOT tracking, the maximum response value of equation (8) is then used as the appearance similarity S_app between the tracked target x_l in frame t-1 and the detection response z_l in frame t:
S_app = max R_l(z_l)    (9)
Computing shape similarity based on KCF:
For any target x_l tracked in frame t-1, its predicted position in frame t is determined by the location of the maximum response value in equation (8). The method thus defines the position p of target x_l in frame t as:
p = p_{t−1}^l + Δp    (10)
In equation (10), p_{t−1}^l is the position of target x_l in frame t-1 and Δp is the displacement of the maximum of the response map in equation (8).
The method defines the shape similarity S_shape between the tracked target x_l in frame t-1 and the detection response z_l in frame t as the IoU (intersection-over-union) value between the target bounding box predicted by the KCF filter and the detection response:
S_shape = IoU( x_l, z_l )    (11)
and (3) calculating motion similarity: for arbitrary tracking target xlAnd the detection response zlThe motion similarity is defined as:
Smotion=G(Tpos-Zpos,∑)
(12)
in formula (12), G (-) is a Gaussian function with a mean value of 0, TposAnd ZposAre respectively the target xlAnd the detection response zlThe position of (a).
In another embodiment, S200 further comprises:
S201: use the KCF-based filter to predict the state in frame t of an arbitrary target tracked in frame t-1:
x̂_t^j = ( p̂_t^j, ŵ_t^j, ĥ_t^j ),  if max R_j > η    (13)
where p̂_t^j is the target position predicted by the KCF-based filter from the position p_{t−1}^j of the j-th trajectory in frame t-1, ŵ_t^j and ĥ_t^j are the width and height of the predicted target bounding box, η is a predefined threshold, and CTH is the candidate target hypothesis set;
S202: when the maximum response value of a predicted target is greater than the predefined threshold η, the predicted target is added to the predicted target set X̂_t; otherwise, it is added to the CTH candidate target hypothesis set;
s203: assume that all the predicted targets in the t-th frame that satisfy the above equation are
Figure BDA0002241452320000139
The detection response of the t-th frame is:
Figure BDA00022414523200001310
wherein M represents the number of predicted targets,
Figure BDA00022414523200001311
representing the detection response provided by the target detector,
Figure BDA00022414523200001312
indicating a detection response condition
Figure BDA00022414523200001313
And target predicted state
Figure BDA00022414523200001314
Redundancy elimination based on IoU, the judgment is based on
Figure BDA00022414523200001315
And
Figure BDA00022414523200001316
whether the value of IoU is greater than a predefined threshold α;
s204: when in use
Figure BDA0002241452320000141
And
Figure BDA0002241452320000142
when the value of IoU is greater than the predefined threshold α, it represents the same target and only retains the detection response, otherwise, the two represent different target detection responses and retain the same, and finally the detection response of the t frame after being corrected by KCF is obtained
Figure BDA0002241452320000143
In this embodiment, during MOT tracking the KCF filter is used to capture context information in online MOT tracking, to measure the similarity between detection responses and target trajectories, to correct the detection responses provided by the detector, and to handle in time the false and missed detections caused by the detector.
For the first frame, no detection-response correction is needed; by default the detection responses in the first frame are taken as true. From the second frame onwards, a KCF filter is trained on the tracking result of the previous frame, the detection responses are then corrected, and the association model is established after the correction is completed.
Intersection over union (IoU) is a standard for measuring the accuracy of detecting corresponding objects in a particular dataset. IoU is a simple measurement criterion and can be used to evaluate any task whose output is a predicted region.
In another embodiment, the predefined threshold η is 0.7 and the predefined threshold α is 0.6.
For this embodiment, each predefined threshold is a value between 0 and 1, with the specific value determined empirically.
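Steps S203-S204 with these thresholds can be sketched as follows. The box format (x, y, w, h) and the helper names are assumptions, not taken from the patent:

```python
def iou(a, b):
    # Boxes as (x, y, w, h).
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def correct_detections(pred_targets, detections, alpha=0.6):
    """S203-S204 sketch: merge KCF-predicted targets with detector output.
    A detection overlapping a prediction with IoU > alpha is treated as the
    same target, so only the detection is kept; a prediction that matches
    nothing fills in a missed detection."""
    corrected = list(detections)
    for p in pred_targets:
        if all(iou(p, d) <= alpha for d in detections):
            corrected.append(p)  # missed detection recovered via tracking
    return corrected
```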
In another embodiment: occlusion and overlap between targets are common in MOT tracking, and occlusion tends to make the response map of the KCF filter fluctuate. The method therefore introduces the APCE (average peak-to-correlation energy) value to measure the fluctuation of the response map, using it as an index of the degree of occlusion in MOT tracking. If the tracked object is not occluded, the APCE value is large and the corresponding response map is unimodal; conversely, when the tracked object is occluded, its response map fluctuates dramatically and the APCE value decreases significantly. APCE is defined as:
APCE = | R_max − R_min |² / mean_{w,h}( ( R_{w,h} − R_min )² )    (15)
where R_{w,h} denotes the response value of each sample, and R_max and R_min are, respectively, the maximum and minimum values of the response map in equation (8).
The APCE value is normalized by the historical APCE values of the tracked target and used as the index O_APCE measuring the degree of occlusion of the target:
O_APCE = APCE_t / mean_{k = t_s, …, t}( APCE_k )    (16)
In the formula, t_s is the start-frame index of the tracked target's trajectory and t is the video frame index. This computation yields the occlusion index O_APCE of any trajectory in frame t.
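The APCE computation and its historical normalization can be sketched as follows; clipping the normalized index to [0, 1] is an assumption made here so it can serve as a confidence weight:

```python
import numpy as np

def apce(resp):
    # Equation (15): APCE = |R_max - R_min|^2 / mean((R_i - R_min)^2).
    r_max, r_min = resp.max(), resp.min()
    denom = np.mean((resp - r_min) ** 2)
    return float((r_max - r_min) ** 2 / denom) if denom > 0 else 0.0

def occlusion_index(apce_history):
    # Equation (16), as assumed here: current APCE normalized by the
    # track's historical mean; small values suggest occlusion.
    hist = np.asarray(apce_history, float)
    mean = hist.mean()
    return float(min(hist[-1] / mean, 1.0)) if mean > 0 else 0.0
```

A sharp unimodal response map yields a much larger APCE than a flat, fluctuating one, which is exactly the occlusion cue described above.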
In another embodiment, S300 further comprises:
The trajectory confidence is defined as:
conf(T_j) = O_APCE · ( 1 / L_j ) Σ_{k = t_s^j}^{t_e^j} C( T_j, z_k )    (17)
In the formula, L_j is the length of trajectory T_j, C(T_j, z_i) is the association similarity value between trajectory T_j and detection response z_i, O_APCE is the degree of occlusion, and t_s^j and t_e^j denote, respectively, the start and end frames of trajectory T_j.
In another embodiment, when conf(T_j) > 0.5, the corresponding trajectory is a high-confidence trajectory; otherwise it is a low-confidence trajectory.
In this embodiment, target occlusion is measured based on KCF filtering: the APCE index is introduced to measure the degree of occlusion of the target and to compute the trajectory confidence, which serves as the index of tracking reliability. The tracked trajectories are divided into high-confidence and low-confidence trajectories for data association, reducing the influence of tracking errors on data association.
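The confidence computation and the high/low split can be sketched as follows; the exact weighting in the patent's equation may differ, so this is a sketch under the assumption that confidence is the occlusion-weighted mean association similarity:

```python
import numpy as np

def track_confidence(assoc_sims, occlusion, length):
    # Sketch of equation (17): mean association similarity over the track's
    # matched detections, weighted by the occlusion index O_APCE.
    if length == 0:
        return 0.0
    return occlusion * float(np.sum(assoc_sims)) / length

def split_tracks(tracks, thresh=0.5):
    # Tracks with conf > 0.5 are associated first (high confidence).
    high = [t for t in tracks if t["conf"] > thresh]
    low = [t for t in tracks if t["conf"] <= thresh]
    return high, low
```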
Suppose that after detection-response correction there are N candidate detection responses Z_t = { z_t^1, …, z_t^N } and M trajectories T = { T_1, …, T_M } in frame t, where z_t^i = ( p_t^i, s_t^i ), with p_t^i the position of the i-th detection response and s_t^i the size of its corresponding detection bounding box; likewise, p_t^j and conf(T_j) denote the position and confidence of the j-th trajectory, and s_t^j the size of the bounding box corresponding to the j-th trajectory. The main task of online multi-target tracking is to associate the detection responses in frame t with the trajectories generated in frame t-1, thereby producing the current target trajectories in frame t.
Due to occlusion and detection-response errors, trackers sometimes fail to establish a correct tracking trajectory. A longer trajectory with high confidence is usually more reliable. The method therefore computes the trajectory confidence from the degree of occlusion, the trajectory length, and the association similarity, and then divides the trajectories into high-confidence trajectories T_high and low-confidence trajectories T_low for data association.
In another embodiment, S500 further comprises:
Data association between adjacent frames for the high-confidence trajectories, specifically:
The association matrix is constructed as:
S = [ S_ij ],  S_ij = S_app · S_shape · S_motion    (18)
In the formula, S_ij is the association similarity value between trajectory T_i and detection response z_j, computed from the appearance similarity S_app, the shape similarity S_shape, and the motion similarity S_motion; the values S_ij form the association matrix S. Suppose there are k high-confidence trajectories T_high in frame t, that Z_t contains n detection responses, and that the candidate target hypothesis set CTH contains m detection responses.
According to the constructed association matrix, the association matching between the high-confidence trajectories and the detection responses is solved with the Hungarian algorithm, and the confidence values and states of the trajectories are then updated according to the association result.
In this embodiment, the high-confidence trajectories are data-associated first; this involves T_high and the detection responses Z_t. The association matrix is constructed based on the appearance similarity, shape similarity, and motion similarity between the trajectories and the detection responses.
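Given the similarity matrix S of equation (18), the matching step can be sketched as follows. Exhaustive search is used here for clarity; in practice the Hungarian algorithm (e.g. scipy.optimize.linear_sum_assignment on -S) solves the same problem in polynomial time. The gating threshold min_sim is an assumption, not from the patent:

```python
import numpy as np
from itertools import permutations

def hungarian_match(S, min_sim=0.3):
    """Optimal one-to-one track/detection matching maximizing total
    similarity; pairs below min_sim are rejected."""
    S = np.asarray(S, float)
    n_t, n_d = S.shape
    if n_t > n_d:  # reduce to the wide case by transposing
        return [(r, c) for c, r in hungarian_match(S.T, min_sim)]
    best, best_pairs = -1.0, []
    for cols in permutations(range(n_d), n_t):
        pairs = [(r, c) for r, c in zip(range(n_t), cols) if S[r, c] >= min_sim]
        score = sum(S[r, c] for r, c in pairs)
        if score > best:
            best, best_pairs = score, pairs
    return best_pairs
```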
In another embodiment, the S500 further includes:
the data association between adjacent frames performed for the low-confidence tracks is specifically as follows:

after the high-confidence tracks have been associated with the detection responses, assume that n' detection responses in the detection set remain unassociated, that m' detection responses and q' tracks in the candidate target hypothesis set remain unassociated, and that k' high-confidence tracks T_high remain unassociated, the unassociated detection responses forming the candidate detection responses of the t-th frame. Assume further that l low-confidence tracks exist in the t-th frame, so that there are h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'. The incidence matrix X used in the low-confidence track data association is then formed from A, B, D and the threshold τ, where A = [a_ij] and a_ij represents the association similarity value between the i-th low-confidence track and the j-th unmatched high-confidence track; B = [b_ij] and b_ij represents the association similarity value between the i-th low-confidence track and the j-th detection response c_j; D = diag[d_1, ..., d_l] and d_i is the termination probability of the i-th low-confidence track; and τ is a predefined threshold.

According to the constructed incidence relation matrix, the association problem of the low-confidence tracks is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.

In this embodiment, a low-confidence track has only three possible association states: associated with a high-confidence track, associated with a detection response, or terminated. The method therefore further checks, for each low-confidence track, whether any detection response or track is associated with it.
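Under this three-state view, the low-confidence association reduces to one rectangular assignment over the blocks A, B and a termination diagonal. The block layout [A | B | diag(d)] and the masking of entries below τ are assumptions (the text only says X is formed from A, B, D and τ); a minimal sketch:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def low_conf_associate(A, B, d, tau=0.4):
    """Three-way assignment for low-confidence tracks.

    A: (l, h) similarity to the unmatched high-confidence tracks
    B: (l, q) similarity to the unmatched detection responses
    d: (l,)  termination probability of each low-confidence track

    Track i either joins a high-confidence track, claims a detection,
    or terminates via its own slot in a diagonal block; entries below
    the threshold tau are masked out before the Hungarian solve.
    """
    l, h = A.shape
    q = B.shape[1]
    D = np.full((l, l), -np.inf)
    np.fill_diagonal(D, d)                  # track i can only end itself
    X = np.hstack([A, B, D])
    cost = -np.where(X >= tau, X, -1e6)     # mask sub-threshold entries
    rows, cols = linear_sum_assignment(cost)
    out = []
    for i, j in zip(rows, cols):
        if X[i, j] < tau:
            out.append((i, "unresolved"))   # nothing above threshold
        elif j < h:
            out.append((i, ("high_track", j)))
        elif j < h + q:
            out.append((i, ("detection", j - h)))
        else:
            out.append((i, "terminated"))
    return out
```

A track whose best available similarity and termination probability both fall below τ stays unresolved and would be handed to the candidate target hypothesis set.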
In another embodiment, the candidate target hypothesis set is updated. In multi-target tracking, to overcome the influence of false detections and missed detections, the method merges the detection responses left unmatched by the data association in the t-th frame into the candidate target hypothesis set CTH, where they are kept as potential tracks. Meanwhile, to avoid erroneous tracking, unmatched high-confidence tracks and low-confidence tracks are also added to the CTH set. In addition, to save computation time and space, a candidate target in the CTH set is discarded if it is not associated with any detection response or track over consecutive frames (set to 6 frames). After these additions and deletions of tracks and detection responses, the CTH set of the t-th frame is obtained.
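The add/discard bookkeeping of the CTH set can be sketched as a small container; only the 6-frame discard rule comes from the text, while the id scheme, the payloads and the per-frame interface are assumptions:

```python
class CandidateTargetHypotheses:
    """Candidate target hypothesis (CTH) set: unmatched detections and
    unmatched tracks are kept as potential targets, and a candidate is
    discarded after `max_miss` consecutive unassociated frames (6 in
    the text). The id scheme and payloads are illustrative."""

    def __init__(self, max_miss=6):
        self.max_miss = max_miss
        self.candidates = {}   # id -> payload (box, features, ...)
        self.misses = {}       # id -> consecutive unassociated frames
        self._next_id = 0

    def add(self, payload):
        """Add an unmatched detection response or track to the set."""
        cid = self._next_id
        self._next_id += 1
        self.candidates[cid] = payload
        self.misses[cid] = 0
        return cid

    def end_frame(self, associated_ids):
        """Call once per frame with the ids that were associated; prunes
        candidates that went max_miss frames in a row without a match."""
        for cid in list(self.candidates):
            if cid in associated_ids:
                self.misses[cid] = 0
            else:
                self.misses[cid] += 1
                if self.misses[cid] >= self.max_miss:
                    del self.candidates[cid]
                    del self.misses[cid]
```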
In another embodiment, as shown in fig. 2, P1 is the MOT tracking method of the present method with the KCF-based detection response correction part removed; P2 is the MOT tracking method with the KCF-based association calculation replaced, using the IoU (intersection-over-union) value between the tracking target and the detection response in place of the KCF-based correlation calculation; P3 is the MOT tracking method with the APCE-based occlusion analysis (track confidence) part removed; P4 is the MOT tracking method with the candidate target hypothesis set part removed; and Ours is the tracking method including all steps. Table 1 shows the evaluation results of P1, P2, P3, P4 and the present method on the MOT2015 validation set under the multi-target tracking evaluation indexes: MOTP (Multiple Object Tracking Precision), MOTA (Multiple Object Tracking Accuracy), MT (Mostly-Tracked), ML (Mostly-Lost), FP (False Positive), FN (False Negative) and IDs (ID-Switch). In Table 1, for indexes marked (↑) higher values indicate better performance; for indexes marked (↓) lower values indicate better performance.
Tracker MOTA(%)↑ MOTP(%)↑ MT(%)↑ ML(%)↓ FP↓ FN↓ IDs↓
P1 19.8 73.9 8.5 68.6 3309 15017 265
P2 24.7 73.8 10.3 53.4 3197 13367 134
P3 23.0 73.6 9.8 56.0 3548 13651 147
P4 22.1 73.5 9.8 53.8 3727 14958 138
Ours 28.6 73.9 13.3 52.4 2691 13491 123
TABLE 1
As can be seen from the evaluation results of FIG. 2 and Table 1, every component of the method helps improve the tracking accuracy of the multi-target tracking method: the MOTA of each of the P1 to P4 variants is lower than that of the full method. In P1, which lacks the KCF-based detection response correction, the MOTA index drops markedly and the IDs index rises markedly. In P2, where the KCF-filter-based association calculation is replaced, the MOTA falls and the IDs rise; this further shows that the KCF-based similarity calculation learns the appearance of the tracked target better, and that the proposed KCF-based detection response correction strategy can, to a certain extent, eliminate the missed detections and false detections produced by the detector. The performance of P3 shows that the APCE-based track confidence calculation is an important component of the method: removing it clearly lowers the tracking accuracy MOTA, so the proposed APCE-based track confidence calculation helps improve data association performance. In P4, which lacks the candidate target hypothesis set, the MOTA drops significantly and the IDs rise significantly, further indicating that the candidate target hypothesis set can handle tracking errors and detection response errors to a certain extent and helps improve tracking accuracy.
Therefore, each part proposed by the method, namely the KCF-based association similarity calculation, the KCF-based detection response correction, the two-step data association based on the APCE track confidence, and the candidate target hypothesis set maintained through data association and KCF, helps improve the tracking accuracy of multi-target tracking.
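The KCF-based detection response correction evaluated through P1 merges the detector output with KCF-predicted boxes by IoU-based redundancy elimination; a minimal sketch, assuming (x, y, w, h) boxes and the α = 0.6 threshold stated in the claims (everything else is an assumption):

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, w, h)."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def merge_detections(detector_boxes, kcf_boxes, alpha=0.6):
    """IoU-based redundancy elimination: a KCF prediction overlapping a
    detector response with IoU > alpha is treated as the same target and
    dropped (the detector response is kept); otherwise it survives as an
    extra response, recovering targets the detector missed."""
    merged = list(detector_boxes)
    for kb in kcf_boxes:
        if all(iou(kb, db) <= alpha for db in detector_boxes):
            merged.append(kb)
    return merged
```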
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (9)

1. A multi-target tracking method based on KCF track confidence, comprising the following steps:
s100: establishing a KCF-based filter, and using the KCF-based filter as the incidence relation model for data association to calculate the appearance similarity, shape similarity and motion similarity between N detection responses in the current frame and M target tracking tracks, wherein N and M are integers greater than 1;
s200: correcting the N detection responses with the filter;
s300: calculating the track confidence of each tracked target with a track confidence calculation method based on APCE occlusion analysis, and dividing the M target tracking tracks into high-confidence tracks and low-confidence tracks according to the track confidence;
s400: establishing a candidate target hypothesis set for representing targets lost in the previous frame and target tracks not matched with the N detection responses;
s500: performing data association between adjacent frames on the corrected detection responses, the high-confidence tracks, the low-confidence tracks and the candidate samples in the candidate target hypothesis set according to the incidence relation model.
2. The method according to claim 1, wherein the S100 further comprises:
s101: the established KCF-based filter is a KCF filter trained using only the target samples in the (t-1)-th frame, where t denotes the current frame;
s102: calculating the appearance similarity S_app between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame by means of the KCF-based filter, wherein y_l is the desired output for the training sample f_l, whose discrete Fourier transform is used by the filter, and f_l is the HOG and CN features of the candidate sample;
s103: calculating the shape similarity S_shape between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame as:
S_shape = IoU(x_l, z_l);
s104: calculating the motion similarity S_motion between the tracked target x_l in the (t-1)-th frame and the detection response z_l in the t-th frame as:
S_motion = G(T_pos - Z_pos, Σ)
where G(·) is a Gaussian function with mean 0, and T_pos and Z_pos are the positions of x_l and z_l, respectively.
3. The method of claim 1, wherein the S200 further comprises:
s201: predicting, with the KCF-based filter, the state in the t-th frame of an arbitrary target in the (t-1)-th frame, wherein the predicted state comprises the position information of the target predicted by the KCF-based filter together with the width and height of the predicted target bounding box, the prediction is made from the position of the j-th track in the (t-1)-th frame, η is a predefined threshold, and CTH is the candidate target hypothesis set;
s202: when the maximum response value of a predicted target is greater than the predefined threshold η, adding the predicted target to the predicted target set; otherwise, adding the target to the CTH candidate target hypothesis set;
s203: assuming that all predicted targets in the t-th frame satisfying the above condition form the predicted target set, obtaining the detection response of the t-th frame by combining this set with the detection responses provided by the target detector, where M represents the number of predicted targets, and IoU-based redundancy elimination is performed between each detector response and each predicted target state by judging whether their IoU value is greater than a predefined threshold α;
s204: when the IoU value between a detector response and a predicted target state is greater than the predefined threshold α, the two represent the same target and only the detection response is retained; otherwise, the two represent detection responses of different targets and both are retained; finally, the KCF-corrected detection response of the t-th frame is obtained.
4. The method of claim 3, wherein the predefined threshold η is 0.7.
5. The method of claim 3, wherein the predefined threshold α is 0.6.
6. The method of claim 1, wherein the S300 further comprises:
the track confidence Conf(T_j) is defined as a function of the track length, the association similarity and the occlusion degree, wherein L_j is the track length of T_j, C(T_j, z_l) is the association similarity value between track T_j and detection response z_l, O_APCE is the occlusion degree, and the start frame and the end frame of track T_j are used in the definition.
7. The method of claim 6, wherein when Conf(T_j) > 0.5, the corresponding track is a high-confidence track; otherwise, it is a low-confidence track.
8. The method of claim 1, wherein the S500 further comprises:
the data association between adjacent frames performed for the high-confidence tracks is specifically as follows:
the incidence relation matrix is constructed as:
S = [s_ij]
where s_ij is the association similarity value between track T_j and detection response z_i, computed from the appearance similarity S_app, the shape similarity S_shape and the motion similarity S_motion, and the values s_ij form the incidence relation matrix S; assume that in the t-th frame there are k high-confidence tracks T_high, that the detection response set contains n detection responses, and that the candidate target hypothesis set contains m detection responses;
according to the constructed incidence relation matrix, the association matching between the high-confidence tracks and the detection responses is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.
9. The method of claim 1, wherein the S500 further comprises:
the data association between adjacent frames performed for the low-confidence tracks is specifically as follows:
after the high-confidence tracks have been associated with the detection responses, assume that n' detection responses in the detection set remain unassociated, that m' detection responses and q' tracks in the candidate target hypothesis set remain unassociated, and that k' high-confidence tracks T_high remain unassociated, the unassociated detection responses forming the candidate detection responses of the t-th frame; assume further that l low-confidence tracks exist in the t-th frame, so that there are h unmatched tracks with h = q' + k' and q detection responses with q = n' + m'; the incidence matrix X used in the low-confidence track data association is then formed from A, B, D and the threshold τ, where A = [a_ij] and a_ij represents the association similarity value between the i-th low-confidence track and the j-th unmatched high-confidence track; B = [b_ij] and b_ij represents the association similarity value between the i-th low-confidence track and the j-th detection response c_j; D = diag[d_1, ..., d_l] and d_i is the termination probability of the i-th low-confidence track; and τ is a predefined threshold;
according to the constructed incidence relation matrix, the association problem of the low-confidence tracks is solved by the Hungarian algorithm, and the confidence value and state of each track are then updated according to the association result.
CN201911002819.0A 2019-10-21 2019-10-21 Multi-target tracking method based on KCF track confidence Active CN110751096B (en)

Publications (2)

Publication Number Publication Date
CN110751096A true CN110751096A (en) 2020-02-04
CN110751096B CN110751096B (en) 2022-02-22




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant