CN114913198A - Multi-target tracking method and device, storage medium and terminal - Google Patents

Info

Publication number
CN114913198A
Authority
CN
China
Prior art keywords
target
tracking
result
targets
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110127063.3A
Other languages
Chinese (zh)
Inventor
张新钰
田港林
郭世纯
毕铭泽
刘华平
吴新刚
Current Assignee
Tsinghua University
Toyota Motor Corp
Original Assignee
Tsinghua University
Toyota Motor Corp
Priority date
Filing date
Publication date
Application filed by Tsinghua University, Toyota Motor Corp filed Critical Tsinghua University
Priority to CN202110127063.3A
Publication of CN114913198A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/251: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving models
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30241: Trajectory

Abstract

A multi-target tracking method and device, a storage medium and a terminal are provided. The method comprises the following steps: during the tracking of a plurality of targets, acquiring a tracking result of a first target among the plurality of targets at the current moment; if the tracking result is an unmatched track, determining the cluster to which the first target belongs according to the historical detection results of the targets, wherein targets belonging to the same cluster have the same object attribute and similar motion attributes; and recovering the track of the first target at the current moment according to the detection results, at the current moment, of the targets in the cluster other than the first target and the relative motion relationships among the targets in the cluster. The method and the device can predict and track the trajectory of an occluded object, which helps ensure the completeness of target tracking.

Description

Multi-target tracking method and device, storage medium and terminal
Technical Field
The invention relates to the technical field of multi-target tracking, in particular to a multi-target tracking method and device, a storage medium and a terminal.
Background
The purpose of Multiple Object Tracking (MOT) is to estimate the trajectories of objects in a sequence of images or a video while maintaining their identities. Recently, owing to progress in target detection technology, Tracking-By-Detection (TBD) techniques have enabled the rapid development of multi-target tracking. In detection-based tracking techniques, target objects are first detected, and a tracking algorithm then estimates the target trajectories from the detection results through a data association algorithm.
At the present stage, many methods achieve good multi-target tracking results, but most of them keep improving the target detector so that the appearance features it extracts are more accurate, thereby reducing false and missed detections.
The prior art rarely addresses trajectory prediction and recovery for occluded targets. Most existing tracking algorithms cannot perform trajectory prediction and tracking on an occluded object because its appearance features are poor, so the object's motion trajectory cannot be tracked completely.
Disclosure of Invention
The invention solves the technical problem of how to realize trajectory prediction and tracking of an occluded object so as to track the object's motion trajectory completely.
To solve the above technical problem, an embodiment of the present invention provides a multi-target tracking method, including: during the tracking of a plurality of targets, acquiring a tracking result of a first target in the plurality of targets at the current moment; if the tracking result is unmatched tracking, determining a cluster to which the first target belongs according to the historical detection results of the targets, wherein the targets belonging to the same cluster have the same object attribute and similar motion attribute; and recovering the track of the first target at the current moment according to the detection result of the targets except the first target in the cluster at the current moment and the relative motion relation among the targets in the cluster.
Optionally, the obtaining a tracking result of a first target in the multiple targets at the current time includes: predicting the prediction result of the first target at the current moment according to the historical detection result of the first target; and performing target matching on the prediction result and an object detection result obtained by detection at the current moment to obtain a tracking result of the first target.
Optionally, for any one of the historical detection result, the prediction result, and the object detection result, the detection result is characterized based on detection frames, the detection frames correspond to dictionaries in a one-to-one manner, and the dictionaries include at least one of the following contents: tracking identity information of the object; the position, the size, the movement speed and the size change speed of the detection frame; a tracking state of the trajectory; number of frames the track is maintained.
Optionally, the performing target matching on the prediction result and the object detection result obtained by detecting at the current moment to obtain the tracking result of the first target includes: and when the tracking state of the track in the historical detection result of the first target is determined tracking, performing cascade matching on the prediction result and an object detection result obtained by detection at the current moment to obtain a tracking result of the first target.
Optionally, the cascade matching refers to performing target matching by using data association algorithms of multiple cost matrices, where the data association algorithms of the multiple cost matrices at least include a first data association algorithm and a second data association algorithm; the step of performing cascade matching on the prediction result and an object detection result obtained by detection at the current moment to obtain a tracking result comprises: performing target matching on the prediction result and the object detection result by adopting the first data association algorithm to obtain a first matching result; and if the first matching result is not a matching pair, performing target matching on the prediction result and the object detection result by adopting the second data association algorithm to obtain a second matching result, and determining the second matching result as the tracking result.
Optionally, the performing target matching on the prediction result and the object detection result obtained by detection at the current moment to obtain the tracking result of the first target includes: when the tracking state of the track in the historical detection result of the first target is an uncertain tracking state, performing target matching on the prediction result and the object detection result obtained by detection at the current moment by using a single data association algorithm, to obtain the tracking result of the first target.
Optionally, the data association algorithm of the multiple cost matrices includes: IoU match and SIOA match.
Optionally, the determining, according to the historical detection results of the multiple targets, the cluster to which the first target belongs includes: determining the object attribute and the motion attribute of each target according to the historical detection result of each target; searching for a target which is close to the motion attribute of the first target, has the same object attribute and has a distance with the first target smaller than a preset threshold value from the plurality of targets; and if the searching is successful, determining that the searched target and the first target form the cluster.
Optionally, the object attribute includes an identity category of the target; the motion attributes include: a speed of movement of the target; a direction of motion of the target; a trajectory of the object.
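For illustration only (and not as part of the claims), the cluster-formation rule described above can be sketched in Python roughly as follows; the record layout, field names and thresholds are all assumptions made for the sketch, not values given by the patent:

```python
import math

def form_cluster(first, others, dist_thresh=5.0, speed_thresh=1.0, angle_thresh=0.3):
    """Group targets that share the first target's identity category, move with
    similar speed and direction, and lie within a distance threshold.
    Each target is a dict with 'cls', 'pos' (x, y) and 'vel' (vx, vy)."""
    cluster = [first]
    for t in others:
        if t["cls"] != first["cls"]:
            continue  # object attribute (identity category) must match
        dx = t["pos"][0] - first["pos"][0]
        dy = t["pos"][1] - first["pos"][1]
        if math.hypot(dx, dy) >= dist_thresh:
            continue  # farther than the preset distance threshold
        sv, tv = first["vel"], t["vel"]
        if abs(math.hypot(*sv) - math.hypot(*tv)) >= speed_thresh:
            continue  # movement speeds differ too much
        ang = abs(math.atan2(sv[1], sv[0]) - math.atan2(tv[1], tv[0]))
        if min(ang, 2 * math.pi - ang) >= angle_thresh:
            continue  # movement directions differ too much
        cluster.append(t)
    return cluster if len(cluster) > 1 else None  # None: no cluster formed
```

If `form_cluster` returns `None`, the method falls back to per-target motion-model prediction, as described in the next optional step.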
Optionally, the method further includes: if the first target cannot form a cluster with other targets among the plurality of targets, predicting the track of the first target at the current moment according to a uniformly accelerated (constant-acceleration) motion model.
Optionally, the method further includes: and if the first target reappears within the preset matching time limit and is matched with the recovered or predicted track of the first target at the current moment, determining the tracking state of the track in the historical detection result of the first target as uncertain tracking, and performing tracking at the next moment.
Optionally, the method further includes: and if the tracking result of the first target is a matching pair, updating the prediction result of the first target at the current moment through improved Kalman filtering.
Optionally, before determining the cluster to which the first target belongs according to the historical detection results of the plurality of targets, the method further includes: and carrying out smoothing processing on the historical detection result through improved Kalman filtering.
To solve the foregoing technical problem, an embodiment of the present invention further provides a multi-target tracking apparatus, including: an acquisition module, used for acquiring a tracking result of a first target among a plurality of targets at the current moment during the tracking of the plurality of targets; a determining module, used for determining the cluster to which the first target belongs according to the historical detection results of the multiple targets if the tracking result is an unmatched track, wherein targets belonging to the same cluster have the same object attribute and similar motion attributes; and a recovery module, used for recovering the track of the first target at the current moment according to the detection results, at the current moment, of the targets in the cluster other than the first target and the relative motion relationships among the targets in the cluster.
To solve the above technical problem, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, and the computer program executes the steps of the above method when being executed by a processor.
In order to solve the above technical problem, an embodiment of the present invention further provides a terminal, including a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor executes the steps of the method when running the computer program.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a multi-target tracking method, which comprises the following steps: during the tracking of a plurality of targets, acquiring a tracking result of a first target among the plurality of targets at the current moment; if the tracking result is an unmatched track, determining the cluster to which the first target belongs according to the historical detection results of the targets, wherein targets belonging to the same cluster have the same object attribute and similar motion attributes; and recovering the track of the first target at the current moment according to the detection results, at the current moment, of the targets in the cluster other than the first target and the relative motion relationships among the targets in the cluster. Therefore, with this embodiment, the trajectory of an occluded object can be predicted and tracked, which helps ensure the completeness of target tracking. Specifically, when the tracking result is an unmatched track, it indicates that although the position of the first target at the current moment was predicted, no detection result (i.e., detection frame) at the current moment matched it; that is, an occlusion or a loss of the detection frame occurred. Based on this embodiment, when occlusion or detection-frame loss occurs in a multi-target motion scene, the targets are clustered in view of the relative motion relationships among them, and the trajectories of occluded targets within a cluster are then recovered by sharing features such as speed and acceleration. Thus, when the first target is tracked again later, failures to match its identity information can be effectively reduced.
Further, if the first target cannot form a cluster with other targets among the plurality of targets, the track of the first target at the current moment is predicted according to a uniformly accelerated (constant-acceleration) motion model. Thus, for an individual that cannot form a cluster, its motion state during occlusion is predicted by the constant-acceleration model to realize trajectory prediction, which further reduces failures to match identity information.
Further, when the tracking state of the track in the historical detection result of the first target is tracking in a determined state, cascade matching is performed on the prediction result and the object detection result detected at the current moment to obtain the tracking result of the first target. This effectively improves matching accuracy.
Further, if the track of the first target at the current moment is still within the preset matching time limit, the tracking state of the track in the historical detection result of the first target is set to tracking in an uncertain state, and matching at the next moment is performed. Thus, within the preset matching time limit, the predicted or recovered track of the first target at the current moment is set as an unconfirmed track and enters the matching at the next moment, so that the first target's track is maintained continuously while the target is occluded. After the occlusion of the first target ends, identity-information matching can be performed quickly based on its predicted or recovered track, so that the re-detected object is matched to the first target's historical track as soon as possible.
Furthermore, with this embodiment, multiple targets around an unmanned vehicle can be tracked continuously using two-dimensional visual information, three-dimensional detection results, the relative motion relationships among the tracked targets, and a motion model, thereby solving problems of multi-target tracking in unmanned-driving scenes such as trajectory interruption and identity switches caused by occlusion between targets. Specifically, when a vehicle or person around the unmanned vehicle is occluded, the method and device can recover or predict the occluded target's trajectory to ensure that tracking of the targets around the unmanned vehicle is always complete, making the unmanned vehicle safer and more reliable.
Drawings
FIG. 1 is a flow chart of a multi-target tracking method according to an embodiment of the present invention;
FIG. 2 is a flowchart of one embodiment of step S101 of FIG. 1;
FIG. 3 is a flow chart of predicting the position and size of the detection box at the next time by improved Kalman filtering according to an embodiment of the present invention;
FIG. 4 is a flow chart of an occlusion recovery and trajectory prediction process according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a multi-target tracking device according to an embodiment of the present invention.
Detailed Description
As described in the background, most existing tracking algorithms cannot predict and track the trajectory of an occluded object because its appearance features are poor, so the object's motion trajectory cannot be tracked completely.
To solve the above technical problem, an embodiment of the present invention provides a multi-target tracking method, including: during the period of tracking a plurality of targets, acquiring a tracking result of a first target in the plurality of targets at the current moment; if the tracking result is unmatched tracking, determining a cluster to which the first target belongs according to the historical detection results of the targets, wherein the targets belonging to the same cluster have the same object attribute and similar motion attribute; and recovering the track of the first target at the current moment according to the detection result of the targets except the first target in the cluster at the current moment and the relative motion relationship among the targets in the cluster.
Therefore, with this embodiment, the trajectory of an occluded object can be predicted and tracked, which helps ensure the completeness of target tracking. Specifically, when the tracking result is an unmatched track, it indicates that although the position of the first target at the current moment was predicted, no detection result (i.e., detection frame) at the current moment matched it; that is, an occlusion or a loss of the detection frame occurred. Based on this embodiment, when occlusion or detection-frame loss occurs in a multi-target motion scene, the targets are clustered in view of the relative motion relationships among them, and the trajectories of occluded targets within a cluster are then recovered by sharing features such as speed and acceleration. Thus, when the first target is tracked again later, failures to match its identity information can be effectively reduced.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Fig. 1 is a flowchart of a multi-target tracking method according to an embodiment of the present invention.
The embodiment can be applied to the unmanned scene, for example, the multi-target tracking technology based on a monocular camera and a three-dimensional laser radar is adopted to monitor the surrounding environment of the unmanned automobile so as to obtain the motion trail of objects around the unmanned automobile. The embodiment can be integrated in an on-board controller of the unmanned automobile to realize real-time target tracking of objects around the unmanned automobile.
With this embodiment, multiple targets around an unmanned vehicle can be tracked continuously using two-dimensional visual information, three-dimensional detection results, the relative motion relationships among the tracked targets, and a motion model, thereby solving problems of multi-target tracking in unmanned-driving scenes such as trajectory interruption and identity switches caused by occlusion between targets. Specifically, when a vehicle or person around the unmanned vehicle is occluded, the method and device can recover or predict the occluded target's trajectory to ensure that tracking of the targets around the unmanned vehicle is always complete, making the unmanned vehicle safer and more reliable.
Referring to fig. 1, the multi-target tracking method according to this embodiment may include the following steps:
step S101, during tracking a plurality of targets, acquiring a tracking result of a first target in the plurality of targets at the current moment;
step S102, if the tracking result is unmatched tracking, determining a cluster to which the first target belongs according to the historical detection results of the multiple targets, wherein the multiple targets belonging to the same cluster have the same object attribute and similar motion attribute;
step S103, restoring the track of the first target at the current moment according to the detection result of the targets in the cluster except the first target at the current moment and the relative motion relationship among the targets in the cluster.
In step S101, a tracking result of each target at the current time is obtained, where each target may be used as the first target to execute a subsequent scheme.
Further, the target may be an object that is perceivable around the unmanned vehicle. The objects may include people, cars, animals, plants, traffic lights, and other traffic participants.
Further, the detection result may be characterized based on the detection box. The detection boxes are in one-to-one correspondence with dictionaries, and each dictionary comprises at least one of the following: identity information (ID) of the tracked object (i.e., the target); the position, size, movement speed and size-change speed of the detection frame (bbox); the tracking state of the track (is_committed_or_not); and the number of frames the track has been maintained (age). The correspondence between dictionaries and detection frames can be stored in a database and retrieved when needed.
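For illustration, the per-detection-frame dictionary described above might be sketched as the following record; the field names mirror the translated keys (ID, bbox, is_committed_or_not, age), while the exact structure and the three-frame confirmation rule used in `mark_hit` are assumptions drawn from this description:

```python
from dataclasses import dataclass

@dataclass
class Track:
    track_id: int                       # identity information (ID) of the object
    bbox: tuple                         # position (cx, cy, cz) and size (l, w, h)
    velocity: tuple = (0.0, 0.0, 0.0)   # movement speed of the detection frame
    size_rate: tuple = (0.0, 0.0, 0.0)  # size-change speed of the detection frame
    confirmed: bool = False             # tracking state of the trajectory
    age: int = 0                        # number of frames the track is maintained

    def mark_hit(self):
        """Called on each successful match; the track is confirmed (tracking in
        a determined state) once the same target is seen in 3 consecutive frames."""
        self.age += 1
        if self.age >= 3:
            self.confirmed = True
```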
For example, the position of the detection frame may be represented by the center point of the detection frame.
For example, the three-dimensional boundary information of the object may be described by the size of the detection box.
For example, the tracking state of a trajectory includes tracking in a determined state and tracking in an uncertain state. When the same target is detected in three consecutive frames, the target is associated with the detection frame's dictionary, and the tracking state of the track is marked as tracking in a determined state.
Further, the detection result may be a three-dimensional detection result, or both three-dimensional and two-dimensional detection results. The detection result may be obtained from a target detector whose input is environmental data around the unmanned vehicle collected by the monocular camera and/or radar.
Further, the tracking result may refer to a matching result. During tracking, the tracking result for each target at each moment can be classified as an unmatched detection (unmatched detections), an unmatched track (unmatched tracks), or a matched pair (matches, also called a successful match). An unmatched track indicates that although the target's position at the next moment was predicted, no detection frame detected at that moment matched it; an unmatched detection indicates a detection that matched no existing track, which means a new target to be tracked has appeared.
In one implementation, referring to fig. 2, the step S101 may include the following steps:
step S1011, predicting the prediction result of the first target at the current moment according to the historical detection result of the first target;
step S1012, performing target matching between the prediction result and the object detection result obtained by the current time detection to obtain the tracking result of the first target.
Specifically, in step S1011, the position and size of the detection frame at the next moment (i.e., moment a) can be predicted from the position and size of the detection frame at moment (a-1) by the improved Kalman filter (also referred to as the improved three-dimensional Kalman filter).
Further, the previous time, the current time, and the next time may be determined according to the frame rate of the monocular camera, and the detection result obtained based on the single frame image corresponds to the detection result at a certain time.
Referring to fig. 3, a specific process for predicting the position and size of the detection box at the next time through the modified kalman filter may include the following steps:
step a, setting parameters. The parameters may include a state transition model F, an observation model H, an observation noise matrix R, and a process noise matrix Q.
For example, for the constant-velocity model, the state transition model F can be expressed in block-matrix form as in equation (1):

F = [[I, dt·I], [0, I]]   (1)

dt in equation (1) represents the frame interval between the previous and next states, and in this example dt is equal to 1.
For example, the observation model H can be expressed in block-matrix form as in equation (2), since only the positions and sizes are observed, not their rates of change:

H = [I, 0]   (2)
for example, the observation noise matrix R and the process noise matrix Q may be adjusted according to the actual application.
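As an illustrative sketch of step a, the parameter setup might look as follows; the patent gives F and H only as figures, so the constant-velocity block structure, the state dimension `n`, and the noise scales here are assumptions:

```python
import numpy as np

def make_kalman_params(n=7, dt=1.0, r_scale=1.0, q_scale=0.01):
    """Build F, H, R, Q for a constant-velocity model whose state stacks the
    n observed components (position and size) with their n rates of change."""
    I = np.eye(n)
    # State transition F: each observed component x obeys x <- x + dt * x_rate.
    F = np.block([[I, dt * I], [np.zeros((n, n)), I]])
    # Observation H: only positions/sizes are measured, not their rates.
    H = np.hstack([I, np.zeros((n, n))])
    R = r_scale * np.eye(n)      # observation noise, tuned per application
    Q = q_scale * np.eye(2 * n)  # process noise, tuned per application
    return F, H, R, Q
```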
Next, step b is performed, initializing the system state X0 and covariance matrix P0.
Next, step c is executed to predict the system, and the prediction process may include two aspects:
First, based on equation (3), the system state value at time k-1, X̂(k-1), is used to predict the system state value at time k, X̂(k|k-1) (i.e., the predicted value):

X̂(k|k-1) = F_k · X̂(k-1) + B_k   (3)
wherein, F k A state transition model F at the moment k; b is k Is a transfer noise matrix.
In this example, the system status value corresponds to the detection result, i.e., the position and size of the detection frame.
Second, based on equation (4), the covariance of the error at time k-1
Figure BDA0002923837740000095
Predicting error covariance at time k
Figure BDA0002923837740000096
Figure BDA0002923837740000097
Wherein, F k A state transition model F at time k; q k A process noise matrix Q at time k;
Figure BDA0002923837740000098
is F k The transposing of (1).
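The prediction step c, i.e. equations (3) and (4), can be sketched as follows; the function name is illustrative, and the transfer-noise term B is omitted (taken as zero) for simplicity:

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Propagate the state estimate x and covariance P one step forward."""
    x_pred = F @ x            # equation (3): X(k|k-1) = F_k X(k-1)
    P_pred = F @ P @ F.T + Q  # equation (4): P(k|k-1) = F_k P(k-1) F_k^T + Q_k
    return x_pred, P_pred
```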
Next, step d is executed to determine whether the system state value at time k predicted in step c successfully matches the system state value at time k actually detected. That is, the content described in step S1012 is executed.
In step S1012, the object detection result detected at the current moment (i.e., the actually detected system state value at time k) and the prediction result (i.e., the system state value at time k predicted in step c) may be processed differently according to the tracking state of the trajectory.
For example, when the tracking state of the track in the historical detection result of the first target is a determined state, the prediction result and the object detection result detected at the current moment are cascade-matched to obtain the tracking result of the first target. Cascade matching refers to target matching using data association algorithms with multiple cost matrices; one model adopts several evaluation methods, and the different cost matrices can be understood as different evaluation methods.
For another example, when the tracking state of the track in the historical detection result of the first target is the uncertain tracking, a single data association algorithm is adopted to perform target matching on the prediction result and the object detection result detected at the current time, so as to obtain the tracking result of the first target.
The data association algorithms may include Intersection-over-Union (IoU) matching and Sum-of-Intersection-Over-target-Area (SIOA) matching. Further, the data association algorithms may also include appearance matching (i.e., visual matching).
The data correlation algorithm of the multiple cost matrixes adopted by the cascade matching at least comprises a first data correlation algorithm and a second data correlation algorithm. Correspondingly, firstly, the prediction result and the object detection result are subjected to target matching by adopting the first data association algorithm to obtain a first matching result; and if the first matching result is not a matching pair, performing target matching on the prediction result and the object detection result by adopting the second data association algorithm to obtain a second matching result, and determining the second matching result as the tracking result.
For example, for tracking of uncertain states, the predicted result and the current detection result can be directly subjected to SIOA matching.
For example, for tracking of a certain state, the IOU matching is performed based on formula (5) to obtain a first matching result:
Figure BDA0002923837740000101
wherein IoU is the first matching result; area (a) is a region of the prediction result; area (B) is a region of the detection result.
The first matching result can be divided into matched pairs (matches_1), unmatched detections (unmatched_detections_1) and unmatched tracks (unmatched_tracks_1).
SIOA matching is then performed on the unmatched detections (unmatched_detections_1) and unmatched tracks (unmatched_tracks_1) to obtain a second matching result, following equation (6):

SIOA = |Area(A) ∩ Area(B)| / |Area(A)| + |Area(A) ∩ Area(B)| / |Area(B)|   (6)

where SIOA is the second matching result.
The second matching result can likewise be divided into matched pairs (matches), unmatched detections (unmatched_detections) and unmatched tracks (unmatched_tracks). The second matching result is the final tracking result at the current moment.
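The two cost measures above can be sketched for axis-aligned 2-D boxes (x1, y1, x2, y2) as follows; the SIOA form follows the reconstructed equation (6) and should be treated as an assumption, since the patent gives both formulas only as figures:

```python
def _inter(a, b):
    """Intersection area of two boxes given as (x1, y1, x2, y2)."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

def _area(a):
    return (a[2] - a[0]) * (a[3] - a[1])

def iou(a, b):
    """Equation (5): intersection over the union of the two box regions."""
    i = _inter(a, b)
    return i / (_area(a) + _area(b) - i) if i else 0.0

def sioa(a, b):
    """Equation (6): sum of the intersection over each box's own area."""
    i = _inter(a, b)
    return i / _area(a) + i / _area(b) if i else 0.0
```

Note that SIOA stays large when a small box is mostly contained in a big one, which makes it more forgiving than IoU for partially occluded detections.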
Further, an affirmative judgment in step d, i.e., a successful match, refers to the combined set of matched pairs obtained through the IoU matching and the SIOA matching.
The inventors of the present application found through analysis that the prior art usually performs target matching by appearance matching, whose accuracy depends heavily on the imaging quality of the monocular camera; appearance matching is easily made inaccurate by the environment, and its computational complexity is high. In view of this problem, the present embodiment adopts cascade matching, which achieves highly accurate target matching without acquiring appearance features of the target such as color information and texture information.
For tracking in an uncertain state, SIOA matching is used directly; if the matching degree is smaller than a preset threshold, the candidate is discarded, which reduces the amount of computation.
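The two-stage cascade described above (match with one cost, then re-try the leftovers with a second cost) can be sketched as follows. For brevity a greedy assignment stands in for the Hungarian algorithm that data association usually employs; the cost functions, thresholds and names are all illustrative assumptions, not the patent's code:

```python
def greedy_match(tracks, dets, cost_fn, thresh):
    """Greedily pair tracks with detections, highest cost score first."""
    pairs, used_t, used_d = [], set(), set()
    scored = sorted(
        ((cost_fn(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(dets)),
        reverse=True,
    )
    for score, ti, di in scored:
        if score < thresh or ti in used_t or di in used_d:
            continue
        pairs.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_t = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_d = [i for i in range(len(dets)) if i not in used_d]
    return pairs, unmatched_t, unmatched_d

def cascade_match(tracks, dets, first_cost, second_cost, t1=0.3, t2=0.5):
    """Stage 1 with first_cost (e.g. IoU); stage 2 with second_cost (e.g. SIOA)
    runs only on what stage 1 left unmatched."""
    m1, ut, ud = greedy_match(tracks, dets, first_cost, t1)
    m2, ut2, ud2 = greedy_match([tracks[i] for i in ut],
                                [dets[j] for j in ud], second_cost, t2)
    m2 = [(ut[a], ud[b]) for a, b in m2]  # map back to original indices
    return m1 + m2, [ut[a] for a in ut2], [ud[b] for b in ud2]
```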
In one variation, the cascade matching may be three stages, such as appearance matching, IOU matching, and SIOA matching in sequence.
Further, when the judgment result of step d is affirmative, the update process of step f can be executed, which comprises the following three stages:
First, the Kalman gain is calculated based on equation (7):

K_k = P_k|k-1 · H_k^T · (H_k · P_k|k-1 · H_k^T + R_k)^(-1)    (7)

where K_k is the Kalman gain, used to balance the weights of the predicted value and the observed value (i.e., the detected value); H_k is the observation model at time k; H_k^T is the transpose of H_k; R_k is the observation noise matrix at time k; and P_k|k-1 is the predicted covariance matrix at time k.
Secondly, based on equation (8), the predicted value at time k is corrected and updated according to the observed value at time k:

x̂_k|k = x̂_k|k-1 + K_k · (z_k - H_k · x̂_k|k-1)    (8)

where x̂_k|k is the updated estimate at time k; x̂_k|k-1 is the predicted value at time k before updating; and z_k is the observed value at time k, i.e., the detection result.
Thirdly, the covariance matrix is updated based on equation (9):

P_k|k = (I - K_k · H_k) · P_k|k-1    (9)

where P_k|k is the updated covariance matrix at time k; P_k|k-1 is the covariance matrix at time k before updating; and I is the identity matrix.
When step c is next performed, the predicted value and covariance matrix updated by equations (8) and (9) are used; the update process thus corrects the prediction.
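The three update stages, equations (7) to (9), amount to one standard Kalman correction step. The sketch below is the textbook form under that reading; it does not reproduce whatever the patent's "improved" three-dimensional filter adds on top:

```python
# Minimal, textbook Kalman correction step matching equations (7)-(9).
import numpy as np

def kalman_update(x_pred, P_pred, z, H, R):
    S = H @ P_pred @ H.T + R                        # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)             # (7) Kalman gain
    x_upd = x_pred + K @ (z - H @ x_pred)           # (8) corrected state
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred  # (9) covariance update
    return x_upd, P_upd
```

With equal prediction and observation uncertainty, the gain is 0.5 and the estimate lands halfway between prediction and detection, which is exactly the weighting role the text assigns to K.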
Further, when the determination result of step d is negative, a smoothing process of step e may be performed, the smoothing process including the following three stages:
First, the Kalman smoothing matrix is calculated based on equation (10):

J_k = P_k|k · F_k^T · (P_k+1|k)^(-1)    (10)

where J_k is the Kalman smoothing matrix; F_k is the state transition model; and P_k+1|k is the predicted covariance matrix at time k+1.
Secondly, based on equation (11), the updated value at time k is corrected smoothly according to the smoothed value at time k+1, that is, the backward-propagated Kalman state is calculated:

x̂_k|N = x̂_k|k + J_k · (x̂_k+1|N - x̂_k+1|k)    (11)

where x̂_k|N is the smoothed value at time k; x̂_k+1|N is the smoothed value at time k+1; and x̂_k+1|k is the predicted value at time k+1.
Thirdly, the covariance matrix at time k is updated based on equation (12), that is, the covariance of the backward-propagated Kalman error is calculated:

P_k|N = P_k|k + J_k · (P_k+1|N - P_k+1|k) · J_k^T    (12)

where P_k|N is the smoothed covariance matrix at time k; P_k+1|N is the smoothed covariance matrix at time k+1; and J_k^T is the transpose of J_k.
The next time step c is performed, the updated prediction value and covariance matrix based on equations (11) and (12) are used.
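The smoothing stages, equations (10) to (12), correspond to one backward step of a Rauch-Tung-Striebel smoother. A minimal sketch under that assumption, with F as the state transition model used by the prediction step:

```python
# One backward Rauch-Tung-Striebel smoothing step matching equations
# (10)-(12); F is assumed to be the prediction-step transition model.
import numpy as np

def rts_smooth_step(x_filt, P_filt, x_pred_next, P_pred_next,
                    x_smooth_next, P_smooth_next, F):
    J = P_filt @ F.T @ np.linalg.inv(P_pred_next)           # (10) smoother gain
    x_s = x_filt + J @ (x_smooth_next - x_pred_next)        # (11) smoothed state
    P_s = P_filt + J @ (P_smooth_next - P_pred_next) @ J.T  # (12) smoothed cov.
    return x_s, P_s
```

Running this step backward over the stored history yields the smoothed detection results that the occlusion-recovery stage consumes.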
In a specific implementation, if the tracking result of the first target at the current time obtained in step S101 is a matching pair, the updating process shown in step f in fig. 3 may be directly performed, so as to update the prediction result of the first target at the current time through the improved three-dimensional kalman filter, and then enter the matching at the next time.
In a specific implementation, if the tracking result of the first target at the current time obtained in step S101 is an unmatched tracking, occlusion recovery and trajectory prediction are performed. Wherein the occlusion recovery corresponds to step S102 and step S103.
Specifically, referring to fig. 4, the occlusion recovery and trajectory prediction process may include the steps of:
step S1021, smoothing the historical detection result through improved Kalman filtering;
step S1022, determining the object attribute and the motion attribute of each target according to the historical detection result of each target;
step S1023, searching for a target which has the same motion attribute and object attribute as the first target and has a distance smaller than a preset threshold value with the first target from the plurality of targets;
If the search in step S1023 is successful, step S1024 is executed: the searched target and the first target are determined to form a cluster, and then step S103 in fig. 1 is executed;
if the search is not successful in step S1023, that is, the first object cannot form a cluster with other objects in the plurality of objects, step S1025 is executed to predict the trajectory of the first object at the current time according to the uniform variable speed motion model.
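One plausible reading of the cluster-based recovery in step S103 is to shift the occluded first target's last known position by the mean displacement of the cluster's still-visible members, since cluster members share speed and direction. The sketch below is an assumption about that relative-motion relationship, not the patent's exact procedure:

```python
# Assumed reading of step S103: recover an occluded target's position by
# applying the mean displacement of the visible cluster members to its
# last known position.

def recover_occluded_position(last_pos, cluster_prev, cluster_now):
    n = len(cluster_now)
    dx = sum(c[0] - p[0] for p, c in zip(cluster_prev, cluster_now)) / n
    dy = sum(c[1] - p[1] for p, c in zip(cluster_prev, cluster_now)) / n
    return (last_pos[0] + dx, last_pos[1] + dy)
```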
Further, the smoothing process in step S1021 may refer to the related content of step e in fig. 3. The smoothing operation eliminates cases where the speed, position, and size of a detection frame are incorrect due to low performance or instability of the target detector. Step S1021 can thus serve as a preprocessing step that improves overall computational efficiency.
Further, the history detection result of each target used in step S1022 may be stored in a database in advance. For example, the detection results of the targets acquired at the time a-1 are stored in advance, and when a certain target is found to be occluded at the time a, the stored data is called to determine whether a cluster is formed. This can improve the calculation efficiency.
Further, the object attribute may include an identity category of the target; the motion attributes include: a speed of movement of the target; a direction of motion of the target; a trajectory of the object.
For example, the identity category of the target includes a person and a vehicle, and can be determined according to the identity information of the tracked object in the dictionary corresponding to the detection box.
For example, in step S1023, whether the detection frames corrected in step S1021 can form a cluster may be judged based on their size, speed, and position. Specifically, among the detection frames of all detected targets, those that share the same category, have similar sizes, lie within a preset distance threshold of each other, and move at similar speeds in the same direction can be grouped into one cluster. "Same category" means a person can cluster with a person and a vehicle with a vehicle, but a person cannot cluster with a vehicle.
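The pairwise clustering test of step S1023 can be sketched as below. The concrete thresholds (`size_tol`, `speed_tol`, `max_dist`) and the dictionary field names are illustrative assumptions, not values from the patent:

```python
# Sketch of the step S1023 clustering criteria; thresholds and the field
# names ("cls", "center", "size", "vel") are assumptions.
import math

def can_join_cluster(a, b, max_dist, size_tol=0.3, speed_tol=1.0):
    if a["cls"] != b["cls"]:        # person-person or vehicle-vehicle only
        return False
    if math.dist(a["center"], b["center"]) > max_dist:
        return False
    if abs(a["size"] - b["size"]) / max(a["size"], b["size"]) > size_tol:
        return False
    if abs(math.hypot(*a["vel"]) - math.hypot(*b["vel"])) > speed_tol:
        return False
    # Same direction: velocity vectors point the same way.
    return a["vel"][0] * b["vel"][0] + a["vel"][1] * b["vel"][1] > 0
```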
Further, the advantage of the uniform variable speed model is its high computational efficiency, which matters because unmanned driving scenarios require fast response and real-time performance.
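The uniform variable speed (constant acceleration) model reduces to the closed-form kinematic equations p' = p + v·dt + a·dt²/2 and v' = v + a·dt, which is why it is cheap to evaluate. A minimal sketch:

```python
# Constant-acceleration ("uniform variable speed") prediction for one
# time step: p' = p + v*dt + a*dt^2/2, v' = v + a*dt.

def predict_uniform_acceleration(pos, vel, acc, dt):
    new_pos = tuple(p + v * dt + 0.5 * a * dt * dt
                    for p, v, a in zip(pos, vel, acc))
    new_vel = tuple(v + a * dt for v, a in zip(vel, acc))
    return new_pos, new_vel
```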
In one embodiment, during the period of predicting or recovering the track of the first target at the current time based on steps S1021 to S1025 and S103, if the first target reappears within a preset matching time limit and matches with the recovered or predicted track of the first target at the current time, the tracking state of the track in the historical detection results of the first target may be determined as the uncertain state tracking, and the tracking at the next time may be performed.
In particular, with continued reference to fig. 4, step S1026 may be performed to verify whether the matching result is a successful match based on the visual characteristics. That is, whether the reappeared first target matches the recovered or predicted trajectory of the first target at the current time is determined based on the visual features.
For example, when the occlusion ends and the first target reappears, it may first be determined whether the reappearance falls within the preset matching time limit. If the limit has not been exceeded, an appearance feature matrix can be constructed from the visual features to verify whether the reappeared target matches the predicted or recovered track of the first target at the current moment (if the two objects are the same, the matrix entries are 0; if they are different, the entries are non-zero); the appearance feature matrix thus characterizes the degree of visual matching. Alternatively, appearance matching may be replaced with IOU matching.
If the predicted or recovered detection frame matches the appearance of the reappeared first target's detection frame, the first target has indeed reappeared. The tracking state of the track in the first target's historical detection results can then be set to uncertain-state tracking, and tracking continues at the next moment.
If the preset matching time limit is exceeded and the first target still has not appeared, or if the appearance matching is unsuccessful, the tracking record of the first target is deleted.
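The bookkeeping in the two paragraphs above can be sketched as a small decision function. The zero appearance cost for "same object" follows the text, while the function name and the returned labels are illustrative assumptions:

```python
# Decision logic for a lost track when occlusion may have ended; labels
# and the function name are assumptions, the zero-cost test is from the
# text's description of the appearance feature matrix.

def lost_track_decision(frames_lost, max_lost_frames, appearance_cost):
    if frames_lost > max_lost_frames:
        return "delete"        # matching time limit exceeded: drop the record
    if appearance_cost == 0:
        return "reactivate"    # mark the track as uncertain-state tracking
    return "keep_waiting"      # keep predicting/recovering for now
```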
Further, the preset matching time limit may be determined according to the frame rate of the monocular camera: the higher the frame rate of the input images, the smaller the displacement of an object between two adjacent frames. The monocular camera may be disposed on an unmanned vehicle.
Further, if the verification result of step S1026 is positive, that is, a matching pair, step f shown in fig. 3 is executed.
In a specific implementation, if the tracking result of the first target at the current time obtained in step S101 is an unmatched detection because a new target has appeared, the new target is initialized as a new track, and matching proceeds at the next moment.
Further, after creating a new trace and after performing step f, the newly created trace may be added to a trace list, where the trace list includes a trace condition, such as a track and a trace result, of each of the multiple targets.
Further, the step S101 may be restarted after the tracking list is updated until all the detection results are traversed.
By adopting the embodiment, the multi-target tracking algorithm based on the monocular camera and the 3D laser radar is used on the unmanned vehicle, so that the pedestrians or the vehicles can be accurately and stably tracked, the problems of track interruption and identity information conversion caused by shielding in the current multi-target tracking can be solved, the motion track of the shielded target is predicted, a basis is provided for the unmanned vehicle to adopt driving strategies such as following, overtaking, sudden stop and the like, and the safety driving of the vehicle is guaranteed.
Specifically, the embodiment provides a multi-target tracking technology based on a monocular camera and a 3D laser radar, and aims to solve the problems of occlusion and detection frame loss in a multi-target motion scene so as to improve the performance of indexes (IDF1 and IDs) for identity information in a tracking algorithm. On the premise of two-dimensional vision and three-dimensional point cloud detection results, the algorithm of the embodiment can greatly improve the accuracy of target matching; clustering the targets by considering the relative motion relation among the targets, and realizing the track recovery of the shielded targets in the cluster by sharing the characteristics of speed, acceleration and the like; for individuals incapable of forming clusters, the motion state of the individuals in the shielded period is predicted through the uniform variable speed model, so that the track prediction is realized, and further, the occurrence of identity information matching failure is reduced.
Fig. 5 is a schematic structural diagram of a multi-target tracking apparatus according to an embodiment of the present invention. Those skilled in the art will understand that the multi-target tracking device 5 of the present embodiment may be used to implement the method solutions described in the embodiments of fig. 1 to 4.
Specifically, referring to fig. 5, the multi-target tracking apparatus 5 according to the present embodiment may include: an obtaining module 51, configured to obtain, during tracking of multiple targets, a tracking result of a first target in the multiple targets at a current time; a determining module 52, configured to determine a cluster to which the first target belongs according to historical detection results of the multiple targets if the tracking result is unmatched tracking, where the multiple targets belonging to the same cluster have the same object attribute and similar motion attributes; a restoring module 53, configured to restore a track of the first target at the current time according to a detection result of the targets in the cluster, except the first target, at the current time and a relative motion relationship between the targets in the cluster.
For more details of the operation principle and the operation mode of the multi-target tracking device 5, reference may be made to the related descriptions in fig. 1 to 4, which are not repeated herein.
Further, the embodiment of the present invention also discloses a storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method technical solution described in the embodiments shown in fig. 1 to fig. 4 is executed. Preferably, the storage medium may include a computer-readable storage medium such as a non-volatile (non-volatile) memory or a non-transitory (non-transient) memory. The storage medium may include ROM, RAM, magnetic or optical disks, etc.
Further, an embodiment of the present invention further discloses a terminal, which includes a memory and a processor, where the memory stores a computer program that can be run on the processor, and the processor executes the technical solution of the method in the embodiment shown in fig. 1 to 4 when running the computer program. In particular, the terminal may be an onboard controller of the automobile.
The "plurality" appearing in the embodiments of the present application means two or more.
The descriptions of the first, second, etc. appearing in the embodiments of the present application are only for illustrating and differentiating the objects, and do not represent the order or the particular limitation of the number of the devices in the embodiments of the present application, and do not constitute any limitation to the embodiments of the present application.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (16)

1. A multi-target tracking method is characterized by comprising the following steps:
during the tracking of a plurality of targets, acquiring a tracking result of a first target in the plurality of targets at the current moment;
if the tracking result is unmatched tracking, determining a cluster to which the first target belongs according to historical detection results of the targets, wherein the targets belonging to the same cluster have the same object attribute and similar motion attribute;
and recovering the track of the first target at the current moment according to the detection result of the targets except the first target in the cluster at the current moment and the relative motion relationship among the targets in the cluster.
2. The multi-target tracking method according to claim 1, wherein the obtaining of the tracking result of the current time on the first target of the plurality of targets comprises:
predicting the prediction result of the first target at the current moment according to the historical detection result of the first target;
and performing target matching on the prediction result and an object detection result obtained by detection at the current moment to obtain a tracking result of the first target.
3. The multi-target tracking method according to claim 2, wherein for any one of the historical detection results, the predicted results and the object detection results, the detection result is characterized based on detection frames, the detection frames are in one-to-one correspondence with dictionaries, and the dictionaries comprise at least one of the following contents: tracking identity information of the object; the position, the size, the movement speed and the size change speed of the detection frame; a tracking state of the trajectory; number of frames the track is maintained.
4. The multi-target tracking method according to claim 2, wherein the performing target matching on the prediction result and an object detection result detected at the current time to obtain the tracking result of the first target comprises:
and when the tracking state of the track in the historical detection result of the first target is determined tracking, performing cascade matching on the prediction result and an object detection result obtained by detection at the current moment to obtain a tracking result of the first target.
5. The multi-target tracking method according to claim 4, wherein the cascade matching is target matching by adopting data association algorithms of multiple cost matrices, and the data association algorithms of the multiple cost matrices at least comprise a first data association algorithm and a second data association algorithm; the step of performing cascade matching on the prediction result and an object detection result obtained by current time detection to obtain a tracking result comprises:
performing target matching on the prediction result and the object detection result by adopting the first data association algorithm to obtain a first matching result;
and if the first matching result is not a matching pair, performing target matching on the prediction result and the object detection result by adopting the second data association algorithm to obtain a second matching result, and determining the second matching result as the tracking result.
6. The multi-target tracking method according to claim 2, wherein the performing target matching on the prediction result and an object detection result detected at the current time to obtain the tracking result of the first target comprises:
and when the tracking state of the track in the historical detection result of the first target is the tracking in an uncertain state, performing target matching on the prediction result and an object detection result obtained by detection at the current moment by adopting a single data association algorithm to obtain a tracking result of the first target.
7. The multi-target tracking method according to claim 5 or 6, wherein the data association algorithm of the multiple cost matrices comprises: IoU match and SIOA match.
8. The multi-target tracking method according to claim 1, wherein the determining the cluster to which the first target belongs according to the historical detection results of the plurality of targets comprises:
determining the object attribute and the motion attribute of each target according to the historical detection result of each target;
searching for a target which is close to the motion attribute of the first target, has the same object attribute and has a distance with the first target smaller than a preset threshold value from the plurality of targets;
and if the searching is successful, determining that the searched target and the first target form the cluster.
9. The multi-target tracking method according to claim 1 or 8, wherein the object attributes include an identity category of the target; the motion attributes include: a speed of movement of the target; a direction of motion of the target; a trajectory of the object.
10. The multi-target tracking method according to claim 1, further comprising:
and if the first target cannot form a cluster with other targets in the plurality of targets, predicting the track of the first target at the current moment according to a uniform variable speed motion model.
11. The multi-target tracking method according to claim 1 or 10, further comprising:
and if the first target reappears within the preset matching time limit and is matched with the track of the first target at the current moment obtained through recovery or prediction, determining the tracking state of the track in the historical detection result of the first target as the tracking of an uncertain state, and performing tracking at the next moment.
12. The multi-target tracking method according to claim 1, further comprising:
and if the tracking result of the first target is a matching pair, updating the prediction result of the first target at the current moment through improved Kalman filtering.
13. The multi-target tracking method according to claim 1, further comprising, before determining the cluster to which the first target belongs based on the historical detection results of the plurality of targets:
and carrying out smoothing processing on the historical detection result through improved Kalman filtering.
14. A multi-target tracking apparatus, comprising:
the acquisition module is used for acquiring a tracking result of a first target in a plurality of targets at the current moment during tracking the plurality of targets;
the determining module is used for determining a cluster to which the first target belongs according to historical detection results of the plurality of targets if the tracking result is unmatched tracking, wherein the plurality of targets belonging to the same cluster have the same object attribute and motion attribute;
and the recovery module is used for recovering the track of the first target at the current moment according to the detection result of the targets in the cluster except the first target at the current moment and the relative motion relationship among the targets in the cluster.
15. A storage medium having a computer program stored thereon, the computer program, when being executed by a processor, performing the steps of the method according to any one of claims 1 to 13.
16. A terminal comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, wherein the processor, when executing the computer program, performs the steps of the method of any of claims 1 to 13.
CN202110127063.3A 2021-01-29 2021-01-29 Multi-target tracking method and device, storage medium and terminal Pending CN114913198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110127063.3A CN114913198A (en) 2021-01-29 2021-01-29 Multi-target tracking method and device, storage medium and terminal


Publications (1)

Publication Number Publication Date
CN114913198A (en)

Family

ID=82762287



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination