CN115170602A - Online multi-target tracking method and device and storage medium - Google Patents

Online multi-target tracking method and device and storage medium

Info

Publication number
CN115170602A
Authority
CN
China
Prior art keywords
target
tracking
detected
stage
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210768721.1A
Other languages
Chinese (zh)
Inventor
徐彪
桂瀚洋
李勇
董健
刘飞龙
王峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hongjing Zhijia Information Technology Co ltd
Tongji University
Original Assignee
Shanghai Hongjing Zhijia Information Technology Co ltd
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hongjing Zhijia Information Technology Co ltd, Tongji University filed Critical Shanghai Hongjing Zhijia Information Technology Co ltd
Priority to CN202210768721.1A priority Critical patent/CN115170602A/en
Publication of CN115170602A publication Critical patent/CN115170602A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details

Abstract

The invention relates to the technical field of artificial intelligence and automatic driving, and particularly discloses an online multi-target tracking method comprising the following steps: collecting a current frame image containing a target to be detected; acquiring, from the current frame image, the bounding box of the target to be detected and its confidence; matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected; and outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking objects in the preset tracking list. The invention also discloses an online multi-target tracking device. The online multi-target tracking method provided by the invention combines the advantages of image-level association matching and association matching in three-dimensional coordinates, implements three-stage matching, and effectively reduces mismatching and ID switching.

Description

Online multi-target tracking method and device and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence and automatic driving, in particular to an online multi-target tracking method, an online multi-target tracking device and a storage medium.
Background
Multi-target tracking technology is widely applied in fields such as automatic driving, assisted driving and intelligent transportation. In the related art, methods that match and track based on detection-box information or key-point information in the image have poor tracking capability for targets that are partially occluded or briefly fully occluded.
In the related art, methods that track detected targets at the image level can perform association matching at the image level using the raw output of the detection model, but when a target is partially occluded or briefly fully occluded, the modeled object motion no longer matches the actual motion and the motion model diverges, so the target cannot be associated when it reappears.
In the related art, another approach directly tracks detected targets after restoring them to three-dimensional coordinates using camera calibration information. Because the raw information of the image detection boxes cannot be fully exploited in the matching stage and image-based ranging has high uncertainty, mismatching occurs easily and the tracking ID is switched.
Disclosure of Invention
To remedy the deficiencies of the prior art, the invention provides an online multi-target tracking method that combines the advantages of image-level association matching and association matching in three-dimensional coordinates, implements three-stage matching, and effectively reduces mismatching and ID switching.
As a first aspect of the present invention, there is provided an online multi-target tracking method, including the steps of:
collecting a current frame image containing a target to be detected;
acquiring, from the current frame image, the bounding box of the target to be detected and the confidence of the bounding box;
matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
Further, matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected further includes:
dividing the targets to be detected into high-confidence targets and low-confidence targets according to the confidence of their bounding boxes;
performing first-stage matching between the high-confidence targets to be detected and the confirmed tracking objects in the preset tracking list, and outputting the targets to be detected unmatched in the first stage, the tracking objects unmatched in the first stage, and the tracking detection pairs matched in the first stage;
forming a second-stage target list from the targets to be detected unmatched in the first stage and the low-confidence targets to be detected, and forming a second-stage tracking list from the tracking objects unmatched in the first stage and the unconfirmed tracking objects in the preset tracking list;
performing second-stage matching between the targets to be detected in the second-stage target list and the tracking objects in the second-stage tracking list, and outputting the targets to be detected unmatched in the second stage, the tracking objects unmatched in the second stage, and the tracking detection pairs matched in the second stage;
and performing third-stage matching, in a three-dimensional coordinate system, between the targets to be detected and the tracking objects unmatched in the second stage, and outputting the targets to be detected and tracking objects that remain unmatched in the third stage together with the tracking detection pairs matched in the third stage. A sketch of this cascade is given below.
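It should be noted that, for illustration only, the following Python sketch outlines the three-stage cascade described above; the helper routines iou_match and mahalanobis_match, and the detection and track fields, are illustrative stand-ins for the matching steps defined in this disclosure, not a definitive implementation.

```python
def three_stage_match(detections, tracks, conf_thresh, iou_match, mahalanobis_match):
    """Three-stage cascade association (sketch).

    iou_match / mahalanobis_match are stand-ins for the image-plane and
    3D matching routines described in the text; each returns
    (matched_pairs, unmatched_detections, unmatched_tracks).
    """
    high = [d for d in detections if d.confidence >= conf_thresh]
    low = [d for d in detections if d.confidence < conf_thresh]
    confirmed = [t for t in tracks if t.confirmed]
    unconfirmed = [t for t in tracks if not t.confirmed]

    # Stage 1: high-confidence detections vs. confirmed tracks (image-plane IOU).
    pairs1, dets_u, trks_u = iou_match(high, confirmed)

    # Stage 2: stage-1 leftovers plus low-confidence detections vs.
    #          stage-1 leftover tracks plus unconfirmed tracks.
    pairs2, dets_u, trks_u = iou_match(dets_u + low, trks_u + unconfirmed)

    # Stage 3: remaining detections vs. remaining tracks, associated in
    #          3D coordinates against the 3D Kalman prediction.
    pairs3, dets_u, trks_u = mahalanobis_match(dets_u, trks_u)

    return pairs1 + pairs2 + pairs3, dets_u, trks_u
```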
Further, the outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further includes:
comparing the confidence of the target to be detected that remains unmatched in the third stage with a set threshold;
if it is higher than the set threshold, creating a new tracking object from the target to be detected;
and if it is lower than the set threshold, discarding the target to be detected.
Further, the matching the target to be detected with the tracking object in the preset tracking list further includes:
calculating the similarity between the target to be detected and the tracked object in the preset tracking list to obtain a matching result between the target to be detected and the tracked object in the preset tracking list;
the similarity of the first stage and the second stage is calculated as the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object; the matching results of the first and second stages are output according to the calculated IOU;
and finally, outputting a matching result of the third stage according to the Mahalanobis distance between the target to be detected and the tracking object in the camera coordinate system or the vehicle coordinate system.
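For illustration only, a minimal sketch of the third-stage distance computation follows; weighting the distance by the innovation covariance S from the track's 3D Kalman filter is an assumption, since the text only specifies a Mahalanobis distance in the camera or vehicle coordinate system.

```python
import numpy as np

def mahalanobis_distance(z, z_pred, S):
    """Squared Mahalanobis distance between an observed 3D pose z and a
    track's predicted pose z_pred, weighted by the covariance S
    (assumed here to be the innovation covariance H P^- H^T + R)."""
    d = np.asarray(z, dtype=float) - np.asarray(z_pred, dtype=float)
    return float(d @ np.linalg.solve(S, d))
```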
Further, calculating the similarity of the first stage and the second stage as the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object further includes:
predicting bounding box information of the tracking object in the first-stage and second-stage tracking lists through a 2D Kalman filter, and outputting bounding box prediction information of the tracking object;
and calculating the intersection-over-union (IOU) between the bounding box information of the target to be detected and the bounding box prediction information of the tracking object.
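For illustration only, a minimal IOU computation for two axis-aligned bounding boxes; the (x1, y1, x2, y2) box format is an assumption.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```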
Further, for the similarity calculation of the third-stage matching, after the 3D pose information of the targets to be detected and of the tracking objects that are unmatched in the second stage has been acquired, the method further includes:
establishing a 3D Kalman filter according to the 3D pose information of the target to be detected;
predicting the 3D pose information of the tracking object in the third stage through the 3D Kalman filter, and outputting the 3D pose prediction information of the tracking object;
and calculating the similarity between the 3D pose information of the target to be detected and the 3D pose prediction information of the tracking object.
Further, the Kalman filter equations are as follows:
The time update equations:

$$\hat{x}_k^- = A\hat{x}_{k-1} + Bu_{k-1}$$

$$P_k^- = AP_{k-1}A^T + Q$$

The state update equations:

$$K_k = P_k^- H^T (HP_k^- H^T + R)^{-1}$$

$$\hat{x}_k = \hat{x}_k^- + K_k(z_k - H\hat{x}_k^-)$$

$$P_k = (I - K_k H)P_k^-$$

wherein $\hat{x}_{k-1}$ and $\hat{x}_k$ are the posterior state estimates, i.e. the updated results, at times k-1 and k respectively; $\hat{x}_k^-$ is the prior state estimate at time k, an intermediate result of the filter, i.e. the prediction of time k from the optimal estimate at time k-1, given by the prediction equation;
wherein $P_{k-1}$ and $P_k$ are the covariances of the posterior estimates $\hat{x}_{k-1}$ and $\hat{x}_k$ at times k-1 and k respectively, represent the uncertainty of the state, and are among the outputs of the filter;
wherein $P_k^-$ is the prior error covariance at time k, an intermediate result of the filter; $K_k$ is the gain matrix, i.e. the Kalman gain, an intermediate result of the filter; Q is the process excitation noise covariance, representing the error between the state transition matrix and the actual process; R is the measurement noise covariance;
$z_k - H\hat{x}_k^-$ is the residual between the actual and predicted observations, which together with the Kalman gain corrects the prior to obtain the posterior; A is the state transition matrix and B the control matrix, both related to the constructed motion model; H is the measurement matrix; $z_k$ is the observed quantity at time k.
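For illustration only, the following Python sketch implements the time update and state update equations above in their standard linear form; the class name and constructor signature are illustrative.

```python
import numpy as np

class LinearKalmanFilter:
    """Minimal linear Kalman filter following the notation above:
    A state transition, B control, H measurement, Q process noise
    covariance, R measurement noise covariance."""

    def __init__(self, A, H, Q, R, x0, P0, B=None):
        self.A, self.B, self.H, self.Q, self.R = A, B, H, Q, R
        self.x, self.P = x0, P0

    def predict(self, u=None):
        # Time update: x_k^- = A x_{k-1} + B u_{k-1};  P_k^- = A P_{k-1} A^T + Q
        self.x = self.A @ self.x
        if self.B is not None and u is not None:
            self.x = self.x + self.B @ u
        self.P = self.A @ self.P @ self.A.T + self.Q
        return self.x

    def update(self, z):
        # Kalman gain: K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        # Posterior: x_k = x_k^- + K_k (z_k - H x_k^-);  P_k = (I - K_k H) P_k^-
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return self.x
```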
Further, the outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further includes:
judging whether the tracked object in the preset tracking list is successfully matched with the target to be detected or not;
if so, updating the tracking objects in the preset tracking list by using the paired targets to be detected, accumulating the tracking confirmation time of the paired tracking objects, and outputting the tracking states of the paired tracking objects according to the accumulated tracking confirmation time of the paired tracking objects; wherein the tracking state of the tracked object comprises a confirmed state and an unconfirmed state;
if not, further judging the tracking loss time of the unpaired tracking object in the preset tracking list, if the tracking loss time is greater than a set threshold, deleting the unpaired tracking object from the preset tracking list, and if the tracking loss time is less than or equal to the set threshold, outputting the predicted value of the unpaired tracking object to a back-end module.
As a second aspect of the present invention, there is provided an online multi-target tracking apparatus, including:
the collection module is used for collecting a current frame image containing a target to be detected;
the acquisition module is used for acquiring, from the current frame image, the bounding box of the target to be detected and the confidence of the bounding box;
the matching module is used for matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and the output module is used for outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
As a third aspect of the present invention, a computer-readable storage medium is provided, having stored thereon computer instructions for causing the computer to perform the steps of the method as described hereinbefore.
The online multi-target tracking method provided by the invention has the following advantages: it better handles cases in which the target is occluded during real-time tracking, and improves the accuracy and robustness of target tracking.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
Fig. 1 is a schematic diagram of a framework of an online multi-target tracking method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of the first stage matching according to the embodiment of the present invention.
Fig. 3 is a schematic diagram of second-stage matching according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of third-stage matching according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of tracking object management according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of updating a tracking object according to an embodiment of the present invention.
Fig. 7 is a flowchart illustrating an online multi-target tracking method according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description will be given to the embodiments, structures, features and effects of the on-line multi-target tracking method, device and storage medium according to the present invention, with reference to the accompanying drawings and preferred embodiments. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without any inventive step, are within the scope of the present invention.
In this embodiment, an online multi-target tracking method is provided, as shown in fig. 7, the online multi-target tracking method includes:
collecting a current frame image containing a target to be detected;
acquiring the bounding box of the target to be detected and its confidence from the current frame image;
matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
Preferably, as shown in fig. 2 to 4, matching the target to be detected with the tracking objects in the preset tracking list according to the confidence of the bounding box of the target to be detected further includes:
dividing the targets to be detected into high-confidence targets 203 and low-confidence targets 202 according to the confidence of the bounding boxes 201 of the targets to be detected;
performing first-stage matching 204 on the target to be detected with high confidence and a confirmed tracking object 209 in the preset tracking list 208, and outputting a target to be detected 205 which is not matched in the first stage, a tracking object 206 which is not matched in the first stage and a tracking detection pair 207 which is matched in the first stage;
forming a second-stage target list 303 by using the target to be detected 302 which is not matched in the first stage and the target to be detected 301 with low confidence coefficient, and forming a second-stage tracking list by using the tracking object 305 which is not matched in the first stage and the unconfirmed tracking object 306 in the preset tracking list 208;
performing second-stage matching 304 on the target to be detected in the second-stage target to be detected list and the tracking object in the second-stage tracking list, and outputting a target to be detected 308 which is not matched in the second stage, a tracking object 309 which is not matched in the second stage and a tracking detection pair 310 which is matched in the second stage;
and performing third-stage matching 403, in a three-dimensional coordinate system, between the target to be detected 401 and the tracking object 404 which are not matched in the second stage, and outputting the target to be detected 406 and the tracking object 409 which are not matched in the third stage, together with the matched tracking detection pair 410.
In the embodiment of the present invention, as shown in fig. 1 to 4, a detection model is applied to infer the current frame image to obtain the bounding box 201 of at least one target to be detected and its confidence; the 3D pose information of the target to be detected in the camera coordinate system or the vehicle coordinate system is calculated by combining the camera intrinsic and extrinsic calibration information stored on the storage device with the detection bounding-box information; in the association matching stage between tracking objects and targets to be detected, three stages of matching are carried out in sequence, which effectively reduces mismatching and ID switching. In the first stage, the high-confidence targets to be detected are matched with the confirmed tracking objects, which effectively reduces mismatching. In the second stage, the low-confidence detection targets and the targets unmatched in the first stage are matched with the tracking objects unmatched in the first stage and the unconfirmed tracking objects, which reduces the ID switching that would result from discarding low-confidence detections caused by occlusion. In the third stage, the three-dimensional information, in the camera or vehicle coordinate system, of the detection targets unmatched in the second stage is matched with the 3D-Kalman-filter-predicted pose information of the confirmed tracking objects unmatched in the second stage, which reduces the ID switching caused by temporary occlusion of the target and by divergence of the 2D Kalman filter prediction.
In the embodiment of the invention, the pose information of the target to be detected in the camera coordinate system or the vehicle coordinate system is calculated from the bounding box of the target to be detected and the camera intrinsic and extrinsic parameters, which are obtained from the camera calibration parameter matrices stored in memory. The relationship between an image point and a point in the vehicle coordinate system satisfies the pinhole projection equation:

$$s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}\left(R\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T\right)$$

wherein u, v are image pixel coordinates, and $f_x$, $f_y$, $c_x$, $c_y$ are the camera intrinsic calibration parameters, which can be read from the storage device. R is the camera extrinsic rotation matrix, composed from the camera pitch angle θ, the camera yaw angle φ and the camera roll angle ψ. T is the camera extrinsic translation vector, in which the camera mounting height h appears.
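For illustration only, the following Python sketch recovers the position of an image point on the ground by intersecting the camera ray with the ground plane; the coordinate convention (ground plane Z = 0 in the vehicle frame, world-to-camera extrinsics R and T with the mounting height h encoded in T) is an assumption, not prescribed by the embodiment.

```python
import numpy as np

def backproject_to_ground(u, v, K, R, T):
    """Intersect the camera ray through pixel (u, v) with the ground
    plane Z = 0 of the vehicle frame.

    K: 3x3 intrinsic matrix built from f_x, f_y, c_x, c_y.
    R, T: world-to-camera extrinsics, i.e. p_cam = R @ p_world + T.
    Assumes a flat ground plane and a camera mounted above it."""
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray in the camera frame
    ray_world = R.T @ ray_cam                           # ray direction in the vehicle frame
    cam_center = -R.T @ T                               # camera center in the vehicle frame
    s = -cam_center[2] / ray_world[2]                   # scale at which the ray hits Z = 0
    return cam_center + s * ray_world
```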
In the embodiment of the invention, the pose of the target to be detected in the camera or vehicle coordinate system can be calculated by combining the preset width and height of the target to be detected or combining lane lines and the like. The invention is not limited thereto.
In the embodiment of the present invention, as shown in fig. 2, the targets to be detected in the current frame are divided into two groups according to their confidence, using a confidence threshold stored on the storage device: a high-confidence group 203 and a low-confidence group 202. The confidence threshold is a parameter that can be adjusted according to the tracker results.
In the embodiment of the present invention, as shown in fig. 2, in the first stage of association matching, the high-confidence targets to be detected are associated with the confirmed objects in the tracking object list. A high confidence indicates that the target to be detected is very likely to exist, so it should be matched against the tracking object list with priority. A confirmed tracking object is one that has been stably tracked for a period of time and confirmed by the previous frames, and is therefore more likely to appear in the current frame, so it is preferentially matched with the high-confidence targets to be detected.
Preferably, the outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further includes:
comparing 412 the confidence of the target to be detected 406 that remains unmatched in the third stage with a set threshold;
if it is higher than the set threshold, creating 411 a new tracking object from the target to be detected;
if it is below the set threshold, discarding 413 the target to be detected.
Preferably, the matching the target to be detected with the tracking object in the preset tracking list further includes:
calculating the similarity between the target to be detected and the tracking object in the preset tracking list to obtain a matching result between the target to be detected and the tracking object in the preset tracking list;
as shown in fig. 2-3, the similarity of the first stage and the second stage is calculated as the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object; the matching results of the first and second stages are output according to the calculated IOU;
it should be noted that the similarity may also be calculated by CIOU, DIOU, GIOU or other methods, and the invention is not limited thereto.
It should be noted that the matching similarity score matrix is filtered by a threshold, and matching scores below the threshold are set to 0 in order to prevent mismatches. The threshold is stored on a storage device and is a parameter adjustable through a configuration file. The similarity matrix is then optimized to find the best matching result.
And finally, outputting a matching result of the third stage according to the Mahalanobis distance between the target to be detected and the tracking object in the camera coordinate system or the vehicle coordinate system.
Preferably, the 3D pose information includes position and orientation information. The similarity of the third-stage matching can also be obtained by calculating the orientation similarity of the target to be detected and the tracked object.
In the embodiment of the invention, the optimization of the similarity score matrix matched in the first stage, the second stage or the third stage can adopt Hungarian matching or a KM algorithm.
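For illustration only, a sketch of the thresholded optimal assignment using the Hungarian algorithm as provided by SciPy; the row/column convention and the threshold handling are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_by_score(score, score_thresh):
    """Optimal assignment over a similarity score matrix
    (rows: targets to be detected, columns: tracking objects).
    Scores below the threshold are zeroed first, and any assigned pair
    whose score is still below the threshold is treated as unmatched."""
    score = np.where(score < score_thresh, 0.0, score)
    rows, cols = linear_sum_assignment(-score)  # negate to maximize total score
    pairs = [(r, c) for r, c in zip(rows, cols) if score[r, c] >= score_thresh]
    matched_r = {r for r, _ in pairs}
    matched_c = {c for _, c in pairs}
    unmatched_dets = [r for r in range(score.shape[0]) if r not in matched_r]
    unmatched_trks = [c for c in range(score.shape[1]) if c not in matched_c]
    return pairs, unmatched_dets, unmatched_trks
```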
Preferably, the similarity calculations of the first stage and the second stage, both computing the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object, further include:
predicting 210 bounding box information of the tracking object in the first-stage tracking list and predicting 307 bounding box information of the tracking object in the second-stage tracking list by using a 2D Kalman filter, and outputting bounding box prediction information of the tracking object;
as shown in fig. 2, the bounding box information of the tracking object matched in the first stage is the bounding box information filtered by the 2D kalman filter. The 2D Kalman filter predicts the position of a tracking object possibly appearing in the current frame in a prediction stage, and the prediction time difference is the time difference between the current frame and the previous frame.
And calculating the intersection-over-union (IOU) between the bounding box information of the target to be detected and the bounding box prediction information of the tracking object.
As shown in fig. 2, the input of the 2D Kalman filter in the embodiment of the present invention includes the image coordinates of the center point of the bounding box of the target to be detected, the aspect ratio, and the height. The aspect ratio is calculated from the width and height of the bounding box of the target to be detected. The output of the Kalman filter includes the real-time center position, aspect ratio and height of the target to be detected on the image, together with the corresponding velocities.
The Kalman filter involves initialization, Kalman prediction and Kalman update. Kalman initialization is used to create a corresponding Kalman filter when a tracking object is created; at initialization, the initial velocity may be set to 0. Kalman prediction is used in the matching stage to associate the predicted values of tracking objects with the targets to be detected. Kalman update is used to update the information of successfully matched tracking objects.
As shown in fig. 4, in the third stage of matching, under gradual occlusion or brief complete occlusion, the 2D Kalman filter tracking at the image level cannot properly model and predict the motion of the target to be detected on the image plane, and its result easily diverges, so the target cannot be matched when it reappears in the image.
Preferably, for the similarity calculation of the third-stage matching, after the 3D pose information of the targets to be detected and of the tracking objects that are unmatched in the second stage has been acquired, the method further includes:
establishing a 3D Kalman filter from the 3D pose information of the target to be detected, which can properly model and predict the motion of the target to be detected in the camera coordinate system or the vehicle coordinate system. When the target to be detected reappears after occlusion, association matching can be performed through the 3D pose information; and because the other targets that might interfere with this matching have already been matched in the first two stages, mismatching is unlikely when matching against the 3D pose prediction at this point.
Predicting 407 the 3D pose information of the tracking object at the third stage through the 3D Kalman filter, and outputting 3D pose prediction information of the tracking object;
it should be noted that the tracking object 404 on the second stage unmatched includes a confirmed tracking object 405 and an unconfirmed tracking object 406, and the bounding box information of the confirmed tracking object 405 is input into the 3D kalman filter for prediction 407; adding 1 to the tracking lost time of the unconfirmed tracked object 406 (step 414);
and calculating the similarity between the 3D pose information of the target to be detected and the 3D pose prediction information of the tracking object.
It should be noted that the third-stage matching score is obtained by taking the detection result of the target to be detected, combining it with the camera intrinsic and extrinsic parameters, and calculating the similarity between the pose of the target in the camera coordinate system or the vehicle coordinate system and the state value predicted by the 3D Kalman filter for the tracking object. The third-stage matching is performed in the camera or vehicle coordinate system based on two observations: a confirmed tracking object persists in the vehicle or camera coordinate system and does not vanish into thin air; and modeling the motion of the target in the camera or vehicle coordinate system conforms better to physical reality.
Preferably, the Kalman filter equations are as follows:
The time update equations:

$$\hat{x}_k^- = A\hat{x}_{k-1} + Bu_{k-1}$$

$$P_k^- = AP_{k-1}A^T + Q$$

The state update equations:

$$K_k = P_k^- H^T (HP_k^- H^T + R)^{-1}$$

$$\hat{x}_k = \hat{x}_k^- + K_k(z_k - H\hat{x}_k^-)$$

$$P_k = (I - K_k H)P_k^-$$

wherein $\hat{x}_{k-1}$ and $\hat{x}_k$ are the posterior state estimates, i.e. the updated results, at times k-1 and k respectively; $\hat{x}_k^-$ is the prior state estimate at time k, an intermediate result of the filter, i.e. the prediction of time k from the optimal estimate at time k-1, given by the prediction equation;
wherein $P_{k-1}$ and $P_k$ are the covariances of the posterior estimates $\hat{x}_{k-1}$ and $\hat{x}_k$ at times k-1 and k respectively, represent the uncertainty of the state, and are among the outputs of the filter;
wherein $P_k^-$ is the prior error covariance at time k, an intermediate result of the filter; $K_k$ is the gain matrix, i.e. the Kalman gain, an intermediate result of the filter; Q is the process excitation noise covariance, representing the error between the state transition matrix and the actual process; R is the measurement noise covariance;
$z_k - H\hat{x}_k^-$ is the residual between the actual and predicted observations, which together with the Kalman gain corrects the prior to obtain the posterior; A is the state transition matrix and B the control matrix, both related to the constructed motion model; H is the measurement matrix; $z_k$ is the observed quantity at time k.
It should be noted that, regarding the 2D Kalman filter part, the embodiment of the present invention discloses the state quantities of a 2D Kalman filter with a constant velocity model. The 2D Kalman filter state space of this embodiment is only an example and does not limit the scope of the patent. It will be appreciated that those skilled in the art can easily extend it to a constant acceleration model, a constant steering velocity model or a constant steering acceleration model, or degrade it to a velocity interpolation model.
The measurement vector is

$$z = \begin{bmatrix} x_{center} & y_{center} & a & h \end{bmatrix}^T$$

wherein $x_{center}$, $y_{center}$ are the center coordinates of the bounding box of the target to be detected on the image; a is the ratio of the width to the height of the bounding box of the target to be detected on the image; h is the height of the bounding box of the target to be detected on the image.
As an example of a constant velocity model, the state equation can be constructed with the state vector

$$x = \begin{bmatrix} x_{center} & y_{center} & a & h & \dot{x}_{center} & \dot{y}_{center} & \dot{a} & \dot{h} \end{bmatrix}^T$$

and the state transition matrix

$$A = \begin{bmatrix} I_4 & \Delta t\,I_4 \\ 0 & I_4 \end{bmatrix}$$

where Δt is the time difference between frames, and the measurement matrix $H = \begin{bmatrix} I_4 & 0 \end{bmatrix}$ selects the measured components from the state.
It should be noted that, regarding the 3D Kalman filter part, the embodiment of the present invention discloses the state quantities of a 3D Kalman filter with a constant velocity model. The 3D Kalman filter state space of this embodiment is only an example and does not limit the scope of the patent. It will be appreciated that those skilled in the art can easily extend it to a constant acceleration model, a constant steering velocity model or a constant steering acceleration model, or degrade it to a velocity interpolation model.
As an example of a constant velocity model, the state equation can be constructed with the 3D pose of the target and its velocities as the state, using a state transition matrix of the same block form

$$A = \begin{bmatrix} I & \Delta t\,I \\ 0 & I \end{bmatrix}$$

where Δt is the time difference.
Specifically, in the tracking manager, the prediction part comprises the prediction steps of the 3D Kalman filter and the 2D Kalman filter of the tracking object. The time difference Δt is the time difference between the previous frame and the current frame. The prediction part increases the life cycle of the tracking object by 1, and increases the time since the last update of the tracking object (time_since_update) by 1.
Specifically, in the tracking manager, the update part comprises the update steps of the 2D Kalman filter and the 3D Kalman filter of the tracking object. The update part increases the life cycle of the tracking object by 1, increases the updated count of the tracking object by 1, and resets the time since the last update of the tracking object (time_since_update) to 0.
Specifically, in the tracking manager, the tracking state update of a tracking object is determined from its updated count, its time since the last update, and its life cycle. When a tracking object is created, its tracking state is the unconfirmed state. If the difference between the life cycle and the updated count of a tracking object in the unconfirmed state is greater than a set threshold, the tracking object is considered lost; this threshold is stored in memory and can be set through a configuration file. When the updated count of a tracking object in the unconfirmed state is greater than a set threshold, the tracking object is updated to the confirmed state; this threshold is stored in memory and can be set through a configuration file. For a tracking object in the confirmed state, if the time since the last update (time_since_update) is greater than a set threshold, its tracking state is updated to the lost state; this threshold is stored in memory and can be set through a configuration file. For a tracking object in the lost state, if no detection is matched, it is deleted from the tracking object list; if a detection is matched, the tracking object information is updated. A sketch of this state machine follows.
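For illustration only, the following sketch expresses this state machine; the field names and the use of a single threshold per transition are illustrative.

```python
from dataclasses import dataclass

@dataclass
class TrackState:
    state: str = "unconfirmed"   # "unconfirmed" | "confirmed" | "lost"
    age: int = 1                 # life cycle: frames since creation
    hits: int = 0                # updated count: successful matches
    time_since_update: int = 0   # frames since the last update

def update_track_state(t: TrackState, confirm_thresh: int, lost_thresh: int) -> str:
    if t.state == "unconfirmed":
        if t.age - t.hits > lost_thresh:
            t.state = "lost"             # never stabilized into a track
        elif t.hits > confirm_thresh:
            t.state = "confirmed"        # stably tracked over several frames
    elif t.state == "confirmed" and t.time_since_update > lost_thresh:
        t.state = "lost"                 # confirmed track went unmatched too long
    return t.state
```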
Specifically, in the tracking manager, the tracking object creation part compares the confidence of a target to be detected that remains unmatched after the third-stage matching with a set threshold stored in memory. If the confidence is higher than the threshold, a tracking object is created from the target to be detected: the 2D Kalman filter is initialized with the bounding box information of the target to be detected, and the 3D Kalman filter is initialized with the pose of the target to be detected in the camera coordinate system or the vehicle coordinate system, calculated from its bounding box and the camera intrinsic and extrinsic parameters. If the confidence is lower than the threshold, the target to be detected is regarded as a false detection and discarded. Creating a tracking object also includes initializing its tracking state: the life cycle is set to 1; the time since the last update (time_since_update) is set to 0; the updated count is set to 0.
Preferably, the outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further includes:
judging whether the tracked object in the preset tracking list is successfully matched with the target to be detected or not;
if so, updating the tracking objects in the preset tracking list by using the paired targets to be detected, accumulating the tracking confirmation time of the paired tracking objects, and outputting the tracking states of the paired tracking objects according to the accumulated tracking confirmation time of the paired tracking objects; wherein the tracking state of the tracked object comprises a confirmed state and an unconfirmed state;
if not, further judging the tracking loss time of the unpaired tracking object in the preset tracking list, if the tracking loss time is greater than a set threshold, deleting the unpaired tracking object from the preset tracking list, and if the tracking loss time is less than or equal to the set threshold, outputting the predicted value of the unpaired tracking object to a back-end module.
As shown in fig. 5, after the tracking objects 501 in the tracking list are predicted 502, three-stage matching 503 is performed. For each matched tracking object, its information is updated 504 with the information of the paired target to be detected, its tracking confirmation time is increased by 1 (505), and whether it is confirmed is judged from the tracking confirmation time: if the tracking confirmation time of the tracking object is greater than the set threshold 506, the tracking object enters the confirmed state 507. For each unmatched tracking object in the tracking list, the tracking loss time is increased by 1 (509) and then judged 510: if the tracking loss time is greater than the set threshold, the tracking object is deleted 511 from the tracking list; if it is less than or equal to the set threshold, the predicted value of the tracking object is output 512 to the back-end module. A sketch of this bookkeeping follows.
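For illustration only, the following sketch performs the post-matching bookkeeping of fig. 5; the track methods update() and predicted() and the field names stand in for the 2D/3D Kalman updates and predictions described above.

```python
def manage_tracks(tracks, matches, confirm_thresh, max_lost):
    """Post-association bookkeeping (sketch of fig. 5).

    matches: list of (track, detection) pairs from the three-stage matching.
    Returns the surviving track list and the predictions emitted for
    briefly lost tracks."""
    for track, det in matches:
        track.update(det)                    # 2D/3D Kalman update (504)
        track.hits += 1                      # tracking confirmation time (505)
        if track.hits > confirm_thresh:      # threshold check (506)
            track.state = "confirmed"        # confirmed state (507)

    matched = {id(t) for t, _ in matches}
    survivors, lost_outputs = [], []
    for track in tracks:
        if id(track) in matched:
            survivors.append(track)
            continue
        track.time_since_update += 1         # tracking loss time (509)
        if track.time_since_update <= max_lost:      # judgment (510)
            survivors.append(track)
            lost_outputs.append(track.predicted())   # output prediction (512)
        # otherwise the track is dropped from the list (511)
    return survivors, lost_outputs
```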
As shown in fig. 6, for a tracking detection pair 604 produced by the aforementioned three-stage matching, the 2D information of the target to be detected 605 updates 606 the 2D Kalman filter part of the tracking object 607, and the 3D information 608, generated from the target to be detected through the camera intrinsic and extrinsic calculation, updates 609 the 3D Kalman filter part of the tracking object.
As another embodiment of the present invention, there is provided an online multi-target tracking apparatus including:
the collection module is used for collecting a current frame image containing a target to be detected;
the acquisition module is used for acquiring, from the current frame image, the bounding box of the target to be detected and the confidence of the bounding box;
the matching module is used for matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and the output module is used for outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
The embodiment of the invention also provides a non-transitory computer storage medium, wherein the computer storage medium stores computer executable instructions which can execute the online multi-target tracking method in any method embodiment.
The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kind described above.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An on-line multi-target tracking method is characterized by comprising the following steps:
collecting a current frame image containing a target to be detected;
acquiring the bounding box of the target to be detected and its confidence from the current frame image;
matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
2. The on-line multi-target tracking method according to claim 1, wherein matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected further comprises:
dividing the targets to be detected into high-confidence targets and low-confidence targets according to the confidence of their bounding boxes;
performing first-stage matching between the high-confidence targets to be detected and the confirmed tracking objects in the preset tracking list, and outputting the targets to be detected unmatched in the first stage, the tracking objects unmatched in the first stage, and the tracking detection pairs matched in the first stage;
forming a second-stage target list from the targets to be detected unmatched in the first stage and the low-confidence targets to be detected, and forming a second-stage tracking list from the tracking objects unmatched in the first stage and the unconfirmed tracking objects in the preset tracking list;
performing second-stage matching between the targets to be detected in the second-stage target list and the tracking objects in the second-stage tracking list, and outputting the targets to be detected unmatched in the second stage, the tracking objects unmatched in the second stage, and the tracking detection pairs matched in the second stage;
and performing third-stage matching, in a three-dimensional coordinate system, between the targets to be detected and the tracking objects unmatched in the second stage, and outputting the targets to be detected and tracking objects that remain unmatched in the third stage together with the tracking detection pairs matched in the third stage.
3. The on-line multi-target tracking method according to claim 2, wherein the outputting of the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further comprises:
comparing the confidence of the target to be detected that remains unmatched in the third stage with a set threshold;
if it is higher than the set threshold, creating a new tracking object from the target to be detected;
and if it is lower than the set threshold, discarding the target to be detected.
4. The on-line multi-target tracking method according to claim 2, wherein the matching of the target to be detected with the tracked object in a preset tracking list further comprises:
calculating the similarity between the target to be detected and the tracked object in the preset tracking list to obtain a matching result between the target to be detected and the tracked object in the preset tracking list;
calculating the similarity of the first stage and the second stage as the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object, and outputting the matching results of the first and second stages according to the calculated IOU;
and for the similarity calculation of the third-stage matching, acquiring in advance the 3D pose information of the targets to be detected and of the tracking objects unmatched in the second stage, calculating the Mahalanobis distance between them in a camera coordinate system or a vehicle coordinate system, and finally outputting the matching result of the third stage according to the Mahalanobis distance between the target to be detected and the tracking object in the camera coordinate system or the vehicle coordinate system.
5. The on-line multi-target tracking method according to claim 4, wherein the similarity calculations of the first stage and the second stage, both computing the intersection-over-union (IOU) between the bounding box of the target to be detected and the bounding box of the tracking object, further comprise:
predicting bounding box information of the tracking object in the first-stage and second-stage tracking lists through a 2D Kalman filter, and outputting bounding box prediction information of the tracking object;
and calculating the intersection-over-union (IOU) between the bounding box information of the target to be detected and the bounding box prediction information of the tracking object.
6. The on-line multi-target tracking method according to claim 4, wherein, for the similarity calculation of the third-stage matching, after the 3D pose information of the targets to be detected and of the tracking objects unmatched in the second stage has been acquired, the method further comprises:
establishing a 3D Kalman filter according to the 3D pose information of the target to be detected;
predicting the 3D pose information of the tracking object in the third stage through the 3D Kalman filter, and outputting the 3D pose prediction information of the tracking object;
and calculating the similarity between the 3D pose information of the target to be detected and the 3D pose prediction information of the tracking object.
7. The on-line multi-target tracking method according to claim 5 or 6, wherein the Kalman filter equations are as follows:
The time update equations:

$$\hat{x}_k^- = A\hat{x}_{k-1} + Bu_{k-1}$$

$$P_k^- = AP_{k-1}A^T + Q$$

The state update equations:

$$K_k = P_k^- H^T (HP_k^- H^T + R)^{-1}$$

$$\hat{x}_k = \hat{x}_k^- + K_k(z_k - H\hat{x}_k^-)$$

$$P_k = (I - K_k H)P_k^-$$

wherein $\hat{x}_{k-1}$ and $\hat{x}_k$ are the posterior state estimates, i.e. the updated results, at times k-1 and k respectively; $\hat{x}_k^-$ is the prior state estimate at time k, an intermediate result of the filter, i.e. the prediction of time k from the optimal estimate at time k-1, given by the prediction equation;
wherein $P_{k-1}$ and $P_k$ are the covariances of the posterior estimates $\hat{x}_{k-1}$ and $\hat{x}_k$ at times k-1 and k respectively, represent the uncertainty of the state, and are among the outputs of the filter;
wherein $P_k^-$ is the prior error covariance at time k, an intermediate result of the filter; $K_k$ is the gain matrix, i.e. the Kalman gain, an intermediate result of the filter; Q is the process excitation noise covariance, representing the error between the state transition matrix and the actual process; R is the measurement noise covariance;
$z_k - H\hat{x}_k^-$ is the residual between the actual and predicted observations, which together with the Kalman gain corrects the prior to obtain the posterior; A is the state transition matrix and B the control matrix, both related to the constructed motion model; H is the measurement matrix; $z_k$ is the observed quantity at time k.
8. The on-line multi-target tracking method according to claim 1, wherein the outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list further comprises:
judging whether the tracking objects in the preset tracking list are successfully matched with the target to be detected or not;
if so, updating the tracking objects in the preset tracking list by using the paired targets to be detected, accumulating the tracking confirmation time of the paired tracking objects, and outputting the tracking states of the paired tracking objects according to the accumulated tracking confirmation time of the paired tracking objects; wherein the tracking state of the tracked object comprises a confirmed state and an unconfirmed state;
if not, further judging the tracking loss time of the unpaired tracking object in the preset tracking list, if the tracking loss time is greater than a set threshold, deleting the unpaired tracking object from the preset tracking list, and if the tracking loss time is less than or equal to the set threshold, outputting the predicted value of the unpaired tracking object to a back-end module.
9. An online multi-target tracking device, comprising:
the collection module is used for collecting a current frame image containing a target to be detected;
the acquisition module is used for acquiring, from the current frame image, the bounding box of the target to be detected and the confidence of the bounding box;
the matching module is used for matching the target to be detected with the tracking objects in a preset tracking list according to the confidence of the bounding box of the target to be detected;
and the output module is used for outputting the tracking result of the target to be detected according to the matching result between the target to be detected and the tracking object in the preset tracking list.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the steps of the method of any one of claims 1-8.
CN202210768721.1A 2022-07-01 2022-07-01 Online multi-target tracking method and device and storage medium Pending CN115170602A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210768721.1A CN115170602A (en) 2022-07-01 2022-07-01 Online multi-target tracking method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210768721.1A CN115170602A (en) 2022-07-01 2022-07-01 Online multi-target tracking method and device and storage medium

Publications (1)

Publication Number Publication Date
CN115170602A true CN115170602A (en) 2022-10-11

Family

ID=83490013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210768721.1A Pending CN115170602A (en) 2022-07-01 2022-07-01 Online multi-target tracking method and device and storage medium

Country Status (1)

Country Link
CN (1) CN115170602A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination