CN117237401A - Multi-target tracking method, system, medium and equipment for fusion of image and point cloud - Google Patents

Multi-target tracking method, system, medium and equipment for fusion of image and point cloud

Info

Publication number: CN117237401A (application); CN117237401B (granted)
Application number: CN202311473562.3A
Authority: CN
Other languages: Chinese (zh)
Legal status: Granted; Active
Prior art keywords: track, target, association, dimensional, cost
Inventors: 陈雪梅, 徐泽源, 薛杨武, 肖龙, 赵小萱, 沈晓旭
Assignees: Beijing Institute of Technology BIT; Advanced Technology Research Institute of Beijing Institute of Technology
Application filed 2023-11-08 by Beijing Institute of Technology BIT and Advanced Technology Research Institute of Beijing Institute of Technology
Priority: CN202311473562.3A, 2023-11-08
Publication of CN117237401A: 2023-12-15
Publication of CN117237401B (grant): 2024-02-13

Classifications

    • Y02T 10/40 — Engine management systems (under Y02T: climate change mitigation technologies related to transportation; Y02T 10/00: road transport of goods or passengers; Y02T 10/10: internal combustion engine [ICE] based vehicles)

Landscapes

  • Image Analysis (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention relates to the technical field of multi-target tracking and discloses a multi-target tracking method, system, medium and equipment for fusing images and point clouds. The method comprises: acquiring an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target; predicting the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then performing multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time; and updating the current-time track and adding it to the three-dimensional track library or the two-dimensional track library. The multi-level association constructs an association matrix using a geometric-aware cost, which comprises a Euclidean distance cost, a target direction cost and a multi-category cost. The time consumed by multi-target tracking is thereby reduced.

Description

Multi-target tracking method, system, medium and equipment for fusion of image and point cloud
Technical Field
The invention relates to the technical field of multi-target tracking, and in particular to a multi-target tracking method, system, medium and equipment for fusing images and point clouds.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
As automobiles develop toward intelligence and new energy, intelligent driving has become an important direction of current and future automotive development. It is significant for relieving traffic congestion, improving traffic efficiency, reducing traffic accidents caused by driver negligence, and advancing technological capability. Taking intelligent-driving energy optimization and emission management as an example, such technology can optimize fuel consumption during driving and save nearly 10 percent of fuel. Active safety technologies that improve driving safety, such as lane keeping, lane departure warning and automatic emergency braking, can warn the driver and actively intervene (e.g. brake) before an accident occurs, mitigating accidents caused by driver negligence or by factors such as fatigue. These are functions achievable today; in the final form of intelligent driving, the vehicle is controlled without supervision by a human driver, uncontrollable human factors are avoided entirely, the driver's time is freed for other activities, and travel efficiency is greatly improved.
Although many excellent methods exist for intelligent-driving target detection and tracking, most are very time-consuming and fail to balance precision and speed well; in particular, they are difficult to deploy and apply on intelligent-driving computing platforms with limited computational power. Single-modality tracking methods suffer from insufficient association and poor environmental adaptability, while multi-modality multi-target tracking methods often employ complex cost functions that incur large time costs.
Disclosure of Invention
To solve the above problems, the invention provides a multi-target tracking method, system, medium and equipment for fusing images and point clouds, which discard the time-consuming three-dimensional intersection-over-union (3D IoU) cost function and instead build a lighter cost function from geometric information of the three-dimensional detection frame, such as position, size and rotation, reducing the time consumed by multi-target tracking.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a first aspect of the present invention provides a multi-target tracking method of image and point cloud fusion, comprising:
acquiring an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target;

predicting the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then performing multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time;

updating the current-time track based on the associated detections at the current time, and adding it to the three-dimensional track library or the two-dimensional track library;

wherein the multi-level association constructs an association matrix using a geometric-aware cost, the geometric-aware cost comprising a Euclidean distance cost, a target direction cost and a multi-category cost; the Euclidean distance cost is the ratio of the Euclidean distance between two associated objects to the target size; the target direction cost is determined from the rotation-angle difference of the two associated objects by means of a trigonometric function; and the multi-category cost controls the weight of the target direction cost within the geometric-aware cost.
Further, the prediction and update of the current-time track are based on a Kalman filter algorithm and a constant-velocity motion model.

Further, different association levels of the multi-level association employ different thresholds.

Further, the first level of the multi-level association associates, based on the geometric-aware cost, the fused target with the current-time track predicted from the previous-time track stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

Further, the second level of the multi-level association comprises: re-associating, based on the geometric-aware cost, the detections left unassociated by the first level with the current-time tracks predicted from the previous-time tracks stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

Further, the second level of the multi-level association comprises: associating, based on the geometric-aware cost, the unfused three-dimensional target with the current-time track predicted from the previous-time track stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

Further, the third level of the multi-level association associates, based on the intersection over union, the unfused two-dimensional target with the current-time track predicted from the previous-time track stored in the two-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

Further, the method further comprises: projecting the three-dimensional tracks in the three-dimensional track library onto the image plane, associating the projected tracks with the two-dimensional tracks in the two-dimensional track library, and deleting the associated two-dimensional tracks from the two-dimensional track library.
A second aspect of the invention provides a multi-target tracking system for image and point cloud fusion, comprising:
a detection fusion module configured to: acquire an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target;

an association module configured to: predict the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then perform multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time;

a track update module configured to: update the current-time track based on the associated detections at the current time, and add it to the three-dimensional track library or the two-dimensional track library;

wherein the multi-level association constructs an association matrix using a geometric-aware cost, the geometric-aware cost comprising a Euclidean distance cost, a target direction cost and a multi-category cost; the Euclidean distance cost is the ratio of the Euclidean distance between two associated objects to the target size; the target direction cost is determined from the rotation-angle difference of the two associated objects by means of a trigonometric function; and the multi-category cost controls the weight of the target direction cost within the geometric-aware cost.
A third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the multi-target tracking method for fusion of images and point clouds described above.

A fourth aspect of the invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the multi-target tracking method for fusion of images and point clouds described above when executing the program.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a multi-target tracking method for fusion of images and point clouds, which is characterized in that in the design of a cost function dependent on association, a time-consuming 3D (three-dimensional) IoU cost function is abandoned, a lighter cost function is constructed by depending on geometric information such as the position, the size and the rotation of a 3D detection frame, and finally, the association of front and rear frame detection is completed by utilizing Hungary matching, so that the time consumption is reduced.
The invention provides a multi-target tracking method for fusion of images and point clouds, which utilizes a fusion detection algorithm to obtain a more accurate 3D fusion detection frame, constructs a four-level association structure together with an unfused detection frame, adopts different thresholds at different association stages to fully associate detection of front and rear frames, and adapts to different environments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
Fig. 1 is a flowchart of a multi-target tracking method of image and point cloud fusion according to a first embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The embodiments of the present invention and features of the embodiments may be combined with each other without conflict, and the present invention will be further described with reference to the drawings and embodiments.
Example 1
An object of the first embodiment is to provide a multi-target tracking method of image and point cloud fusion.
The multi-target tracking method for image and point cloud fusion provided by this embodiment seeks to alleviate the above problems through fused image and point cloud tracking, while maintaining the real-time performance of the tracking algorithm so as to meet the requirements of practical applications.

As shown in fig. 1, the method comprises fusion detection, association, track update and prediction, and track library management. An IoU-Aware Fusion (IAFusion) algorithm is relied on to obtain fused targets, unfused three-dimensional (3D) targets and unfused two-dimensional (2D) targets. Multi-level track association is performed on the fusion results, with four association levels used to match them fully. According to the characteristics of the different fusion results, an appearance-free cost function with short inference time is designed, which focuses on geometric properties of the detection frames. Track update and prediction use a Kalman filter with a constant-velocity motion model to update the predicted tracks. Track library management maintains a 3D track library and a 2D track library, which manage the 3D tracks and the 2D tracks respectively.
Step 1: fusion detection.
Specifically, an image and a point cloud of the target to be tracked at the current time are acquired, and a fused target, an unfused 3D target and an unfused 2D target are obtained through fusion detection.

Fusion detection comprises: detecting the image of the target to be tracked to obtain 2D detection results; detecting the point cloud of the target to be tracked to obtain 3D detection results; and preliminarily fusing the 2D and 3D detection results through a fusion algorithm.

The fusion algorithm projects the 3D detection results onto the image plane, computes IoU (Intersection over Union) values against the 2D detection frames (2D detection results) to obtain an association matrix, solves the association matrix with Hungarian matching and a greedy algorithm, and finally applies an IoU threshold to the solution to obtain the final fusion result. Three types of targets generally appear after fusion: fused results carrying both 3D and 2D information (fused targets), point cloud detections with only 3D information (unfused 3D targets), and image detections with only 2D information (unfused 2D targets). Different tracking processing is performed according to the different fusion results, and the multi-target tracking method is designed around these three results.
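For concreteness, the matching step just described can be sketched with NumPy and SciPy as follows. This is a minimal sketch, assuming the 3D detections have already been projected to axis-aligned image-plane boxes in (x1, y1, x2, y2) form; the function names and the example threshold are illustrative, not identifiers from the patent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    lt = np.maximum(a[:2], b[:2])          # intersection top-left
    rb = np.minimum(a[2:], b[2:])          # intersection bottom-right
    wh = np.clip(rb - lt, 0.0, None)       # clamp to zero if disjoint
    inter = wh[0] * wh[1]
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_detections(proj_3d_boxes, boxes_2d, iou_thresh=0.5):
    """Match projected 3D detections to 2D detections by IoU + Hungarian matching.

    Returns (fused_pairs, unfused_3d_indices, unfused_2d_indices)."""
    if not len(proj_3d_boxes) or not len(boxes_2d):
        return [], list(range(len(proj_3d_boxes))), list(range(len(boxes_2d)))
    iou = np.array([[iou_2d(p, q) for q in boxes_2d] for p in proj_3d_boxes])
    rows, cols = linear_sum_assignment(-iou)            # maximize total IoU
    fused = [(i, j) for i, j in zip(rows, cols) if iou[i, j] >= iou_thresh]
    f3 = {i for i, _ in fused}
    f2 = {j for _, j in fused}
    only3d = [i for i in range(len(proj_3d_boxes)) if i not in f3]
    only2d = [j for j in range(len(boxes_2d)) if j not in f2]
    return fused, only3d, only2d
```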
Step 2: association.
The current-time track is predicted from the previous-time track stored in the 3D track library or the 2D track library, and multi-level association is performed between the fused targets, the unfused 3D targets, the unfused 2D targets and the predicted tracks to obtain the associated detections at the current time.

For the three fusion detection results (fused targets, unfused 3D targets and unfused 2D targets), multi-level association is adopted so that targets in consecutive frames are fully associated: hierarchically associating high- and low-confidence detections matches the more reliable detections preferentially, an approach whose effectiveness is demonstrated by the ByteTrack (Multi-Object Tracking by Associating Every Detection Box) algorithm.

This embodiment partially modifies the four-level association of DeepFusionMOT (a 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion with Deep Association), introduces different association thresholds, and constructs a 3D track library and a 2D track library for the fused targets; the structure of the four-level association is shown in fig. 1.
(1) First-level association: fused target association. The first level associates, based on the geometric-aware cost, the fused targets with the current-time tracks predicted from the previous-time tracks stored in the three-dimensional track library, obtaining associated tracks, associated detections, unassociated tracks and unassociated detections.

Because a fused target is detected by both the image and the point cloud, its reliability is high, so it is associated preferentially. The cost function used by the first level is the geometric-aware cost function. The first level produces three kinds of results: associated tracks and detections, unassociated tracks, and unassociated detections (unassociated fused targets). Subsequent track update and prediction are performed for the associated tracks and detections, while the unassociated tracks and detections continue to the second association level.
(2) Second-level association: unfused point cloud 3D detection association. The second level mainly associates the point cloud detections left unfused by fusion detection (the unfused 3D targets).

Specifically, the second level comprises: re-associating, based on the geometric-aware cost, the detections left unassociated by the first level with the current-time tracks predicted from the previous-time tracks stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections; and associating, based on the geometric-aware cost, the unfused 3D targets with the current-time tracks predicted from the previous-time tracks stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

The unfused 3D targets are associated in the second level because they are detected by only one sensor (the point cloud) and their reliability is relatively low. In addition, fused detections not matched in the first level (the unassociated detections) are associated again at this level; the cost function is again the geometric-aware cost function.

The second level likewise produces three kinds of results: associated tracks and detections, unassociated tracks, and unassociated detections.

In theory, detections and tracks that fail to match in the first level are also difficult to match in the second. This embodiment therefore sets different association thresholds for the two levels: a strict threshold for the first level and a looser one for the second, e.g. 1.0 in the first level and 1.3 in the second in the experiments, which alleviates matching failures caused by sudden target acceleration or by error accumulation after a target disappears for a period of time.
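The threshold-gated cascade can be sketched as a single reusable assignment step applied twice with different thresholds. The helper below is one plausible reading of the scheme (the names and the gating rule are assumptions); the cost matrices are supplied by the geometric-aware cost described in the next subsection.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cost, thresh):
    """Hungarian assignment gated by a per-level cost threshold.

    cost: 2D array of shape (num_detections, num_tracks).
    Returns (matches, unmatched_det_idx, unmatched_trk_idx)."""
    cost = np.asarray(cost, dtype=float)
    rows, cols = linear_sum_assignment(cost)
    matches = [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= thresh]
    mdet = {i for i, _ in matches}
    mtrk = {j for _, j in matches}
    unmatched_det = [i for i in range(cost.shape[0]) if i not in mdet]
    unmatched_trk = [j for j in range(cost.shape[1]) if j not in mtrk]
    return matches, unmatched_det, unmatched_trk

# Level 1: fused detections vs. predicted 3D tracks, strict threshold;
# Level 2: level-1 leftovers plus unfused 3D detections vs. leftover
# tracks, looser threshold (1.0 and 1.3 respectively in the experiments).
```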
(3) Third-level association: unfused image 2D detection association. The third level mainly associates the image detections left unfused by fusion (the unfused 2D targets). That is, based on the intersection over union, the unfused 2D targets are associated with the current-time tracks predicted from the previous-time tracks stored in the two-dimensional track library, obtaining associated tracks, associated detections, unassociated tracks and unassociated detections.

Because unfused 2D targets lack 3D information, the third level constructs 2D tracks, and the association cost function is the common IoU cost.

2D tracks are constructed mainly because point cloud detection is weak at long range, whereas image detection remains more accurate there. Therefore, for unfused image 2D detections, a 2D track is constructed first; once the corresponding 3D information is acquired, the track is moved to the 3D track library. Since a partial tracked segment already exists at that point, establishing the 3D track is more definite.
(4) Fourth-level association: 3D track library to 2D track library association. The purpose of the fourth level is to associate 3D tracks with 2D tracks; an associated 2D track is moved to the 3D track library for management.

The fourth level projects the 3D tracks onto the image plane and judges matches by the IoU between the projected 2D frames and the 2D track detection frames of the image (i.e. the projected tracks are associated with the tracks in the two-dimensional track library); an associated 2D track is deleted from the 2D track library and managed and tracked in the 3D track library.
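A minimal sketch of the projection underlying the fourth level is given below, assuming a KITTI-style 3×4 camera projection matrix and yaw about the vertical axis; the corner ordering and axis conventions are assumptions, since real sensor calibrations vary. The projected box can then be compared against 2D track frames with the same `iou_2d` helper shown earlier.

```python
import numpy as np

def box3d_corners(x, y, z, w, h, l, theta):
    """Eight corners of a 3D box (center x, y, z; size w, h, l; yaw theta)."""
    dx, dy, dz = l / 2.0, w / 2.0, h / 2.0
    corners = np.array([[sx * dx, sy * dy, sz * dz]
                        for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],          # yaw rotation about the z axis
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return corners @ rot.T + np.array([x, y, z])

def project_to_image(corners, P):
    """Project 3D corners with a 3x4 projection matrix; return (x1, y1, x2, y2)."""
    pts = np.hstack([corners, np.ones((8, 1))]) @ P.T   # homogeneous projection
    pts = pts[:, :2] / pts[:, 2:3]                      # perspective divide
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return np.array([x1, y1, x2, y2])
```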
Geometric-aware cost function used in the first and second association levels: to associate targets across consecutive frames, a relationship must be established between the tracks of previous frames and the detections of the current frame, and this relationship is typically governed by a cost function. For 2D detections, association between frames is usually established through the IoU of the detection frames and their appearance features: the IoU distinguishes targets by detection position, and when several targets overlap closely, appearance features are needed to distinguish them further. The situation is similar for point cloud 3D detections, where a 3D IoU and point cloud appearance could be used; however, due to the sparsity of point cloud data, appearance features of point clouds generally bring less benefit than image features. Moreover, 3D targets are free from occlusion and carry richer geometric position information, so IoU information alone usually achieves a good effect. Yet computing the 3D IoU is relatively time-consuming because detection frames in 3D space are rotated, and the IoU of non-overlapping frames is 0, which easily causes ID switches for targets that suddenly accelerate, whose detected and predicted positions do not overlap in the current frame, or that have disappeared for several frames. Therefore, a faster association function is designed from the raw geometric information of the 3D detection frame, using the center distance, size and rotation angle in the BEV (Bird's Eye View), to construct the association matrix.
Specifically, the detections at the current time $t$ (the fused or unfused 3D targets obtained in step 1) are defined as $D_t = \{D_t^1, D_t^2, \ldots, D_t^{N_t}\}$, where $N_t$ is the number of detected targets (i.e. the number of fused targets or unfused 3D targets obtained) and $D_t^i = (x_t^i, y_t^i, z_t^i, w_t^i, h_t^i, l_t^i, \theta_t^i, c_t^i)$ is the $i$-th target detection frame (detection frame for short in this embodiment) at the current time $t$: $(x_t^i, y_t^i, z_t^i)$ is its center position, $(w_t^i, h_t^i, l_t^i)$ its width, height and length, $\theta_t^i$ its direction angle, and $c_t^i$ the class of the object in the frame.

The prediction frames at the current time $t$ of the tracks associated in previous frames (the detection frames obtained by Kalman filter track prediction, i.e. the current-time tracks) are defined as $T_t = \{T_t^1, T_t^2, \ldots, T_t^{M_t}\}$, where $M_t$ is the number of established and active tracks at the current time $t$ and $T_t^j = (\hat{x}_t^j, \hat{y}_t^j, \hat{z}_t^j, \hat{w}_t^j, \hat{h}_t^j, \hat{l}_t^j, \hat{\theta}_t^j, \hat{c}_t^j)$ is the $j$-th target track prediction frame at time $t$: $(\hat{x}_t^j, \hat{y}_t^j, \hat{z}_t^j)$ is its center position, $(\hat{w}_t^j, \hat{h}_t^j, \hat{l}_t^j)$ its width, height and length (the target size), $\hat{\theta}_t^j$ its direction angle, and $\hat{c}_t^j$ the class of the object in the frame. An association matrix can be constructed from the parameters of the detection frames and prediction frames of the current frame $t$ to match identical objects. The geometric-aware cost function constructed in this embodiment has three parts: a Euclidean distance cost based on target size, a target direction cost, and a multi-category cost.
(1) Euclidean distance cost function based on target size.

The Euclidean distance cost function based on target size (Euclidean distance cost for short) is the ratio of the Euclidean distance between two associated objects to the target size.

This embodiment associates in the BEV and uses as its main association cost the Euclidean distance with height information discarded, i.e. the BEV center distance between the two associated objects (detection frame and prediction frame). Moreover, since targets such as vehicles, pedestrians and cyclists differ greatly in length, width and height, a plain Euclidean distance would require a separate threshold for each class, which increases the tuning cost. The distance is therefore normalized by the length of the target detection frame so as to adapt to targets of different sizes, finally giving the size-based Euclidean distance association function, specifically defined as:

$$C_{\mathrm{dist}}^{i,j} = \frac{\sqrt{\left(x_t^i - \hat{x}_t^j\right)^2 + \left(y_t^i - \hat{y}_t^j\right)^2}}{l_t^i}$$
(2) Cost function based on the direction angle.

To further distinguish different objects, the direction angle of the detection frame is also considered. This embodiment determines the direction cost between two frames by a trigonometric function of their rotation-angle difference; the direction cost between the $i$-th detection frame and the $j$-th prediction frame is defined as:

$$C_{\mathrm{dir}}^{i,j} = \sin^2\!\left(\frac{\theta_t^i - \hat{\theta}_t^j}{2}\right)$$
(3) Multi-category cost function.

The multi-category cost controls the weight of the target direction cost within the geometric-aware cost: in this embodiment, when the categories of the two associated objects differ, the weight of the target direction cost is 0, i.e. the direction cost is not added to the geometric-aware cost.

To prevent interference between classes in multi-class object tracking, a multi-category cost is added on top of the above cost functions. It brings no additional penalty for detection frames of the same class, but to prevent objects of different classes from being matched together, a higher penalty cost is given; specifically, the multi-category cost is defined as:

$$C_{\mathrm{cls}}^{i,j} = \begin{cases} 0, & c_t^i = \hat{c}_t^j \\ K, & c_t^i \neq \hat{c}_t^j \end{cases}$$

where $K$ is a large penalty constant.
the cost function used in this embodiment is based on a combination of the above cost functions, i.e. the geometric sense cost function is specifically defined as:
based on the geometric sense cost function, a target association cost matrix (association matrix) of the front frame and the rear frame can be constructedWherein->The number of targets currently detected and the number of tracking tracks currently detected respectively.
Based on the target association cost matrix, the correspondence between targets is obtained by solving it with Hungarian matching or a greedy algorithm.
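Putting the three terms together, the following NumPy sketch builds the geometric-aware cost matrix and solves it. The $\sin^2$ direction term and the `BIG_PENALTY` constant are assumed instantiations of the trigonometric cost and the "higher penalty cost", since the original formula images could not be recovered; the resulting matrix can also be fed to the `associate` helper sketched earlier.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG_PENALTY = 1e6   # assumed value of the cross-class penalty K

def geometric_cost(dets, preds):
    """Geometric-aware cost matrix between detections and track predictions.

    dets, preds: rows of (x, y, z, w, h, l, theta, cls).
    Combines the size-normalized BEV Euclidean distance, a trigonometric
    direction cost (added only for same-class pairs) and the multi-category
    penalty."""
    dets, preds = np.asarray(dets, float), np.asarray(preds, float)
    cost = np.zeros((len(dets), len(preds)))
    for i in range(len(dets)):
        for j in range(len(preds)):
            # (1) BEV center distance, normalized by the detection length l
            d = np.hypot(dets[i, 0] - preds[j, 0], dets[i, 1] - preds[j, 1])
            c_dist = d / dets[i, 5]
            # (2) direction cost from the rotation-angle difference
            c_dir = np.sin((dets[i, 6] - preds[j, 6]) / 2.0) ** 2
            same_class = dets[i, 7] == preds[j, 7]
            # (3) multi-category cost: gates the direction term and
            #     penalizes cross-class pairs
            c_cls = 0.0 if same_class else BIG_PENALTY
            cost[i, j] = c_dist + (c_dir if same_class else 0.0) + c_cls
    return cost

# Example: one detection and one same-class track prediction
dets  = [[0.0, 0.0, 0.0, 1.8, 1.6, 4.2, 0.0, 1]]
preds = [[0.5, 0.2, 0.0, 1.8, 1.6, 4.2, 0.1, 1]]
rows, cols = linear_sum_assignment(geometric_cost(dets, preds))
```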
Step 4: track update and prediction. The track state is predicted and updated based on the Kalman filter and the constant-velocity motion model.

Specifically, before the first-, second- or third-level association, the current-time track $\hat{T}_t$ is predicted from the previous-time track $T_{t-1}$ stored in the 3D track library or the 2D track library; the first-, second- or third-level association is then performed on the predicted track $\hat{T}_t$, yielding the associated detection $D_t$ at time $t$; based on the associated detection $D_t$ and the predicted track $\hat{T}_t$, the track at time $t$ is updated through the Kalman filter update stage, and the updated track is added to the 3D track library or the 2D track library.
For an associated track, its state at the current time needs to be updated. Two versions of the current-time state are available: the state predicted by the track from its state at the previous time, and the target state detected by the sensor at the current time. Both carry errors; the error of the predicted state is relatively larger, and that of the current sensor detection relatively smaller. To obtain a more accurate and smoother trajectory, this embodiment uses the Kalman filter (KF) algorithm to update the final track state.

Before updating the state of an object, its motion model and motion state must first be determined. This embodiment adopts the constant-velocity motion model, i.e. the velocity of an object is considered constant over a short period, which approximately holds within one LiDAR data acquisition period. For a pair of track and detection already associated at the current time:

the predicted state of the tracking track is defined as $\hat{\mathbf{x}} = (x, y, z, w, h, l, \theta, v_x, v_y, v_z)^{T}$, representing the predicted state of the track at the current frame, where $(x, y, z)$ is the center position of the prediction frame (the current-time track), $(w, h, l)$ its size, $\theta$ its direction angle, and the last three parameters $(v_x, v_y, v_z)$ are the track's velocities in the three directions;

the detected state is $\mathbf{z} = (x, y, z, w, h, l, \theta)^{T}$, where $(x, y, z)$ is the center position of the detection frame (fused target, unfused three-dimensional target or unfused two-dimensional target), $(w, h, l)$ its size, and $\theta$ its direction angle; the velocity of a target cannot be detected directly, so the current velocity is computed from the track's position in the previous frame.

Based on the above state parameters, this embodiment employs the Kalman filter algorithm to determine the final state.
The Kalman filter algorithm is divided into two stages, prediction and update: the prediction stage predicts the motion state and error covariance of the track at the current frame, and the update stage confirms the final state from the predicted and observed states. Specifically, from the state $\mathbf{x}_{t-1}$ at the previous time and its error covariance $P_{t-1}$, the predicted state $\hat{\mathbf{x}}_t$ and predicted error covariance $\hat{P}_t$ at the current time are obtained; the Kalman filter prediction stage is:

$$\hat{\mathbf{x}}_t = A\,\mathbf{x}_{t-1}, \qquad \hat{P}_t = A P_{t-1} A^{T} + Q,$$

where $A$ is the state transition matrix and $Q$ is the process noise covariance matrix.

The main equations of the Kalman filter update stage are:

$$K_t = \hat{P}_t H^{T}\left(H \hat{P}_t H^{T} + R\right)^{-1}, \qquad \mathbf{x}_t = \hat{\mathbf{x}}_t + K_t\left(\mathbf{z}_t - H \hat{\mathbf{x}}_t\right), \qquad P_t = \left(I - K_t H\right)\hat{P}_t,$$

where $H$ is the measurement matrix used for spatial conversion between prediction and measurement, $K_t$ is the Kalman gain, $R$ is the measurement noise covariance, and $I$ is the identity matrix.

With the motion model and parameters selected in this embodiment, the state transition matrix $A$ and the measurement matrix $H$ are defined as:

$$A = \begin{bmatrix} I_{7\times7} & G \\ 0_{3\times7} & I_{3\times3} \end{bmatrix}, \quad G = \begin{bmatrix} \Delta t\, I_{3\times3} \\ 0_{4\times3} \end{bmatrix}, \qquad H = \begin{bmatrix} I_{7\times7} & 0_{7\times3} \end{bmatrix},$$

where $\Delta t$ is the acquisition period of the sensor, $I_{n\times n}$ is an identity matrix and $0_{m\times n}$ is a zero matrix.

The process noise covariance matrix $Q$ and the measurement noise covariance matrix $R$ are set to:

$$Q = q\, I_{10\times10}, \qquad R = r\, I_{7\times7},$$

where $q$ and $r$ are the weights of the process noise and measurement noise covariance matrices, set to 0.01 and 1.0 respectively in the experiments.
The initial error covariance $P_0$ is initialized as a diagonal matrix.
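The constant-velocity Kalman filter described by these formulas can be written out compactly. The sketch below uses the q = 0.01 and r = 1.0 weights from the text, while the 10 Hz acquisition period and the identity initial covariance are assumptions.

```python
import numpy as np

DT = 0.1                          # sensor acquisition period (assumed 10 Hz LiDAR)
Q_WEIGHT, R_WEIGHT = 0.01, 1.0    # noise weights from the experiments

# State: (x, y, z, w, h, l, theta, vx, vy, vz); measurement: first 7 entries.
A = np.eye(10)
A[0, 7] = A[1, 8] = A[2, 9] = DT  # constant-velocity position update
H = np.hstack([np.eye(7), np.zeros((7, 3))])
Q = Q_WEIGHT * np.eye(10)
R = R_WEIGHT * np.eye(7)

def kf_predict(x, P):
    """Prediction stage: propagate the state and error covariance."""
    return A @ x, A @ P @ A.T + Q

def kf_update(x_pred, P_pred, z):
    """Update stage: fuse the predicted state with the measured detection."""
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(10) - K @ H) @ P_pred
    return x, P

# Example: one predict/update cycle for a single track
x0 = np.zeros(10)                 # initial state
P0 = np.eye(10)                   # assumed diagonal initial error covariance
x_pred, P_pred = kf_predict(x0, P0)
z = np.zeros(7)                   # a measured detection (x, y, z, w, h, l, theta)
x_new, P_new = kf_update(x_pred, P_pred, z)
```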
Step 5: fusion-based 3D and 2D track library management.

To track the fused 2D and 3D targets, a 3D track library and a 2D track library are constructed respectively. The track libraries manage the tracks built during tracking; tracks must be managed through initialization, temporary disappearance due to occlusion, targets leaving the field of view, and similar situations.

The basic track management policy of this embodiment is: a track must be associated for 3 consecutive frames before it is confirmed as a valid track, to avoid the effect of false detections as far as possible; conversely, a track unassociated for 25 consecutive frames is considered to have left the field of view and is deleted. On top of this basic policy, detections with higher scores are considered more reliable, and for such detections the track is treated as confirmed as soon as it is established. The same policy is used for 3D and 2D track management. Furthermore, according to the fourth-level association, if a 2D track finds a corresponding track in the 3D track library, the 2D track is deleted and a confirmed track is established in the 3D track library.
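The lifecycle policy can be sketched as a small per-track state machine. The counters follow the text (3 consecutive hits to confirm, 25 consecutive misses to delete), while the high-score fast-confirmation threshold is an assumed parameter.

```python
CONFIRM_HITS = 3      # consecutive associated frames required to confirm a track
MAX_MISSES = 25       # consecutive unassociated frames before deletion
HIGH_SCORE = 0.7      # assumed detection-score threshold for fast confirmation

class TrackState:
    def __init__(self, score):
        self.hits, self.misses = 1, 0
        # high-score detections are treated as confirmed immediately
        self.confirmed = score >= HIGH_SCORE

    def mark_associated(self):
        self.hits += 1
        self.misses = 0
        if self.hits >= CONFIRM_HITS:
            self.confirmed = True

    def mark_missed(self):
        self.misses += 1
        if not self.confirmed:
            self.hits = 0   # a miss breaks the consecutive-association chain

    @property
    def should_delete(self):
        return self.misses >= MAX_MISSES
```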
Example 2
An object of the second embodiment is to provide a multi-target tracking system for fusing an image and a point cloud, including:
a detection fusion module configured to: acquire an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target;

an association module configured to: predict the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then perform multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time;

a track update module configured to: update the current-time track based on the associated detections at the current time, and add it to the three-dimensional track library or the two-dimensional track library;

a track library management module configured to: project the three-dimensional tracks in the three-dimensional track library onto the image plane, associate the projected tracks with the two-dimensional tracks in the two-dimensional track library, and delete the associated two-dimensional tracks from the two-dimensional track library;

wherein the multi-level association constructs an association matrix using a geometric-aware cost, the geometric-aware cost comprising a Euclidean distance cost, a target direction cost and a multi-category cost; the Euclidean distance cost is the ratio of the Euclidean distance between two associated objects to the target size; the target direction cost is determined from the rotation-angle difference of the two associated objects by means of a trigonometric function; and the multi-category cost controls the weight of the target direction cost within the geometric-aware cost.
It should be noted that each module in this embodiment corresponds one-to-one to the steps in the first embodiment, and the implementation process is the same, so it is not repeated here.
Example 3
This embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the multi-target tracking method for fusion of an image and a point cloud described in the first embodiment.
Example 4
This embodiment provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the multi-target tracking method for fusion of an image and a point cloud described in the first embodiment when executing the program.
The above description covers only the preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

While the foregoing embodiments of the present invention have been described with reference to the drawings, they do not limit the scope of protection; all modifications or variations within the scope defined by the claims of the present invention are intended to be covered.

Claims (10)

1. A multi-target tracking method for fusing an image and a point cloud, characterized by comprising:

acquiring an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target;

predicting the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then performing multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time;

updating the current-time track based on the associated detections at the current time, and adding it to the three-dimensional track library or the two-dimensional track library;

wherein the multi-level association constructs an association matrix using a geometric-aware cost, the geometric-aware cost comprising a Euclidean distance cost, a target direction cost and a multi-category cost; the Euclidean distance cost is the ratio of the Euclidean distance between two associated objects to the target size; the target direction cost is determined from the rotation-angle difference of the two associated objects by means of a trigonometric function; and the multi-category cost controls the weight of the target direction cost within the geometric-aware cost.
2. The multi-target tracking method for fusing an image and a point cloud according to claim 1, wherein different association levels of the multi-level association employ different thresholds.

3. The multi-target tracking method for fusing an image and a point cloud according to claim 1, wherein the prediction and update of the current-time track are based on a Kalman filter algorithm and a constant-velocity motion model.

4. The multi-target tracking method for fusing an image and a point cloud according to claim 1, wherein the first level of the multi-level association associates, based on the geometric-aware cost, the fused target with the current-time track predicted from the previous-time track stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections;

the second level of the multi-level association comprises: re-associating, based on the geometric-aware cost, the detections left unassociated by the first level with the current-time tracks predicted from the previous-time tracks stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

5. The multi-target tracking method for fusing an image and a point cloud according to claim 1, wherein the second level of the multi-level association comprises: associating, based on the geometric-aware cost, the unfused three-dimensional target with the current-time track predicted from the previous-time track stored in the three-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

6. The multi-target tracking method for fusing an image and a point cloud according to claim 1, wherein the third level of the multi-level association associates, based on the intersection over union, the unfused two-dimensional target with the current-time track predicted from the previous-time track stored in the two-dimensional track library, to obtain associated tracks, associated detections, unassociated tracks and unassociated detections.

7. The multi-target tracking method for fusing an image and a point cloud according to claim 1, further comprising: projecting the three-dimensional tracks in the three-dimensional track library onto the image plane, associating the projected tracks with the two-dimensional tracks in the two-dimensional track library, and deleting the associated two-dimensional tracks from the two-dimensional track library.
8. A multi-target tracking system for fusing an image and a point cloud, characterized by comprising:

a detection fusion module configured to: acquire an image and a point cloud of the target to be tracked at the current time to obtain a fused target, an unfused three-dimensional target and an unfused two-dimensional target;

an association module configured to: predict the current-time track from the previous-time track stored in a three-dimensional track library or a two-dimensional track library, and then perform multi-level association on the fused target, the unfused three-dimensional target and the unfused two-dimensional target to obtain the associated detections at the current time;

a track update module configured to: update the current-time track based on the associated detections at the current time, and add it to the three-dimensional track library or the two-dimensional track library;

wherein the multi-level association constructs an association matrix using a geometric-aware cost, the geometric-aware cost comprising a Euclidean distance cost, a target direction cost and a multi-category cost; the Euclidean distance cost is the ratio of the Euclidean distance between two associated objects to the target size; the target direction cost is determined from the rotation-angle difference of the two associated objects by means of a trigonometric function; and the multi-category cost controls the weight of the target direction cost within the geometric-aware cost.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the multi-target tracking method for fusing an image and a point cloud according to any one of claims 1-7.

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the multi-target tracking method for fusing an image and a point cloud according to any one of claims 1-7 when executing the program.

Priority Applications (1)

Application Number: CN202311473562.3A — Priority Date: 2023-11-08 — Filing Date: 2023-11-08 — Title: Multi-target tracking method, system, medium and equipment for fusion of image and point cloud
Publications (2)

Publication Number — Publication Date
CN117237401A (application) — 2023-12-15
CN117237401B (grant) — 2024-02-13

Family

ID: 89086319

Family Applications (1): CN202311473562.3A — Active — granted as CN117237401B

Country Status (1): CN — CN117237401B

Patent Citations (7)

* Cited by examiner, † Cited by third party

Publication number — Priority date — Publication date — Assignee — Title
CN111626217A * — 2020-05-28 — 2020-09-04 — 宁波博登智能科技有限责任公司 — Target detection and tracking method based on two-dimensional picture and three-dimensional point cloud fusion
CN112434706A * — 2020-11-13 — 2021-03-02 — 武汉中海庭数据技术有限公司 — High-precision traffic element target extraction method based on image point cloud fusion
US20220309835A1 * — 2021-03-26 — 2022-09-29 — Harbin Institute Of Technology, Weihai — Multi-target detection and tracking method, system, storage medium and application
CN114332158A * — 2021-12-17 — 2022-04-12 — 重庆大学 — 3D real-time multi-target tracking method based on camera and laser radar fusion
CN114638855A * — 2022-01-21 — 2022-06-17 — 山东汇创信息技术有限公司 — Multi-target tracking method, equipment and medium
CN114923491A * — 2022-05-12 — 2022-08-19 — 东南大学 — Three-dimensional multi-target online tracking method based on feature fusion and distance fusion
CN115797408A * — 2022-11-30 — 2023-03-14 — 清华大学 — Target tracking method and device fusing multi-view image and three-dimensional point cloud

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party

陆峰; 徐友春; 李永乐; 王任栋; 王东敏: "基于多传感器数据融合的障碍物检测与跟踪" (Obstacle detection and tracking based on multi-sensor data fusion), 军事交通学院学报, no. 02

Also Published As

Publication Number — Publication Date
CN117237401B — 2024-02-13

Similar Documents

Publication Publication Date Title
CN111619560B (en) Vehicle control method and device
US10365655B2 (en) ECU, autonomous vehicle including ECU, and method of controlling lane change for the same
US10759432B2 (en) Vehicle control apparatus, vehicle control method, and vehicle control program
JP7088135B2 (en) Signal display estimation system
US20180201271A1 (en) Vehicle control device, vehicle control method, and vehicle control program
Kim et al. Curvilinear-coordinate-based object and situation assessment for highly automated vehicles
JP6429219B2 (en) Vehicle control device, vehicle control method, and vehicle control program
CN112440984B (en) vehicle control system
US11763574B2 (en) Multi-modal, multi-technique vehicle signal detection
WO2015155874A1 (en) Route prediction device
US20220128702A1 (en) Systems and methods for camera-lidar fused object detection with local variation segmentation
CN111873989B (en) Vehicle control method and device
CN112462381A (en) Multi-laser radar fusion method based on vehicle-road cooperation
US20220383749A1 (en) Signal processing device, signal processing method, program, and mobile device
CN111497741B (en) Collision early warning method and device
CN112440989B (en) vehicle control system
Lee et al. Moving objects tracking based on geometric model-free approach with particle filter using automotive LiDAR
CN116830164A (en) LiDAR decorrelated object detection system and method
TWI680898B (en) Light reaching detection device and method for close obstacles
CN106778907A (en) A kind of intelligent travelling crane early warning system based on multi-sensor information fusion
CN117237401B (en) Multi-target tracking method, system, medium and equipment for fusion of image and point cloud
KR102335987B1 (en) Apparatus and method for drive controlling of vehicle
CN114670851A (en) Driving assistance system, method, terminal and medium based on optimizing tracking algorithm
Shin et al. Human factor considered risk assessment of automated vehicle using vehicle to vehicle wireless communication
US11938927B2 (en) Track mergence module and method

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant