CN112862858A - Multi-target tracking method based on scene motion information - Google Patents

Multi-target tracking method based on scene motion information

Info

Publication number
CN112862858A
Authority
CN
China
Prior art keywords
frame
target
detection
scene
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110047457.8A
Other languages
Chinese (zh)
Inventor
刘勇
翟光耀
孔昕
崔金浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110047457.8A
Publication of CN112862858A
Legal status: Pending (Current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10032 Satellite or aerial image; Remote sensing
    • G06T 2207/10044 Radar image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Abstract

The invention relates to a multi-target tracking method based on scene motion information, comprising the following steps. S1: construct a multi-target tracking system comprising a detection front-end module, a motion estimation module and a motion tracking module. S3: the detection front-end module acquires scene information and outputs the detection results of the previous and next frames. S4: the data are input to a pre-trained motion estimation module, preprocessed, and used to compute a point-by-point motion estimate between the previous and next frames. S5: the motion tracking module computes the mean offset of each target detection frame, predicts the position of each detection frame in the next frame, performs matching based on the Hungarian algorithm, and passes the code of each successfully matched target's previous-frame detection result to the same target's next-frame detection frame, yielding the multi-target tracks between the two frames. The method is applicable to a wide range of scenes, computationally simple, accurate and fast, and requires no hyper-parameter tuning.

Description

Multi-target tracking method based on scene motion information
Technical Field
The invention belongs to the technical field of information and communication, and particularly relates to a multi-target tracking method based on scene motion information.
Background
Multi-target tracking is a long-standing and challenging problem that aims to locate objects in a video sequence and assign a consistent identity to each instance across frames. Many vision applications, such as autonomous driving, robot collision prediction and video face alignment, rely on multi-target tracking as a key component. Recently, progress in research on three-dimensional object detection has further advanced multi-target tracking.
Most end-to-end multi-target tracking methods suffer from low accuracy and poor generalization. Although traditional filter-based approaches can achieve better results, it is difficult to choose optimal hyper-parameters for them, and they often fail in challenging cases.
To have better applicability, a multi-target tracking method needs to satisfy the following conditions:
(1) it is applicable to a wide range of scenes, including challenging scenes where traditional methods cannot be used;
(2) its computation is simple, accurate and fast;
(3) it avoids hyper-parameter tuning.
At present, no multi-target tracking method satisfies all of these requirements at the same time.
Disclosure of Invention
In view of the above shortcomings of the prior art, the invention aims to provide a multi-target tracking method based on scene motion information that is applicable to a wide range of scenes, computationally simple, accurate, fast, and free of hyper-parameter tuning.
The purpose of the invention can be realized by the following technical scheme:
a multi-target tracking method based on scene motion information is characterized by comprising the following steps:
step S1: constructing a multi-target tracking system; the multi-target tracking system comprises a detection front-end module, a motion estimation module and a motion tracking module; the detection front-end module comprises a laser radar sensor;
step S2: taking the 1 st frame as the previous frame, proceeding to steps S3 to S5;
step S3: the detection front-end module acquires scene information, calculates by adopting a PointRCNN method and outputs detection results of a previous frame and a next frame; the scene information comprises original point cloud information;
step S4: inputting original point cloud information of a previous frame and a next frame obtained by a front-end detection module into a pre-trained motion estimation module, preprocessing the original point cloud information, and calculating by using a FlowNet3D method to obtain point-by-point motion information estimation of the previous frame and the next frame;
step S5: the point cloud of the previous frame preprocessed by the motion estimation module and the point-by-point motion estimate obtained by the motion estimation module are input to the motion tracking module, which computes the mean offset (Δx_n, Δy_n, Δz_n, Δθ_n) of each target detection frame;
Δθ_n is the angular offset of the nth target detection frame and is computed with a constant angular velocity model; (Δx_n, Δy_n, Δz_n) is the offset of the center three-dimensional coordinates of the nth target detection frame and is computed as:
(Δx_n, Δy_n, Δz_n) = (1/C) Σ_{i=1..C} (δx_i, δy_i, δz_i)
where C is the total number of laser points on the nth target detection frame and (δx_i, δy_i, δz_i) is the motion flow of the ith single point on the nth target detection frame;
the position of each target detection frame in the next frame is predicted as:
(X_pre, Y_pre, Z_pre) = (X, Y, Z) + (ΔX, ΔY, ΔZ)
Θ_pre = Θ + ΔΘ
where (X_pre, Y_pre, Z_pre) is the set of predicted center three-dimensional coordinates of all target detection frames in the next frame, and (X, Y, Z) is the set of center three-dimensional coordinates of all target detection frames in the previous frame; (ΔX, ΔY, ΔZ) is the set of center coordinate offsets Δx_n, Δy_n, Δz_n of all target detection frames between the two frames; Θ_pre is the set of predicted direction angles of all target detection frames in the next frame, and Θ is the set of direction angles of all target detection frames in the previous frame; ΔΘ is the set of angular offsets Δθ_n of all target detection frames;
the predicted detection frames are matched with the actual next-frame detections using the Hungarian algorithm, and the code of each successfully matched target's previous-frame actual detection result is passed to the same target's next-frame detection frame, yielding the multi-target tracks between the two frames.
Preferably, the method further comprises the following steps:
step S6: taking frames 2 to N in turn as the previous frame and repeating steps S3-S5 to obtain N sets of two-frame multi-target tracks;
step S7: connecting the N sets of two-frame multi-target tracks in order of frame number to obtain the multi-target tracks from frame 1 to frame N+1;
n is an integer of 2 or more.
Preferably, the step S4 of performing data preprocessing on the original point cloud information includes the following steps:
step B1: removing the ground by fitting a plane normal vector;
step B2: and adjusting the field angle data of the laser radar according to the calibration relation between the laser and the camera.
Preferably, the obtaining of the pre-trained motion estimation module in step S4 includes the following steps:
step A1: synthesizing a scene flow from the disparity maps and depth maps of the FlyingThings3D standard dataset, taking the scene flow as the label files, and extracting part of the dataset and the corresponding label files to obtain a FlyingThings3D-based training set;
step A2: synthesizing a scene flow by using a KITTI scene flow data segment and a disparity map, taking the scene flow as a label file, and extracting all data segments and all label files to obtain a training set based on the KITTI scene flow;
step A3: training the motion estimation module by using a training set based on Flyingthings3D, and updating iteration parameters of the module to make the output of the module converge to a first preset threshold value; and globally adjusting the motion estimation module by using a training set based on KITTI scene flow, and updating iteration parameters to make the network prediction error converge to a second preset threshold value.
Preferably, step a3 includes:
training a motion estimation module by using a training set based on Flyingthings3D, taking two adjacent frame RGB-D picture pairs corresponding to timestamps in the training set based on Flyingthings3D as input of network pre-training, updating iteration parameters of the module, and enabling the output of the module to converge to a first preset threshold; and globally adjusting the motion estimation module by using the training set based on the KITTI scene flow, taking two adjacent frames of laser point clouds corresponding to the timestamps in the training set based on the KITTI scene flow as the input of the globally adjusting training of the network, updating the iteration parameters, and converging the network prediction error to a second preset threshold value.
Preferably, the step S3 of detecting the detection result output by the front-end module includes:
B={bi|i=1…N}
b={c,x,y,z,l,w,h,θ}
where B is the set of all target detection results in one frame of point cloud, b_i is the detection result of the ith target in the frame of point cloud, N is the number of all targets detected in the frame of point cloud, b is the detection result of a single target in the frame of point cloud, c is the attribute of the target, x, y and z are the center three-dimensional coordinates of the detection frame, l, w and h are the length, width and height of the detection frame, and θ is the direction angle of the detection frame.
Preferably, the attributes c of the target include: cars, pedestrians, riders.
Preferably, the calculation process of the FlowNet3D method in the step S4 includes:
the geometric and color information of the point clouds of the previous and next frames is encoded to obtain high-dimensional per-point features of the two frames; the high-dimensional features are encoded in cascade to obtain inter-frame motion features; and the inter-frame motion features are upsampled and decoded to obtain the point-by-point motion estimate between the previous and next frames.
Preferably, the lidar sensor of step S1 includes a 64-line lidar sensor.
Compared with the prior art, the invention has the following beneficial effects:
the invention introduces scene flow estimation based on learning into a three-dimensional multi-target tracking task for the first time, and updates the prediction of a small track by utilizing the motion consistency in a three-dimensional space, thereby avoiding the trouble of adjusting hyper-parameters and the problem of a constant motion model inherent in the traditional filter-based method.
In the multi-target tracking method based on scene motion information, the raw point clouds of the previous and next frames obtained by the detection front-end module are input into a pre-trained motion estimation module, the raw point clouds are preprocessed, and the point-by-point motion estimate between the previous and next frames is then computed with the FlowNet3D method.
Before the motion estimation module computes the point-by-point motion estimate between the two frames, the method preprocesses the raw point cloud, specifically by fitting a plane normal vector to remove the ground and cropping the lidar data to the camera field of view according to the laser-camera calibration. This brings at least three benefits: (1) strong real-time performance and reduced computation cost; (2) no dependence on the size of the training data, strong generalization, and the ability to handle most data; (3) high evaluation indices in testing (especially sAMOTA and AMOTA) and good tracking performance.
Experiments on the KITTI MOT dataset show that the invention is competitive with the most advanced existing methods and can be used in challenging scenarios where traditional methods typically cannot.
Drawings
Fig. 1 is a schematic flow chart of a multi-target tracking method based on scene motion information according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and the described embodiments are only some embodiments, but not all embodiments, of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
A multi-target tracking method based on scene motion information comprises the following steps:
step S1: constructing a multi-target tracking system; the multi-target tracking system comprises a detection front-end module, a motion estimation module and a motion tracking module; the detection front-end module comprises a 64-line laser radar sensor;
step S2: taking the 1 st frame as the previous frame, proceeding to steps S3 to S5;
step S3: the detection front-end module acquires scene information, calculates by adopting a PointRCNN method and outputs detection results of a previous frame and a next frame; the scene information comprises original point cloud information;
the detection result output by the detection front-end module comprises the following steps:
B={bi|i=1…N}
b={c,x,y,z,l,w,h,θ}
where B is the set of all target detection results in one frame of point cloud, b_i is the detection result of the ith target in the frame of point cloud, N is the number of all targets detected in the frame of point cloud, b is the detection result of a single target in the frame of point cloud, c is the attribute of the target, x, y and z are the center three-dimensional coordinates of the detection frame, l, w and h are the length, width and height of the detection frame, and θ is the direction angle of the detection frame. The attributes c of the target include: cars, pedestrians and riders.
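For illustration only, below is a minimal sketch of one way the per-frame detection output B = {b_i | i = 1…N} and the single-target result b = {c, x, y, z, l, w, h, θ} could be held in code; the class and field names are assumptions made for readability and are not prescribed by the method, and PointRCNN itself is not shown.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    """Single-target detection b = {c, x, y, z, l, w, h, theta}."""
    c: str        # target attribute: "car", "pedestrian" or "rider"
    x: float      # detection-frame center coordinates in the lidar frame
    y: float
    z: float
    l: float      # detection-frame length
    w: float      # detection-frame width
    h: float      # detection-frame height
    theta: float  # direction angle of the detection frame

# B: the set of all detections in one point-cloud frame
FrameDetections = List[Detection]
```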
Step S4: inputting original point cloud information of a previous frame and a next frame obtained by a front-end detection module into a pre-trained motion estimation module, preprocessing the original point cloud information, and calculating by using a FlowNet3D method to obtain point-by-point motion information estimation of the previous frame and the next frame;
the method for obtaining the pre-trained motion estimation module comprises the following steps:
step A1: synthesizing a scene flow from the disparity maps and depth maps of the FlyingThings3D standard dataset, taking the scene flow as the label files, and extracting part of the dataset and the corresponding label files to obtain a FlyingThings3D-based training set;
step A2: synthesizing a scene flow by using a KITTI scene flow data segment and a disparity map, taking the scene flow as a label file, and extracting all data segments and all label files to obtain a training set based on the KITTI scene flow;
step A3: training a motion estimation module by using a training set based on Flyingthings3D, taking two adjacent frame RGB-D picture pairs corresponding to timestamps in the training set based on Flyingthings3D as input of network pre-training, updating iteration parameters of the module, and enabling the output of the module to converge to a first preset threshold; and globally adjusting the motion estimation module by using the training set based on the KITTI scene flow, taking two adjacent frames of laser point clouds corresponding to the timestamps in the training set based on the KITTI scene flow as the input of the globally adjusting training of the network, updating the iteration parameters, and converging the network prediction error to a second preset threshold value.
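A schematic sketch of the two-stage schedule of step A3 follows, assuming a generic PyTorch scene-flow network and pre-built loaders for the FlyingThings3D-based and KITTI-scene-flow-based training sets; the loader names, the MSE loss and the convergence test are illustrative assumptions rather than details given in the text.

```python
import torch

def train_until(model, loader, optimizer, loss_fn, threshold, max_epochs=100):
    """Update iteration parameters until the mean loss converges below `threshold`."""
    for _ in range(max_epochs):
        total, count = 0.0, 0
        for frame1, frame2, flow_gt in loader:     # adjacent-frame pair + ground-truth scene flow
            optimizer.zero_grad()
            flow_pred = model(frame1, frame2)      # point-wise motion estimate
            loss = loss_fn(flow_pred, flow_gt)
            loss.backward()
            optimizer.step()
            total, count = total + loss.item(), count + 1
        if total / max(count, 1) < threshold:      # "converge to the preset threshold"
            return model
    return model

# Step A3 as described: pre-train on the FlyingThings3D-based set, then fine-tune
# (global adjustment) on the KITTI-scene-flow-based set. The names flow_net,
# flyingthings_loader and kitti_sf_loader are placeholders for objects built elsewhere.
#
# optimizer = torch.optim.Adam(flow_net.parameters(), lr=1e-3)
# flow_net = train_until(flow_net, flyingthings_loader, optimizer,
#                        torch.nn.functional.mse_loss, first_threshold)
# flow_net = train_until(flow_net, kitti_sf_loader, optimizer,
#                        torch.nn.functional.mse_loss, second_threshold)
```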
The data preprocessing of the original point cloud information comprises the following steps:
step B1: removing the ground by fitting a plane normal vector;
step B2: and adjusting the field angle data of the laser radar according to the calibration relation between the laser and the camera.
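One possible implementation sketch of steps B1 and B2: a RANSAC fit of the dominant near-horizontal plane removes the ground, and a projection test against the camera model crops the lidar data to the camera field of view. The distance threshold, the horizontality test on the plane normal and the KITTI-style 4x4 extrinsics / 3x3 intrinsics convention are assumptions for illustration.

```python
import numpy as np

def remove_ground(points, dist_thresh=0.2, iters=100, seed=0):
    """Step B1 (sketch): RANSAC-fit the dominant near-horizontal plane and drop its inliers.

    points: (N, 3) lidar points of one frame (assumes the lidar z axis points up, as in KITTI).
    Returns the non-ground points."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-6:
            continue
        normal /= norm
        if abs(normal[2]) < 0.9:                    # only accept near-horizontal candidate planes
            continue
        inliers = np.abs((points - p0) @ normal) < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return points[~best_inliers]

def crop_to_camera_fov(points, T_cam_lidar, K, image_size):
    """Step B2 (sketch): keep only points that project into the camera image.

    T_cam_lidar: 4x4 lidar-to-camera extrinsics; K: 3x3 camera intrinsics."""
    h, w = image_size
    cam = (T_cam_lidar @ np.c_[points, np.ones(len(points))].T).T[:, :3]
    front = cam[:, 2] > 0.1                         # drop points behind the camera
    points, cam = points[front], cam[front]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    keep = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return points[keep]
```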
The calculation process of the FlowNet3D method comprises the following steps:
the geometric and color information of the point clouds of the previous and next frames is encoded to obtain high-dimensional per-point features of the two frames; the high-dimensional features are encoded in cascade to obtain inter-frame motion features; and the inter-frame motion features are upsampled and decoded to obtain the point-by-point motion estimate between the previous and next frames.
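As a purely illustrative aid (not the FlowNet3D architecture itself), the toy module below mirrors the three stages named above: per-point encoding, mixing features across frames into an inter-frame motion feature, and decoding to a point-wise flow. The real FlowNet3D uses set-conv layers, a learned flow-embedding layer and set-upconv upsampling; here simple MLPs and a nearest-neighbor correspondence stand in for them.

```python
import torch
import torch.nn as nn

class TinySceneFlow(nn.Module):
    """Toy stand-in for the encode / inter-frame mixing / decode stages described above."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, feat_dim), nn.ReLU(),
                                     nn.Linear(feat_dim, feat_dim))
        self.embed = nn.Sequential(nn.Linear(2 * feat_dim + 3, feat_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                     nn.Linear(feat_dim, 3))

    def forward(self, pc1, pc2):
        # pc1: (N, 3) previous-frame points, pc2: (M, 3) next-frame points
        f1, f2 = self.encoder(pc1), self.encoder(pc2)      # per-point high-dimensional features
        d = torch.cdist(pc1, pc2)                          # cross-frame point distances
        nn_idx = d.argmin(dim=1)                           # nearest next-frame point for each point
        mixed = torch.cat([f1, f2[nn_idx], pc2[nn_idx] - pc1], dim=1)  # inter-frame motion feature
        return self.decoder(self.embed(mixed))             # point-wise flow, shape (N, 3)
```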
Step S5: the point cloud of the previous frame preprocessed by the motion estimation module and the point-by-point motion estimate obtained by the motion estimation module are input to the motion tracking module, which computes the mean offset (Δx_n, Δy_n, Δz_n, Δθ_n) of each target detection frame;
Δθ_n is the angular offset of the nth target detection frame and is computed with a constant angular velocity model; (Δx_n, Δy_n, Δz_n) is the offset of the center three-dimensional coordinates of the nth target detection frame and is computed as:
(Δx_n, Δy_n, Δz_n) = (1/C) Σ_{i=1..C} (δx_i, δy_i, δz_i)
where C is the total number of laser points on the nth target detection frame and (δx_i, δy_i, δz_i) is the motion flow of the ith single point on the nth target detection frame;
the position of each target detection frame in the next frame is predicted as:
(X_pre, Y_pre, Z_pre) = (X, Y, Z) + (ΔX, ΔY, ΔZ)
Θ_pre = Θ + ΔΘ
where (X_pre, Y_pre, Z_pre) is the set of predicted center three-dimensional coordinates of all target detection frames in the next frame, and (X, Y, Z) is the set of center three-dimensional coordinates of all target detection frames in the previous frame; (ΔX, ΔY, ΔZ) is the set of center coordinate offsets Δx_n, Δy_n, Δz_n of all target detection frames between the two frames; Θ_pre is the set of predicted direction angles of all target detection frames in the next frame, and Θ is the set of direction angles of all target detection frames in the previous frame; ΔΘ is the set of angular offsets Δθ_n of all target detection frames;
the predicted detection frames are matched with the actual next-frame detections using the Hungarian algorithm, and the code of each successfully matched target's previous-frame actual detection result is passed to the same target's next-frame detection frame, yielding the multi-target tracks between the two frames.
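A hedged sketch of the step S5 computation follows, under simplifying assumptions: boxes are plain (x, y, z, l, w, h, theta) arrays, points are assigned to a box by an oriented-box inclusion test, the constant-angular-velocity term dtheta is supplied by the caller (e.g. the angle change observed over the previous two frames), the association cost is the center distance between predicted and detected boxes, and scipy's linear_sum_assignment serves as the Hungarian solver; the gating threshold max_dist is an illustrative value.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def box_mean_offsets(boxes_prev, points_prev, flow):
    """Average the per-point scene flow inside each previous-frame detection box.

    boxes_prev: (N, 7) array of [x, y, z, l, w, h, theta]
    points_prev: (P, 3) preprocessed previous-frame points
    flow: (P, 3) point-wise motion from the motion estimation module
    Returns (N, 3) mean offsets (dx, dy, dz); zero when a box contains no points."""
    offsets = np.zeros((len(boxes_prev), 3))
    for n, (x, y, z, l, w, h, theta) in enumerate(boxes_prev):
        # rotate point offsets into the box frame so the inside test is axis-aligned
        c, s = np.cos(-theta), np.sin(-theta)
        local = (points_prev[:, :2] - [x, y]) @ np.array([[c, -s], [s, c]]).T
        inside = (np.abs(local[:, 0]) < l / 2) & (np.abs(local[:, 1]) < w / 2) \
                 & (np.abs(points_prev[:, 2] - z) < h / 2)
        if inside.any():
            offsets[n] = flow[inside].mean(axis=0)   # (1/C) * sum of point-wise flow
    return offsets

def predict_and_match(boxes_prev, boxes_next, offsets, dtheta, max_dist=2.0):
    """Predict previous boxes into the next frame and match them to next-frame detections."""
    pred = boxes_prev.copy()
    pred[:, :3] += offsets                           # (X, Y, Z) + (dX, dY, dZ)
    pred[:, 6] += dtheta                             # theta + dtheta (constant angular velocity)
    cost = np.linalg.norm(pred[:, None, :3] - boxes_next[None, :, :3], axis=2)
    rows, cols = linear_sum_assignment(cost)         # Hungarian matching
    # keep only plausible pairs; each pair hands the previous track code to the next-frame box
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_dist]
```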
Step S6: taking frames 2 to N in turn as the previous frame and repeating steps S3-S5 to obtain N sets of two-frame multi-target tracks;
step S7: connecting the N sets of two-frame multi-target tracks in order of frame number to obtain the multi-target tracks from frame 1 to frame N+1;
n is an integer greater than or equal to 2;
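A small sketch of how the N two-frame matchings from step S5 could be chained into tracks spanning frames 1 to N+1 (steps S6 and S7); the data layout is an assumption made for illustration.

```python
def chain_tracks(pairwise_matches, num_frames):
    """Chain per-frame-pair matches (step S5 output) into full multi-frame tracks (steps S6-S7).

    pairwise_matches[k] lists the (prev_box_idx, next_box_idx) pairs between frame k and k+1."""
    next_id = 0
    ids = {}                                          # (frame index, box index) -> track code
    for k in range(num_frames - 1):
        for prev_idx, next_idx in pairwise_matches[k]:
            if (k, prev_idx) not in ids:              # start a new track at its first appearance
                ids[(k, prev_idx)] = next_id
                next_id += 1
            ids[(k + 1, next_idx)] = ids[(k, prev_idx)]   # hand the code to the matched next-frame box
    return ids
```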
according to the method, most points in each frame of point cloud data belong to the ground, and the ground in a KITTIMOT data set mostly belongs to a horizontal plane, so that the ground can be fitted and removed by calculating the maximum plane normal vector. The benefit of this method versus fitting the ground by a learning method is: (1) the real-time performance is strong, and the calculation cost is saved; (2) the method is not limited by the size of training data, has strong generalization and can process most of data; (3) the evaluation index (especially sAMOTA, AMOTA) obtained by the test is high, and the tracking effect is good.
TABLE 1 Comparison of the proposed method with the mmMOT, FANTrack and AB3DMOT methods on the KITTI MOT standard dataset (10 Hz)

Method            Algorithm   Data    sAMOTA   AMOTA   MOTA    AMOTP
mmMOT             Learning    2D+3D   63.91    24.91   51.91   67.32
FANTrack          Learning    2D+3D   62.72    24.71   49.19   66.06
AB3DMOT           Filtering   3D      69.81    27.26   57.06   67.00
Proposed method   Hybrid      3D      74.37    29.78   63.53   67.03
The mmMOT, FANTrack and AB3DMOT methods are among the best performers in the current multi-target tracking field. mmMOT uses an off-the-shelf detector that combines image and point cloud information, performs data association by fusing multi-modal features and learning an adjacency matrix between objects, and runs a global optimization via linear programming; it is an offline method, whereas the present method is an online method. FANTrack designs a deep association network so that a neural network replaces the traditional Hungarian algorithm for data association. AB3DMOT is a traditional Kalman-filter-based three-dimensional multi-target tracking method; although its accuracy and running speed are impressive, a significant drawback is that it only considers the bounding box of each detection and ignores the motion continuity inside the point cloud. Moreover, because of its hand-crafted motion model, the Kalman filter requires frequent hyper-parameter tuning and is sensitive to scene properties such as frame rate. Table 1 compares the present method with these methods using the evaluation indices sAMOTA, AMOTA, AMOTP and MOTA, which are widely accepted in the multi-target tracking field. When tracking the "car" category of the KITTI MOT standard dataset (10 Hz) with an evaluation threshold of 0.7, the present method outperforms both the learning-based methods and the traditional filtering method on the important indices (sAMOTA and MOTA). In addition, simulated high-speed data (5 Hz) can be obtained by downsampling the KITTI MOT standard dataset to halve its frame rate. On these data the present method maintains its tracking performance, while the existing methods fail more often. Table 2 compares the present method with AB3DMOT when tracking the "car" category of the KITTI MOT standard dataset (5 Hz) with an evaluation threshold of 0.7; the present method performs better on all four indices.
TABLE 2 Comparison of the proposed method with the AB3DMOT method on the KITTI MOT standard dataset (5 Hz)

Method            Data   sAMOTA   AMOTA   AMOTP   MOTA
AB3DMOT           3D     56.71    18.96   58.00   45.25
Proposed method   3D     72.42    27.89   64.82   60.03
The foregoing has outlined rather broadly the preferred embodiments and principles of the present invention and it will be appreciated that those skilled in the art may devise variations of the present invention that are within the spirit and scope of the appended claims.

Claims (9)

1. A multi-target tracking method based on scene motion information is characterized by comprising the following steps:
step S1: constructing a multi-target tracking system; the multi-target tracking system comprises a detection front-end module, a motion estimation module and a motion tracking module; the detection front-end module comprises a laser radar sensor;
step S2: taking the 1 st frame as the previous frame, proceeding to steps S3 to S5;
step S3: the detection front-end module acquires scene information, calculates by adopting a PointRCNN method and outputs detection results of a previous frame and a next frame; the scene information comprises original point cloud information;
step S4: inputting original point cloud information of a previous frame and a next frame obtained by a front-end detection module into a pre-trained motion estimation module, preprocessing the original point cloud information, and calculating by using a FlowNet3D method to obtain point-by-point motion information estimation of the previous frame and the next frame;
step S5: the point cloud of the previous frame preprocessed by the motion estimation module and the point-by-point motion estimate obtained by the motion estimation module are input to the motion tracking module, which computes the mean offset (Δx_n, Δy_n, Δz_n, Δθ_n) of each target detection frame;
Δθ_n is the angular offset of the nth target detection frame and is computed with a constant angular velocity model; (Δx_n, Δy_n, Δz_n) is the offset of the center three-dimensional coordinates of the nth target detection frame and is computed as:
(Δx_n, Δy_n, Δz_n) = (1/C) Σ_{i=1..C} (δx_i, δy_i, δz_i)
where C is the total number of laser points on the nth target detection frame and (δx_i, δy_i, δz_i) is the motion flow of the ith single point on the nth target detection frame;
the position of each target detection frame in the next frame is predicted as:
(X_pre, Y_pre, Z_pre) = (X, Y, Z) + (ΔX, ΔY, ΔZ)
Θ_pre = Θ + ΔΘ
where (X_pre, Y_pre, Z_pre) is the set of predicted center three-dimensional coordinates of all target detection frames in the next frame, and (X, Y, Z) is the set of center three-dimensional coordinates of all target detection frames in the previous frame; (ΔX, ΔY, ΔZ) is the set of center coordinate offsets Δx_n, Δy_n, Δz_n of all target detection frames between the two frames; Θ_pre is the set of predicted direction angles of all target detection frames in the next frame, and Θ is the set of direction angles of all target detection frames in the previous frame; ΔΘ is the set of angular offsets Δθ_n of all target detection frames;
and the code of the previous-frame actual detection result of each successfully matched target is respectively passed to the next-frame detection frame of the same target, to obtain the multi-target tracks between the two frames.
2. The multi-target tracking method based on scene motion information as claimed in claim 1, further comprising the steps of:
step S6: taking frames 2 to N in turn as the previous frame and repeating steps S3-S5 to obtain N sets of two-frame multi-target tracks;
step S7: connecting the N sets of two-frame multi-target tracks in order of frame number to obtain the multi-target tracks from frame 1 to frame N+1;
and N is an integer greater than or equal to 2.
3. The multi-target tracking method based on scene motion information as claimed in any one of claims 1 or 2, wherein the data preprocessing of the original point cloud information in step S4 includes the following steps:
step B1: removing the ground by fitting a plane normal vector;
step B2: and adjusting the field angle data of the laser radar according to the calibration relation between the laser and the camera.
4. The multi-target tracking method based on scene motion information as claimed in claim 3, wherein the obtaining step S4 of the pre-trained motion estimation module comprises the following steps:
step A1: synthesizing a scene flow from the disparity maps and depth maps of the FlyingThings3D standard dataset, taking the scene flow as the label files, and extracting part of the dataset and the corresponding label files to obtain a FlyingThings3D-based training set;
step A2: synthesizing a Scene Flow by using a KITTI Scene Flow data segment and a disparity map, taking the Scene Flow as a label file, and extracting all data segments and all label files to obtain a training set based on the KITTI Scene Flow;
step A3: training the motion estimation module by using a training set based on Flyingthings3D, and updating iteration parameters of the module to make the output of the module converge to a first preset threshold value; and globally adjusting the motion estimation module by using a training set based on KITTI Scene Flow, and updating iteration parameters to make the network prediction error converge to a second preset threshold value.
5. The multi-target tracking method based on scene motion information as claimed in claim 4, wherein said step A3 includes:
training a motion estimation module by using a training set based on Flyingthings3D, taking two adjacent frame RGB-D picture pairs corresponding to timestamps in the training set based on Flyingthings3D as input of network pre-training, updating iteration parameters of the module, and enabling the output of the module to converge to a first preset threshold; and globally adjusting the motion estimation module by using a training set based on KITTI Scene Flow, taking two adjacent frames of laser point clouds corresponding to timestamps in the training set based on KITTI Scene Flow as the input of the global adjustment training of the network, updating iteration parameters, and converging the network prediction error to a second preset threshold value.
6. The multi-target tracking method based on scene motion information as claimed in claim 3, wherein the detection result output by the detection front-end module in step S3 includes:
B={bi|i=1…N}
b={c,x,y,z,l,w,h,θ}
where B is the set of all target detection results in one frame of point cloud, b_i is the detection result of the ith target in the frame of point cloud, N is the number of all targets detected in the frame of point cloud, b is the detection result of a single target in the frame of point cloud, c is the attribute of the target, x, y and z are the center three-dimensional coordinates of the detection frame, l, w and h are the length, width and height of the detection frame, and θ is the direction angle of the detection frame.
7. The multi-target tracking method based on scene motion information as claimed in claim 6, wherein the attributes c of the targets comprise: cars, pedestrians, riders.
8. The multi-target tracking method based on scene motion information according to claim 3, wherein the calculation process of the FlowNet3D method in the step S4 comprises:
encoding the geometric information and the color information of the upper and lower frame point clouds to obtain the high-dimensional characteristics of the upper and lower frame point clouds; the high-dimensional features are subjected to cascade coding to obtain inter-frame motion features; and calculating the motion characteristics of the frames through the processes of up-sampling and decoding to obtain the point-by-point motion information estimation of the upper frame and the lower frame.
9. The multi-target tracking method based on scene motion information as claimed in claim 3, characterized in that: the lidar sensor in step S1 includes a 64-line lidar sensor.
CN202110047457.8A 2021-01-14 2021-01-14 Multi-target tracking method based on scene motion information Pending CN112862858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110047457.8A CN112862858A (en) 2021-01-14 2021-01-14 Multi-target tracking method based on scene motion information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110047457.8A CN112862858A (en) 2021-01-14 2021-01-14 Multi-target tracking method based on scene motion information

Publications (1)

Publication Number Publication Date
CN112862858A true CN112862858A (en) 2021-05-28

Family

ID=76005719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110047457.8A Pending CN112862858A (en) 2021-01-14 2021-01-14 Multi-target tracking method based on scene motion information

Country Status (1)

Country Link
CN (1) CN112862858A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140300758A1 (en) * 2013-04-04 2014-10-09 Bao Tran Video processing systems and methods
CN110494863A (en) * 2018-03-15 2019-11-22 辉达公司 Determine autonomous vehicle drives free space
CN110717403A (en) * 2019-09-16 2020-01-21 国网江西省电力有限公司电力科学研究院 Face multi-target tracking method
CN110942449A (en) * 2019-10-30 2020-03-31 华南理工大学 Vehicle detection method based on laser and vision fusion
CN111626217A (en) * 2020-05-28 2020-09-04 宁波博登智能科技有限责任公司 Target detection and tracking method based on two-dimensional picture and three-dimensional point cloud fusion

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450295A (en) * 2021-06-15 2021-09-28 浙江大学 Depth map synthesis method based on difference comparison learning
CN113450295B (en) * 2021-06-15 2022-11-15 浙江大学 Depth map synthesis method based on difference comparison learning
CN113281718A (en) * 2021-06-30 2021-08-20 江苏大学 3D multi-target tracking system and method based on laser radar scene flow estimation
CN113281718B (en) * 2021-06-30 2024-03-22 江苏大学 3D multi-target tracking system and method based on laser radar scene flow estimation
CN114025146A (en) * 2021-11-02 2022-02-08 浙江工商大学 Dynamic point cloud geometric compression method based on scene flow network and time entropy model
CN114025146B (en) * 2021-11-02 2023-11-17 浙江工商大学 Dynamic point cloud geometric compression method based on scene flow network and time entropy model
CN114137562A (en) * 2021-11-30 2022-03-04 合肥工业大学智能制造技术研究院 Multi-target tracking method based on improved global nearest neighbor
CN114137562B (en) * 2021-11-30 2024-04-12 合肥工业大学智能制造技术研究院 Multi-target tracking method based on improved global nearest neighbor
CN115127523A (en) * 2022-05-09 2022-09-30 湖南傲英创视信息科技有限公司 Heterogeneous processing panoramic detection and ranging system based on double-line cameras
CN115127523B (en) * 2022-05-09 2023-08-11 湖南傲英创视信息科技有限公司 Heterogeneous processing panoramic detection and ranging system based on double-line camera

Similar Documents

Publication Publication Date Title
CN112862858A (en) Multi-target tracking method based on scene motion information
CN111337941B (en) Dynamic obstacle tracking method based on sparse laser radar data
CN108152831B (en) Laser radar obstacle identification method and system
Jung et al. A lane departure warning system using lateral offset with uncalibrated camera
CN111932580A (en) Road 3D vehicle tracking method and system based on Kalman filtering and Hungary algorithm
CN109459750A (en) A kind of more wireless vehicle trackings in front that millimetre-wave radar is merged with deep learning vision
Erbs et al. Moving vehicle detection by optimal segmentation of the dynamic stixel world
CN110738690A (en) unmanned aerial vehicle video middle vehicle speed correction method based on multi-target tracking framework
CN104700414A (en) Rapid distance-measuring method for pedestrian on road ahead on the basis of on-board binocular camera
CN110197173B (en) Road edge detection method based on binocular vision
CN114998276B (en) Robot dynamic obstacle real-time detection method based on three-dimensional point cloud
CN103617636A (en) Automatic video-target detecting and tracking method based on motion information and sparse projection
CN114049382A (en) Target fusion tracking method, system and medium in intelligent network connection environment
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
CN115861968A (en) Dynamic obstacle removing method based on real-time point cloud data
CN110176022B (en) Tunnel panoramic monitoring system and method based on video detection
CN114842340A (en) Robot binocular stereoscopic vision obstacle sensing method and system
CN115908539A (en) Target volume automatic measurement method and device and storage medium
Qing et al. A novel particle filter implementation for a multiple-vehicle detection and tracking system using tail light segmentation
CN113487631B (en) LEGO-LOAM-based adjustable large-angle detection sensing and control method
Wang et al. Geometry constraints-based visual rail track extraction
US20210304518A1 (en) Method and system for generating an environment model for positioning
Fakhfakh et al. Weighted v-disparity approach for obstacles localization in highway environments
Lu et al. Vision-based real-time road detection in urban traffic
Vella et al. Improved detection for wami using background contextual information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210528)