CN113985445A - 3D target detection algorithm based on data fusion of camera and laser radar

3D target detection algorithm based on data fusion of camera and laser radar

Info

Publication number
CN113985445A
CN113985445A (application CN202110972750.5A)
Authority
CN
China
Prior art keywords
point cloud
target
camera
laser radar
cloud data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110972750.5A
Other languages
Chinese (zh)
Inventor
安旭阳
苏治宝
杜志岐
李兆冬
陈佳琪
宋威龙
余雪玮
李冀川
索旭东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intelligent Mobile Robot Zhongshan Research Institute
China North Vehicle Research Institute
Original Assignee
Intelligent Mobile Robot Zhongshan Research Institute
China North Vehicle Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intelligent Mobile Robot Zhongshan Research Institute and China North Vehicle Research Institute
Priority to CN202110972750.5A
Publication of CN113985445A
Legal status: Pending

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/251Fusion techniques of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Electromagnetism (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

In the 3D target detection algorithm based on camera and lidar data fusion, a YOLOv5 model is selected as the 2D target detector, and target detection and tracking are carried out on the 2D image layer to obtain the 2D bounding box of the detected target. A mapping relation between the image data collected by the camera and the point cloud data collected by the lidar is constructed according to the calibration principle of the camera and the lidar, and the two data streams are synchronized. Taking the point cloud data collected by the lidar as input, the background point cloud and the foreground point cloud are segmented using a ground segmentation algorithm and point cloud clipping. The segmented foreground point cloud is projected onto the camera imaging plane according to the mapping relation between the image data and the point cloud data, the foreground points inside the target's 2D bounding box are taken as the target's point cloud candidate region, and the target's 3D bounding box is extracted by Euclidean clustering, realizing 3D detection of the target. 3D target detection in a dynamic scene can thus be achieved without depending on the position of the guide person.

Description

3D target detection algorithm based on data fusion of camera and laser radar
Technical Field
The invention belongs to the technical field of environment perception for ground unmanned platforms, and in particular relates to a 3D target detection algorithm based on camera and lidar data fusion.
Background
In practical applications a ground unmanned platform must follow a guide person, planning local task paths and avoiding obstacles in real time according to the guide's position; its following capability therefore depends on whether the on-board perception algorithm can output the guide's position accurately. Current image-based target detection algorithms provide only two-dimensional information and cannot support autonomous navigation of a ground unmanned platform; binocular stereo vision can perceive depth, but its perception range is limited and its accuracy is low. Lidar, another typical sensor on ground unmanned platforms, provides scene contour and position information. Target features can be extracted by applying a series of 3D convolutions to the raw lidar point cloud, or after converting it into a regular representation, but the computation is slow and cannot meet the platform's strict real-time requirements. Furthermore, lidar-based person detection faces two main problems: 1) the guide person is small in scale, and the point cloud is sparse in distant regions; 2) the guide's behaviour is unpredictable and easily confused with flowers and trees in the background.
Disclosure of Invention
The invention overcomes one of the defects of the prior art by providing a 3D target detection algorithm based on camera and lidar data fusion that realizes 3D target detection in dynamic scenes.
According to an aspect of the present disclosure, the invention provides a 3D target detection algorithm based on camera and lidar data fusion, the algorithm comprising:
selecting a YOLOv5 model as the 2D target detector, and detecting and tracking the target on the 2D image layer to obtain a 2D bounding box of the detected target;
constructing a mapping relation between the image data collected by the camera and the point cloud data collected by the lidar according to the calibration principle of the camera and the lidar, and synchronizing the two data streams;
taking the point cloud data collected by the lidar as input, and segmenting the background point cloud and the foreground point cloud in the point cloud data using a ground segmentation algorithm and point cloud clipping; and
projecting the segmented foreground point cloud onto the camera imaging plane according to the mapping relation between the image data and the point cloud data, taking the foreground points inside the target's 2D bounding box as the target's point cloud candidate region, and extracting the target's 3D bounding box by Euclidean clustering to realize 3D detection of the target.
In a possible implementation, detecting and tracking the target on the 2D image layer to obtain the 2D bounding box of the detected target includes:
constructing a constant-velocity motion model of the target;
fusing, according to the constant-velocity motion model, the motion features and appearance features of the target in the current frame and the adjacent next frame, and calculating the similarity between the predicted trajectory and the detected trajectory of the target; and
performing association matching of targets with the Hungarian algorithm according to the direction similarity between the predicted and detected trajectories, thereby realizing detection and tracking of the target on the 2D image layer and obtaining the 2D bounding box of the detected target.
In one possible implementation, synchronizing the image data collected by the camera and the point cloud data collected by the lidar includes:
taking the sampling time of the lidar as the synchronization time node, and linearly interpolating the 2D bounding boxes detected from the camera images.
In one possible implementation, taking the point cloud data collected by the lidar as input and segmenting the background point cloud and the foreground point cloud in the point cloud data includes:
downsampling the point cloud data with a voxel filtering algorithm;
calculating the angle between each point and the lidar x-axis and the horizontal distance r to the lidar centre, and dividing the point cloud into sector areas according to the horizontal angular resolution of the lidar;
dividing each sector area radially into several sub-areas according to the horizontal distance r, so that the points are stored in order in the form of rays; and
fitting a straight line to each pair of adjacent points on the same ray: if the slope of the line satisfies the local slope threshold of the two adjacent points, the points are background points; otherwise they are foreground points. In this way the background and foreground point clouds in the point cloud data are segmented.
In one possible implementation, the similarity between the predicted trajectory and the detected trajectory of the target includes the direction similarity and the position similarity of the two trajectories.
In the 3D target detection algorithm based on camera and lidar data fusion, a YOLOv5 model is selected as the 2D target detector, and target detection and tracking are carried out on the 2D image layer to obtain the 2D bounding box of the detected target; a mapping relation between the image data collected by the camera and the point cloud data collected by the lidar is constructed according to the calibration principle of the camera and the lidar, and the two data streams are synchronized; taking the point cloud data collected by the lidar as input, the background point cloud and the foreground point cloud are segmented using a ground segmentation algorithm and point cloud clipping; and the segmented foreground point cloud is projected onto the camera imaging plane according to the mapping relation between the image data and the point cloud data, the foreground points inside the target's 2D bounding box are taken as the target's point cloud candidate region, and the target's 3D bounding box is extracted by Euclidean clustering, realizing 3D detection of the target. 3D target detection in a dynamic scene can thus be achieved without depending on the position of the guide person.
Drawings
The accompanying drawings are included to provide a further understanding of the technology of the present application and are incorporated in and constitute a part of this specification. They illustrate embodiments of the present application and serve to explain its technical solutions, and should not be construed as limiting those solutions.
FIG. 1 shows a flowchart of the 3D target detection algorithm based on camera and lidar data fusion according to an embodiment of the disclosure;
FIG. 2 shows a schematic structural diagram of the YOLOv5 model according to an embodiment of the present disclosure;
FIG. 3 shows a detailed flowchart of step S1 according to an embodiment of the present disclosure;
FIG. 4 shows a flowchart of target cascade matching according to an embodiment of the present disclosure;
FIG. 5a shows a schematic diagram of lidar and camera synchronization according to an embodiment of the present disclosure;
FIG. 5b shows a schematic diagram of lidar and camera synchronization according to another embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of the transverse and radial division of the laser point cloud according to another embodiment of the present disclosure.
Detailed Description
The following describes embodiments of the present invention in detail with reference to the accompanying drawings and examples, so that how the technical means are applied to solve the technical problems and achieve the corresponding technical effects can be fully understood and implemented. Provided there is no conflict, the embodiments and the features of the embodiments may be combined with one another, and the resulting technical solutions all fall within the scope of the present invention.
Additionally, the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
Fig. 1 shows a flowchart of the 3D target detection algorithm based on camera and lidar data fusion according to an embodiment of the disclosure. As shown in fig. 1, the algorithm may include:
Step S1: selecting a YOLOv5 model as the 2D target detector, and detecting and tracking the target on the 2D image layer to obtain the 2D bounding box of the detected target.
The 2D image layer may be the image data layer acquired by the camera.
Fig. 2 shows a schematic structural diagram of the YOLOv5 model according to an embodiment of the present disclosure.
As shown in fig. 2, the YOLOv5 model extracts target features with its backbone network, which contains the Focus structure and CSP structures, performs multi-scale feature fusion with a Feature Pyramid Network (FPN) and a Path Aggregation Network (PAN), and finally suppresses low-confidence candidate boxes with DIoU-NMS, thereby detecting the class attribute and bounding box of each target in the scene.
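For orientation, the following minimal sketch shows how such a 2D detector might be queried. It assumes a pretrained YOLOv5 model loaded through the ultralytics torch.hub entry point; the weight choice ("yolov5s"), the confidence threshold and the image path are illustrative and not taken from the patent.

```python
# Hedged sketch: run a pretrained YOLOv5 model and keep only "person" boxes,
# since the target of interest here is the guide person.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.conf = 0.4                              # confidence threshold (illustrative)

results = model("street_scene.jpg")           # path, URL or numpy image
boxes = results.xyxy[0].cpu().numpy()         # columns: x1, y1, x2, y2, conf, class
person_boxes = boxes[boxes[:, 5] == 0]        # COCO class 0 is "person"
print(person_boxes[:, :4])                    # 2D bounding boxes in pixel coordinates
```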
FIG. 3 shows a detailed flowchart of step S1 according to an embodiment of the present disclosure.
In an example, as shown in fig. 3, detecting and tracking the target on the 2D image layer and obtaining the 2D bounding box of the detected target may include:
Step S11: constructing a constant-velocity motion model of the target.
For example, it is assumed that the guide person moves at an approximately constant velocity in the application scene of the ground unmanned platform, and a constant-velocity motion model of the target is constructed accordingly.
The bounding box of the target to be tracked at time t-1 is defined as (c_x, c_y, w, h), where (c_x, c_y) is the box centre and (w, h) its width and height, so that the state can be represented by the 8-dimensional vector

x = (c_x, c_y, w, h, \dot{c}_x, \dot{c}_y, \dot{w}, \dot{h})^T.

The prior position of the target to be tracked at time t is predicted with a standard Kalman filter, as shown in formulas (1) and (2):

\hat{x}'_t = F \hat{x}_{t-1}    (1)

P' = F P_{t-1} F^T + Q    (2)

where P_{t-1} is the posterior estimate covariance at time t-1, representing the state uncertainty; P' is the prior estimate covariance at time t; F is the state transition matrix; Q is the process noise covariance; \hat{x}_{t-1} is the posterior state estimate at time t-1; and \hat{x}'_t is the prior state estimate at time t.
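As a concrete illustration of formulas (1) and (2), the sketch below implements the constant-velocity prediction step for the 8-dimensional state; the frame interval dt and the covariance values are illustrative assumptions.

```python
import numpy as np

dt = 0.1                               # frame interval in seconds (illustrative)
F = np.eye(8)                          # constant-velocity state transition matrix
F[:4, 4:] = dt * np.eye(4)             # (cx, cy, w, h) advance by dt times their rates
Q = 1e-2 * np.eye(8)                   # process noise covariance (illustrative)

def predict(x_post: np.ndarray, P_post: np.ndarray):
    """Formulas (1) and (2): prior state and prior covariance at time t."""
    x_prior = F @ x_post               # x'_t = F x_{t-1}
    P_prior = F @ P_post @ F.T + Q     # P'   = F P_{t-1} F^T + Q
    return x_prior, P_prior

# Example: a 60x120 pixel box centred at (320, 240), initially at rest.
x0 = np.array([320.0, 240.0, 60.0, 120.0, 0.0, 0.0, 0.0, 0.0])
x_prior, P_prior = predict(x0, np.eye(8))
```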
Step S12: and predicting the track of the next frame of the target by using a Kalman filtering algorithm according to the target constant-speed motion model, fusing the motion characteristics and frame characteristic information of the current frame of the target and the adjacent next frame of the current frame of the target, and calculating the similarity of the predicted track and the detected track of the target. The similarity between the predicted track and the detected track of the target comprises the following steps: the direction similarity and the position similarity of the predicted trajectory and the detected trajectory of the target. For example, appearance (frame) features of a target can be extracted by using a residual network model trained offline on a pedestrian re-recognition data set, the appearance (frame) features are represented by 128-dimensional normalized feature descriptors, and the direction similarity of a predicted track and a detected track is measured by adopting cosine distance; and measuring the position similarity of the target by calculating the Mahalanobis distance of the predicted track and the detected track.
Step S13: and performing correlation matching on the target by adopting a Hungarian algorithm according to the direction similarity of the predicted track and the detected track of the target, realizing the detection and tracking of the target on a 2D image layer, and obtaining a 2D bounding box for target detection.
FIG. 4 shows a flow diagram of target cascade matching according to an embodiment of the present disclosure.
For example, if a new target appears in the dynamic scene, a corresponding tracker is assigned to it. Each tracker maintains a parameter time_update as the priority for cascade matching: if matching succeeds in the current frame, time_update is reset to 0; otherwise it is incremented by 1. As shown in fig. 4, in subsequent cascade matching, trackers with a small time_update are matched first and trackers with a large time_update are matched later.
To improve the cascade matching of the target to be detected, the Intersection over Union (IoU) between the predicted track of the next frame and the detection bounding box of the current frame is also computed, and the Hungarian algorithm is again used for association matching and for assigning the corresponding IDs. Finally, the position of the target to be tracked is updated with the actual detection z_t = [c_x, c_y, w, h] at time t, as shown in formulas (3), (4) and (5):

K_t = P' H^T (H P' H^T + R)^{-1}    (3)

\hat{x}_t = \hat{x}'_t + K_t (z_t - H \hat{x}'_t)    (4)

P_t = (I - K_t H) P'    (5)

where H is the transformation matrix from the state variables to the observed variables, R is the measurement noise covariance, z_t is the observation, K_t is the Kalman gain, and \hat{x}_t is the posterior state estimate at time t. The 2D bounding box of the target to be detected is obtained through the above steps.
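The sketch below illustrates this association and update: SciPy's Hungarian solver (linear_sum_assignment) applied to a (1 - IoU) cost matrix, followed by the Kalman update of formulas (3) to (5); the IoU gate of 0.3 is an illustrative value.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over Union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def match(pred_boxes, det_boxes, min_iou=0.3):
    """Hungarian association on a (1 - IoU) cost matrix; returns (track, detection) pairs."""
    cost = np.array([[1.0 - iou(p, d) for d in det_boxes] for p in pred_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - min_iou]

def kalman_update(x_prior, P_prior, z, H, R):
    """Formulas (3)-(5): Kalman gain, posterior state, posterior covariance."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)                # formula (3)
    x_post = x_prior + K @ (z - H @ x_prior)            # formula (4)
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior   # formula (5)
    return x_post, P_post
```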
Step S2: and constructing a mapping relation between image data acquired by the camera and point cloud data acquired by the laser radar according to the calibration principle of the camera and the laser radar, and synchronizing the image data acquired by the camera and the point cloud data acquired by the laser radar. The sampling time of the laser radar can be used as a synchronous time node, and linear interpolation can be performed on the 2D bounding box based on the target detection acquired by the camera.
For example, a camera imaging model and a distortion model are first constructed according to the pinhole imaging principle, as shown in formulas (6) to (8):

Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 & 0 \\ 0 & f_y & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & T \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = M_1 M_2 P_w    (6)

where f is the focal length of the camera; (f_x, f_y) are the equivalent focal lengths in the x and y directions, respectively; (u_0, v_0) is the principal point; and M_1 and M_2 are the intrinsic and extrinsic matrices of the camera, respectively.

x_d = x (1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \quad y_d = y (1 + k_1 r^2 + k_2 r^4 + k_3 r^6)    (7)

where (x, y) is the ideal image point of the target's actual position, r is the distance from the image point to the centre point, and k_1, k_2 and k_3 are the radial distortion coefficients.

x_d = x + 2 p_1 x y + p_3 (r^2 + 2 x^2), \quad y_d = y + p_1 (r^2 + 2 y^2) + 2 p_3 x y    (8)

where p_1 and p_3 are the tangential distortion coefficients.
For example, a checkerboard with 50 mm x 50 mm squares arranged 9 x 7 may be selected as the intrinsic calibration board. The board is moved to the leftmost, rightmost, uppermost and lowermost parts of the camera's field of view, the distance and tilt angle between the camera and the board are varied, and 25 pictures are collected. The equivalent focal lengths (f_x, f_y), the radial distortion coefficients (k_1, k_2, k_3) and the tangential distortion coefficients (p_1, p_3) of the camera imaging model and the distortion model are then solved by the least-squares method.
A checkerboard with 108 mm x 108 mm squares arranged 9 x 7 is then selected as the extrinsic calibration board. It is moved in turn to nine positions at the near, middle and far parts of the camera's field of view; at each position its attitude is varied over five poses (pitch up, pitch down, yaw left, yaw right and centred), each pose being held for 3 seconds so that the board does not blur, and 20 valid point cloud frames and image frames are selected. The position and orientation of each checkerboard frame in the camera coordinate system are captured with the AUTOWARE toolbox, the point cloud data on the checkerboard are extracted at the same time, and the rotation and translation matrices of the camera relative to the lidar are computed according to formula (9).
(R, T) = \arg\min_{R(\alpha,\beta,\gamma),\, T(x,y,z)} \sum_i \sum_j \left( n_i^T \left( R(\alpha,\beta,\gamma)\, q_{i,j} + T(x,y,z) - p_i \right) \right)^2    (9)

where (α, β, γ) are the rotation Euler angles from the camera to the lidar; R(α, β, γ) is the camera-to-lidar rotation matrix; (x, y, z) are the camera-to-lidar translation values; T(x, y, z) is the camera-to-lidar translation vector; p_i is the position of the i-th checkerboard in the camera coordinate system; n_i is the normal vector of the i-th checkerboard in the camera coordinate system; and q_{i,j} is the position of the j-th point of the i-th checkerboard in the lidar coordinate system.
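For the intrinsic part of this calibration, the sketch below uses OpenCV's checkerboard tools as a stand-in; the image path pattern, the inner-corner count and the use of OpenCV itself are assumptions, and OpenCV reports the distortion coefficients in the order (k1, k2, p1, p2, k3), which differs from the (p1, p3) naming used above. The lidar-camera extrinsic optimization of formula (9) is not reproduced here.

```python
# Hedged intrinsic-calibration sketch with OpenCV; a 9x7-square board with 50 mm
# squares has 8x6 inner corners, which is what findChessboardCorners expects.
import glob
import cv2
import numpy as np

pattern = (8, 6)                                   # inner corners per row / column
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 0.05  # metres

obj_points, img_points, size = [], [], None
for path in glob.glob("calib/*.png"):              # illustrative image location
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
        size = gray.shape[::-1]

# K holds (fx, fy, u0, v0); dist holds (k1, k2, p1, p2, k3) in OpenCV's ordering.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)
print("reprojection RMS:", rms)
```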
In an example, synchronizing the image data acquired by the camera and the point cloud data acquired by the lidar may include:
taking the sampling time of the lidar as the synchronization time node, and linearly interpolating the 2D bounding boxes detected from the camera images.
Fig. 5a and 5b show schematic diagrams of lidar and camera synchronization, respectively, according to an embodiment of the disclosure.
The sampling time node of the lidar sensor is selected as the synchronization insertion point; since the position of the target in the images acquired by the camera changes continuously, the camera data can be interpolated at the lidar time node. For example, the lidar and camera samples may be placed in their respective buffers in arrival order, the timestamp of the lidar sample is taken as the synchronization time, and the camera data buffer is searched for the entries adjacent to that time. As shown in fig. 5a, if the synchronization time synctime is earlier than the oldest entry of the camera data buffer, data synchronization is considered to have failed; the current lidar point cloud frame is discarded and the next lidar data frame is awaited. If synctime is later than the timestamp of camera buffer entry buf[1], the buffer is considered to contain enough data for synchronization; the earlier entries of the camera buffer are discarded in a loop and the remaining entries move forward in turn until data synchronization succeeds.
As shown in fig. 5b, if synctime already lies between buf[0] and buf[1] of the camera buffer, but the time difference between synctime and buf[0] or buf[1] exceeds the maximum inference-time threshold of the camera target detector, the camera buffer is considered to have lost data; the current lidar point cloud frame and the buf[0] entry of the camera buffer are discarded, the subsequent entries move forward, and the next lidar frame is awaited for synchronization. Finally, the 2D bounding box detected from the camera data is linearly interpolated, as shown in formula (10):

syncbuf.x = buf[0].x + \frac{synctime - buf[0].time}{buf[1].time - buf[0].time} \left( buf[1].x - buf[0].x \right)    (10)

where x denotes the top-left and bottom-right pixel coordinates of the 2D bounding box, time is the sampling time of the camera buffer entry, and syncbuf is the result of the camera-lidar data synchronization.
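A simplified sketch of this buffering and interpolation logic follows; the camera buffer is modelled as a time-sorted list of (timestamp, box) pairs and the inference-time threshold is an illustrative value.

```python
import numpy as np

def interpolate_box(sync_time, t0, box0, t1, box1):
    """Formula (10): linearly interpolate the [x1, y1, x2, y2] box between the two
    camera samples that bracket the lidar sampling time."""
    w = (sync_time - t0) / (t1 - t0)
    return box0 + w * (box1 - box0)

def synchronize(sync_time, cam_buf, max_infer_dt=0.1):
    """cam_buf: time-sorted list of (timestamp, numpy box).  Returns the interpolated
    box, or None when the current lidar frame has to be dropped."""
    while len(cam_buf) >= 2 and cam_buf[1][0] < sync_time:
        cam_buf.pop(0)                      # discard camera data older than needed
    if len(cam_buf) < 2 or sync_time < cam_buf[0][0]:
        return None                         # synchronization failed: wait for next lidar frame
    (t0, b0), (t1, b1) = cam_buf[0], cam_buf[1]
    if max(sync_time - t0, t1 - sync_time) > max_infer_dt:
        cam_buf.pop(0)                      # treat as camera data loss
        return None
    return interpolate_box(sync_time, t0, b0, t1, b1)
```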
Step S3: and taking the point cloud data acquired by the laser radar as input, and partitioning the background point cloud and the foreground point cloud in the point cloud data by adopting a ground partitioning algorithm and a point cloud cutting technology.
Specifically, a voxel filtering algorithm is used to downsample the point cloud data. For example, voxel filtering may divide the input point cloud into 0.2 m x 0.2 m x 0.2 m cubes, with the centroid of each cube representing all the points inside it, thereby downsampling the input point cloud. Because the target to be tracked usually moves in front of the ground unmanned platform, the downsampled point cloud is clipped to remove points outside the road area and behind the platform. In addition, points very close to the lidar are filtered out to eliminate reflections from the ground unmanned platform itself. The ground segmentation algorithm then converts the unordered input point cloud into an ordered output: the points (x, y, z) are reduced to (x, y), and the angle α between each point and the lidar x-axis and the horizontal distance r to the lidar centre are calculated according to formulas (11) and (12):

\alpha = \arctan\left( \frac{y}{x} \right)    (11)

r = \sqrt{x^2 + y^2}    (12)
Fig. 6 shows a schematic diagram of the transverse and radial division of the laser point cloud according to another embodiment of the present disclosure. As shown in fig. 6, the point cloud is divided into sector areas according to the horizontal angular resolution of the lidar, and each sector area is divided radially into several sub-areas according to the distance r, so that the points are stored in order in the form of rays. A local slope threshold for two adjacent points on the same ray is introduced, and a straight line is fitted to each pair of adjacent points on the same ray: if the slope of the line satisfies the local slope threshold, the two points are background points (ground points); otherwise they are foreground points (non-ground points). In this way the background and foreground point clouds in the point cloud data are segmented, i.e. the ground and non-ground point cloud data are separated.
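The sketch below illustrates the voxel downsampling and the sector/ray-based slope test; Open3D is assumed for the voxel grid, the radial sub-binning is simplified to a sort along each ray with the nearest point taken as a ground seed, and the angular resolution and slope threshold are illustrative values.

```python
import numpy as np
import open3d as o3d

def segment_ground(xyz: np.ndarray, angular_res_deg=0.2, slope_thresh=0.15, voxel=0.2):
    """Return (foreground, background) point arrays, background meaning ground."""
    pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(xyz))
    xyz = np.asarray(pcd.voxel_down_sample(voxel).points)     # voxel downsampling

    alpha = np.arctan2(xyz[:, 1], xyz[:, 0])                  # formula (11)
    r = np.hypot(xyz[:, 0], xyz[:, 1])                        # formula (12)
    sector = np.floor(np.degrees(alpha) / angular_res_deg).astype(int)

    ground = np.zeros(len(xyz), dtype=bool)
    for s in np.unique(sector):                               # one "ray" per sector
        idx = np.where(sector == s)[0]
        idx = idx[np.argsort(r[idx])]                         # order points outward
        ground[idx[0]] = True                                 # nearest point as ground seed
        prev_r, prev_z = r[idx[0]], xyz[idx[0], 2]
        for i in idx[1:]:
            dr = r[i] - prev_r
            slope = abs(xyz[i, 2] - prev_z) / dr if dr > 1e-3 else np.inf
            if slope < slope_thresh:                          # locally flat: ground point
                ground[i] = True
                prev_r, prev_z = r[i], xyz[i, 2]
    return xyz[~ground], xyz[ground]
```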
Step S4: and projecting the segmented foreground point cloud to a camera imaging plane according to the mapping relation between the image data and the point cloud data, taking the foreground point cloud in the 2D boundary frame of the target as a point cloud candidate area of the target, and extracting a 3D boundary frame of the target through Euclidean clustering to realize 3D detection of the target.
Specifically, a mapping relation from a three-dimensional point cloud coordinate in a laser radar coordinate system to an image point in an image pixel coordinate system is established through the combined calibration of the laser radar and the camera in the step S2, the segmented foreground point cloud (non-ground point cloud) is projected and a data index is constructed, and the three-dimensional point cloud information is unified to the image coordinate. And temporarily assuming the point cloud falling in the 2D boundary frame of the target detection after projection as a spatial position area of the target to be tracked, carrying out Euclidean clustering on the point cloud in the current area, and simultaneously adopting a kd-tree algorithm to accelerate the data search rate so as to reduce the calculated amount. The category of the cluster center closest to the center of the 2D boundary frame of the target detection can be regarded as the point cloud of the area range of the target to be tracked, the boundary of the point cloud is the 3D boundary frame of the target to be detected, and the center position coordinate of the target to be tracked can be effectively calculated according to the mapping relation between the 2D boundary frame and the 3D boundary frame of the target detection, so that the 3D target detection is realized.
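A sketch of this projection-and-clustering step follows; DBSCAN from scikit-learn is used as a kd-tree-backed stand-in for Euclidean clustering, and the calibration matrices K, R and t are assumed to come from the joint calibration of step S2.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def detect_3d(foreground, K, R, t, box2d, eps=0.3):
    """Project foreground lidar points into the image, keep those inside one 2D box,
    cluster them and return the axis-aligned 3D box of the cluster nearest the box centre."""
    cam = (R @ foreground.T + t.reshape(3, 1)).T          # lidar -> camera coordinates
    cam = cam[cam[:, 2] > 0]                              # keep points in front of the camera
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                           # pixel coordinates

    x1, y1, x2, y2 = box2d
    inside = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    candidate = cam[inside]                               # point cloud candidate region
    if len(candidate) == 0:
        return None

    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(candidate)
    centre = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
    best, best_d = None, np.inf
    for lab in set(labels) - {-1}:                        # ignore noise points
        pts = candidate[labels == lab]
        c = K @ pts.mean(axis=0)
        d = np.linalg.norm(c[:2] / c[2] - centre)         # distance to 2D box centre
        if d < best_d:
            best, best_d = pts, d
    if best is None:
        return None
    return np.vstack([best.min(axis=0), best.max(axis=0)])  # 3D box corners (camera frame)
```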
In the 3D target detection algorithm based on camera and lidar data fusion, a YOLOv5 model is selected as the 2D target detector, and target detection and tracking are carried out on the 2D image layer to obtain the 2D bounding box of the detected target; a mapping relation between the image data collected by the camera and the point cloud data collected by the lidar is constructed according to the calibration principle of the camera and the lidar, and the two data streams are synchronized; taking the point cloud data collected by the lidar as input, the background point cloud and the foreground point cloud are segmented using a ground segmentation algorithm and point cloud clipping; and the segmented foreground point cloud is projected onto the camera imaging plane according to the mapping relation between the image data and the point cloud data, the foreground points inside the target's 2D bounding box are taken as the target's point cloud candidate region, and the target's 3D bounding box is extracted by Euclidean clustering, realizing 3D detection of the target. 3D target detection in a dynamic scene can thus be achieved without depending on the position of the guide person.
Although embodiments of the present invention have been described above, the description is provided only to facilitate understanding of the invention and is not intended to limit it. Those skilled in the art may make various changes in form and detail without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. A 3D target detection algorithm based on camera and lidar data fusion, the algorithm comprising:
selecting a YOLOv5 model as the 2D target detector, and detecting and tracking the target on the 2D image layer to obtain a 2D bounding box of the detected target;
constructing a mapping relation between the image data collected by the camera and the point cloud data collected by the lidar according to the calibration principle of the camera and the lidar, and synchronizing the image data collected by the camera and the point cloud data collected by the lidar;
taking the point cloud data collected by the lidar as input, and segmenting the background point cloud and the foreground point cloud in the point cloud data using a ground segmentation algorithm and point cloud clipping; and
projecting the segmented foreground point cloud onto the camera imaging plane according to the mapping relation between the image data and the point cloud data, taking the foreground points inside the target's 2D bounding box as the target's point cloud candidate region, and extracting the target's 3D bounding box by Euclidean clustering to realize 3D detection of the target.
2. The 3D target detection algorithm according to claim 1, wherein detecting and tracking the target on the 2D image layer to obtain the 2D bounding box of the detected target comprises:
constructing a constant-velocity motion model of the target;
predicting the target's trajectory in the next frame with a Kalman filtering algorithm according to the constant-velocity motion model, fusing the motion features and appearance features of the target in the current frame and the adjacent next frame, and calculating the similarity between the predicted trajectory and the detected trajectory of the target; and
performing association matching of targets with the Hungarian algorithm according to the direction similarity between the predicted and detected trajectories, thereby realizing detection and tracking of the target on the 2D image layer and obtaining the 2D bounding box of the detected target.
3. The 3D target detection algorithm according to claim 1, wherein synchronizing the image data collected by the camera and the point cloud data collected by the lidar comprises:
taking the sampling time of the lidar as the synchronization time node, and linearly interpolating the 2D bounding boxes detected from the camera images.
4. The 3D target detection algorithm according to claim 1, wherein taking the point cloud data collected by the lidar as input and segmenting the background point cloud and the foreground point cloud in the point cloud data comprises:
downsampling the point cloud data with a voxel filtering algorithm;
calculating the angle between each point and the lidar x-axis and the horizontal distance r to the lidar centre, and dividing the point cloud into sector areas according to the horizontal angular resolution of the lidar;
dividing each sector area radially into several sub-areas according to the horizontal distance r, so that the points are stored in order in the form of rays; and
fitting a straight line to each pair of adjacent points on the same ray: if the slope of the line satisfies the local slope threshold of the two adjacent points, the points are background points; otherwise they are foreground points, whereby the background and foreground point clouds in the point cloud data are segmented.
5. The 3D target detection algorithm according to claim 2, wherein the similarity between the predicted trajectory and the detected trajectory of the target comprises the direction similarity and the position similarity of the predicted trajectory and the detected trajectory.
CN202110972750.5A 2021-08-24 2021-08-24 3D target detection algorithm based on data fusion of camera and laser radar Pending CN113985445A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110972750.5A CN113985445A (en) 2021-08-24 2021-08-24 3D target detection algorithm based on data fusion of camera and laser radar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110972750.5A CN113985445A (en) 2021-08-24 2021-08-24 3D target detection algorithm based on data fusion of camera and laser radar

Publications (1)

Publication Number Publication Date
CN113985445A true CN113985445A (en) 2022-01-28

Family

ID=79735204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110972750.5A Pending CN113985445A (en) 2021-08-24 2021-08-24 3D target detection algorithm based on data fusion of camera and laser radar

Country Status (1)

Country Link
CN (1) CN113985445A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022262594A1 (en) * 2021-06-15 2022-12-22 同方威视技术股份有限公司 Method and apparatus for following target, robot, and computer-readable storage medium
CN114488073A (en) * 2022-02-14 2022-05-13 中国第一汽车股份有限公司 Method for processing point cloud data acquired by laser radar
CN114565644A (en) * 2022-03-02 2022-05-31 湖南中科助英智能科技研究院有限公司 Three-dimensional moving object detection method, device and equipment
CN114494248A (en) * 2022-04-01 2022-05-13 之江实验室 Three-dimensional target detection system and method based on point cloud and images under different visual angles
CN115019270A (en) * 2022-05-31 2022-09-06 电子科技大学 Automatic driving night target detection method based on sparse point cloud prior information
CN115019270B (en) * 2022-05-31 2024-04-19 电子科技大学 Automatic driving night target detection method based on sparse point cloud priori information
WO2023236872A1 (en) * 2022-06-09 2023-12-14 劢微机器人(深圳)有限公司 Unloading method based on fusion of radar and camera, and detection apparatus and storage medium
CN115546749B (en) * 2022-09-14 2023-05-30 武汉理工大学 Pavement pothole detection, cleaning and avoiding method based on camera and laser radar
CN115546749A (en) * 2022-09-14 2022-12-30 武汉理工大学 Road surface depression detection, cleaning and avoidance method based on camera and laser radar
CN115861957B (en) * 2023-01-19 2023-06-16 中国科学技术大学 Novel dynamic object segmentation method based on sensor fusion
CN115861957A (en) * 2023-01-19 2023-03-28 中国科学技术大学 Novel dynamic object segmentation method based on sensor fusion
CN116071400A (en) * 2023-04-06 2023-05-05 浙江光珀智能科技有限公司 Target track tracking method based on laser radar equipment
CN116071400B (en) * 2023-04-06 2023-07-18 浙江光珀智能科技有限公司 Target track tracking method based on laser radar equipment
CN116580066A (en) * 2023-07-04 2023-08-11 广州英码信息科技有限公司 Pedestrian target tracking method under low frame rate scene and readable storage medium
CN116580066B (en) * 2023-07-04 2023-10-03 广州英码信息科技有限公司 Pedestrian target tracking method under low frame rate scene and readable storage medium
CN117557599A (en) * 2024-01-12 2024-02-13 上海仙工智能科技有限公司 3D moving object tracking method and system and storage medium
CN117557599B (en) * 2024-01-12 2024-04-09 上海仙工智能科技有限公司 3D moving object tracking method and system and storage medium

Similar Documents

Publication Publication Date Title
CN113985445A (en) 3D target detection algorithm based on data fusion of camera and laser radar
CN112435325B (en) VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method
CN110097553B (en) Semantic mapping system based on instant positioning mapping and three-dimensional semantic segmentation
CN108564616B (en) Fast robust RGB-D indoor three-dimensional scene reconstruction method
CN112634451B (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
CN111583369B (en) Laser SLAM method based on facial line angular point feature extraction
US10366538B2 (en) Method and device for illustrating a virtual object in a real environment
CN108898676B (en) Method and system for detecting collision and shielding between virtual and real objects
WO2022188094A1 (en) Point cloud matching method and apparatus, navigation method and device, positioning method, and laser radar
EP2874097A2 (en) Automatic scene parsing
CN108229416B (en) Robot SLAM method based on semantic segmentation technology
KR20210005621A (en) Method and system for use in coloring point clouds
Wang et al. An overview of 3d object detection
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN113223045A (en) Vision and IMU sensor fusion positioning system based on dynamic object semantic segmentation
CN110223380A (en) Fusion is taken photo by plane and the scene modeling method of ground multi-view image, system, device
CN113674400A (en) Spectrum three-dimensional reconstruction method and system based on repositioning technology and storage medium
CN116449384A (en) Radar inertial tight coupling positioning mapping method based on solid-state laser radar
CN111998862A (en) Dense binocular SLAM method based on BNN
CN112651994A (en) Ground multi-target tracking method
Ferguson et al. A 2d-3d object detection system for updating building information models with mobile robots
KR100574227B1 (en) Apparatus and method for separating object motion from camera motion
CN116597122A (en) Data labeling method, device, electronic equipment and storage medium
CN116147618B (en) Real-time state sensing method and system suitable for dynamic environment
WO2023030062A1 (en) Flight control method and apparatus for unmanned aerial vehicle, and device, medium and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination