CN114463303A - Road target detection method based on fusion of binocular camera and laser radar - Google Patents

Road target detection method based on fusion of binocular camera and laser radar

Info

Publication number
CN114463303A
CN114463303A
Authority
CN
China
Prior art keywords
target
bbox
camera
radar
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210110972.0A
Other languages
Chinese (zh)
Inventor
张炳力
潘泽昊
姜俊昭
刘文涛
张成标
程进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202210110972.0A priority Critical patent/CN114463303A/en
Publication of CN114463303A publication Critical patent/CN114463303A/en
Pending legal-status Critical Current

Classifications

    • G06T 7/0002 Image analysis; Inspection of images, e.g. flaw detection
    • G06F 18/214 Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22 Pattern recognition; Matching criteria, e.g. proximity measures
    • G06F 18/23 Pattern recognition; Clustering techniques
    • G06F 18/25 Pattern recognition; Fusion techniques
    • G06N 3/04 Neural networks; Architecture, e.g. interconnection topology
    • G06N 3/08 Neural networks; Learning methods
    • G06T 5/80
    • G06T 7/11 Image analysis; Region-based segmentation
    • G06T 7/344 Image registration using feature-based methods involving models
    • G06T 7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06T 7/85 Stereo camera calibration
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/10044 Radar image
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30244 Camera pose
    • G06T 2207/30256 Lane; Road marking

Abstract

The invention provides a road target detection method based on the fusion of a binocular camera and a laser radar. The method collects front road target information with a left camera, a right camera and a laser radar; obtains binocular parallax through a binocular stereo matching algorithm; obtains image target categories and two-dimensional position information with a monocular-vision neural network; combines the binocular parallax with the monocular visual detection information to obtain a visual three-dimensional detection result for front targets; obtains a radar three-dimensional detection result for front targets through point cloud segmentation and clustering; solves the matching cost between the two sets of three-dimensional bounding boxes with the Hungarian algorithm, classifies targets according to the matching results, applies different fusion strategies, and finally outputs the supplemented and corrected road target information. The detection framework realizes the complementary advantages of the sensors and uses a target-level matching fusion strategy to output more accurate and reliable road target information.

Description

Road target detection method based on fusion of binocular camera and laser radar
Technical Field
The invention belongs to the field of intelligent vehicle environment perception, and particularly relates to a road target detection method based on fusion of a binocular camera and a laser radar.
Background
At present, intelligent driving has become a mainstream trend, and accurate, efficient environment perception is the primary task for an intelligent vehicle to realize advanced driver assistance or even automatic driving. The environment perception of intelligent vehicles currently relies mainly on on-board sensors such as cameras and radars. Road target category information can be obtained from visual images, but a monocular vision system cannot accurately obtain object distance information; a laser radar can provide three-dimensional information of road targets, but cannot reliably judge target categories.
To realize the complementary advantages of the individual sensors, target detection methods based on sensor fusion have appeared in existing solutions. However, their concrete designs merely concatenate the detection information of multiple sensors, and the accuracy and reliability of such fusion schemes still need to be improved. No better method or technique for solving or improving the above problems has yet been reported.
Disclosure of Invention
In view of the defects of and improvement requirements on existing methods, the invention aims to provide a road target detection method based on the fusion of a binocular camera and a laser radar. The method combines visual sensing technology with laser radar technology, incorporates a YOLOv4 detection model, and finally fuses the binocular vision and radar detection results, thereby providing an intelligent, more accurate and more reliable technical means for road target detection.
In order to solve the technical problem, the invention provides a road target detection method based on fusion of a binocular camera and a laser radar, which comprises the following steps:
step 1, collecting front road target information by using a left camera, a right camera and a laser radar;
step 2, obtaining binocular parallax information through a binocular stereo matching algorithm;
step 3, utilizing a visual target detection neural network based on YOLOv4 to acquire the category and two-dimensional position information of a target in the left camera image as left camera visual detection information;
step 4, acquiring a three-dimensional visual detection result of the front target by combining binocular parallax information and left camera visual detection information;
step 5, carrying out point cloud segmentation and clustering on the original point cloud obtained by the laser radar to obtain a front target radar detection result;
step 6, performing temporal and spatial registration on the vision and radar detection results, performing matching fusion on the front target detection results obtained by each, and outputting the fused front target information.
In the above technical solution, the step 1 specifically includes: a binocular camera system is formed by a left camera and a right camera arranged in parallel, which respectively collect left and right image information of the road targets ahead; point cloud information of the front road area is collected by the laser radar. The relative positions of the cameras and the radar are kept unchanged, and before the cameras and the laser radar are used to collect road information, binocular calibration is performed on the left and right cameras, and joint calibration is performed on the left camera and the laser radar. The calibration steps are specifically as follows: from the images acquired by the left and right cameras, the intrinsic parameters M_L1, M_R1 and distortion coefficients D_L, D_R of the left and right cameras are obtained with the Zhang Zhengyou calibration method and the MATLAB calibration tool. Four pairs of 3D points in the laser radar point cloud and their corresponding image pixel points are selected; using the 3D spatial point positions P_Lidar_i and the 2D image projection positions p_img_i of these points, the extrinsic parameter M_LR = [R_LR | T_LR] between the left camera and the laser radar is obtained with a PnP algorithm, completing the joint calibration.
Further, the step 2 specifically includes: stereo rectification is performed on the images acquired by the left and right cameras using the intrinsic and extrinsic parameters from binocular calibration; pixel points corresponding to the same target object in the left and right images are searched with the SGBM algorithm, and the parallax is generated on the pixel points of the left camera image; using the binocular parallax, the reprojection matrix Q_reprojection that maps an image pixel two-dimensional point [u_i, v_i]^T to a three-dimensional point [x_i, y_i, z_i]^T in the camera coordinate system is obtained based on the Bouguet algorithm.
Further, the step 3 specifically includes: a target image data set is collected in advance, the road target categories and positions in the images are labeled, and the labeled data set is input as a training set into the YOLOv4 visual target detection neural network to train the network prediction weights. The image acquired by the left camera is input into the YOLOv4 visual target detection neural network to obtain the category information class_i of the i-th target in the image and its minimum two-dimensional bounding box 2d_bbox_img_i(u_i, v_i, w_i, h_i), where u_i is the abscissa of the upper-left corner point of 2d_bbox_img_i, v_i is the ordinate of the upper-left corner point, w_i is the width of the minimum bounding box, and h_i is the height of the minimum bounding box.
Further, the step 4 specifically includes: the target visual two-dimensional information 2d_bbox_img_i obtained by the neural network in step 3 is reprojected into the camera coordinate system using the reprojection matrix Q_reprojection from step 2, obtaining the three-dimensional minimum bounding box bbox_cam_i of the i-th visual detection target in the camera coordinate system. The geometric center P_i of bbox_cam_i is taken as the target three-dimensional position and is output, together with bbox_cam_i and class_i, as the front target visual detection result.
Further, the step 5 specifically includes: the ground point cloud and noise point cloud are filtered out with the RANSAC algorithm, and the point clouds of all road targets are segmented out. All road target point clouds are grouped by a clustering algorithm into several point cloud clusters, each of which contains all the points of one road target. The minimum three-dimensional bounding box bbox_lidar_k of each target point cloud cluster is computed. The geometric center Q_k of bbox_lidar_k is taken as the target three-dimensional position and is output, together with bbox_lidar_k, as the front target radar detection result.
Further, the step 6 specifically includes: temporal registration of the vision and radar detection results is performed first; a visual bbox_cam_i and a radar bbox_lidar_k whose timestamps differ by less than 0.1 second are regarded as information at the same moment and are fused subsequently. Using the extrinsic parameter M_LR = [R_LR | T_LR] obtained from the joint calibration of the camera and the laser radar, the visual detection result is transformed into the radar coordinate system, realizing the spatial registration of the visual and radar detection results. After temporal and spatial registration, the 3D IoU matching cost between the visual bbox_cam_i and the radar bbox_lidar_k is calculated. The calculation formula is:
3DIoU_ik = |bbox_cam_i ∩ bbox_lidar_k| / |bbox_cam_i ∪ bbox_lidar_k| - d_ik^2 / c_ik^2
where bbox_cam_i ∩ bbox_lidar_k is the volume of the intersection of the three-dimensional boxes bbox_cam_i and bbox_lidar_k; bbox_cam_i ∪ bbox_lidar_k is the volume of their union; c_ik is the diagonal length of the minimum cuboid enclosing both boxes; and d_ik is the Euclidean distance between the geometric centers P_i and Q_k. The matching cost is optimally solved with the Hungarian algorithm. Based on the Hungarian matching result, the visual and laser radar detection targets are divided into three categories: detected only by vision, detected only by radar, and detected by both radar and vision. A target detected only by vision or only by radar is regarded as unsuccessfully matched, while a target detected by both radar and vision is regarded as successfully matched; for unsuccessfully matched targets, the visual or radar detection information is retained. The information of successfully matched targets is further supplemented and corrected. The supplementary correction process includes: the visual category information of the successfully matched target is taken as the category information of the fused target; vision and radar correction coefficients α and β with α + β = 1 are set, the minimum three-dimensional bounding box and the three-dimensional position information of the target are corrected by weighting, and the fused detection result is output.
The invention also provides a processor configured to run a program, wherein the program, when running, executes the above steps.
The invention also provides corresponding computer equipment, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the steps of the above embodiments are realized when the processor executes the program.
Compared with the prior art, the invention has the following advantages:
the invention provides a road target detection framework based on fusion of a binocular camera and a laser radar, provides a target level fusion method of the binocular camera and the laser radar, overcomes the defect of poor traditional detection effect, and realizes accurate and reliable advantage complementation among multiple sensors.
Monocular visual target detection based on deep learning places low demands on computing platform performance, and the cost of building and training its data set is low; depth estimation based on stereo matching algorithms is mature and easy to port. By combining binocular depth estimation with monocular neural network target detection, the method overcomes the shortcomings that monocular visual detection cannot accurately acquire target depth information and that binocular visual detection cannot efficiently perform multi-target detection.
By adopting the target-level matching fusion method, the target detection results output separately by the binocular camera and the laser radar are used for mutual correction, so that more accurate and reliable road target information is output, and the safety risk to intelligent-vehicle road environment perception caused by missed and false detections of a single sensor is reduced.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the examples serve to explain the principles of the invention and not limit the invention.
Fig. 1 is a schematic flow chart of a road target detection method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of visual output provided by an embodiment of the present invention;
fig. 3 is a schematic diagram of a fused matching cost calculation according to an embodiment of the present invention.
Detailed Description
To make the purpose, technical solution and advantages of the present application clearer, the present invention is further described below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it. The embodiments and the features of the embodiments in the present application may be combined with each other arbitrarily provided there is no conflict.
The invention provides a road target detection method for fusing a binocular camera and a laser radar, which comprises the following specific steps as shown in figure 1:
1. and collecting front road target information by using a left camera, a right camera and a laser radar.
1.1 In this step, the optical axes of the left and right camera lenses are kept parallel, their focal lengths are identical, and the baseline between them is not too large, so that the overlapping portion of the left and right images accounts for more than 80% of a single image. The positions of the binocular camera and the laser radar are fixed and their relative positions remain unchanged, guaranteeing that the three sensors share a common sensing range.
2. Perform binocular calibration of the left and right cameras and joint calibration of the left camera and the laser radar.
2.1 In this step, binocular calibration of the left and right cameras is performed with the Zhang Zhengyou calibration method and the MATLAB calibration tool to obtain the intrinsic parameters M_L1, M_R1 and distortion coefficients D_L, D_R of the left and right cameras, as well as the extrinsic parameter M_LR = [R_LR | T_LR] describing the rotation and translation between the left and right cameras, where R_LR is the rotation matrix and T_LR is the translation matrix between the left and right cameras.
2.2 In this step, the joint calibration of the left camera and the laser radar proceeds as follows: more than 4 pairs of 3D points P_Lidar_i in the laser radar point cloud and their corresponding image pixel points p_img_i (i = 1, 2, 3, ...) are selected. Using the 3D spatial point positions P_Lidar_i = [X_i, Y_i, Z_i]^T and the 2D image projection positions p_img_i = [u_i, v_i]^T, the rotation-translation transformation matrix of the left camera relative to the laser radar, i.e. the extrinsic parameter M_Lidar_img, is solved with a PnP algorithm. This yields the coordinate transformation relations among the image pixel two-dimensional point [u_i, v_i]^T, the camera coordinate system three-dimensional point [x_i, y_i, z_i]^T, and the laser radar three-dimensional point [X_i, Y_i, Z_i]^T:
[u_i, v_i]^T = M_L1 · [x_i, y_i, z_i]^T    (1)
[x_i, y_i, z_i]^T = M_Lidar_img · [X_i, Y_i, Z_i]^T    (2)
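For illustration only, the PnP solution of step 2.2 could be prototyped in Python with OpenCV as sketched below; the point correspondences, intrinsic matrix and distortion coefficients are placeholder values, and cv2.solvePnP stands in for the unspecified PnP algorithm of the patent.

```python
import cv2
import numpy as np

# Placeholder 3D-2D correspondences: lidar points P_Lidar_i (metres) and their
# projections p_img_i in the left image (pixels); at least 4 pairs are needed.
lidar_points_3d = np.array([[5.2, 1.1, 0.3], [6.8, -0.9, 0.4], [4.5, 0.2, 1.7],
                            [7.3, 1.8, 0.9], [9.1, -1.5, 0.6], [3.9, 0.8, 0.2]])
image_points_2d = np.array([[612.0, 304.0], [388.0, 295.0], [540.0, 180.0],
                            [660.0, 250.0], [352.0, 270.0], [700.0, 330.0]])

# Left-camera intrinsics M_L1 and distortion D_L from the binocular calibration.
M_L1 = np.array([[1000.0, 0.0, 640.0],
                 [0.0, 1000.0, 360.0],
                 [0.0, 0.0, 1.0]])
D_L = np.zeros(5)

# Solve for the lidar-to-camera rotation and translation, cf. equation (2).
ok, rvec, tvec = cv2.solvePnP(lidar_points_3d, image_points_2d, M_L1, D_L)
R, _ = cv2.Rodrigues(rvec)                 # 3x3 rotation matrix
M_Lidar_img = np.hstack([R, tvec])         # extrinsic [R | T]
print(M_Lidar_img)
```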
3. and acquiring depth information, namely a binocular disparity map, by using binocular calibration parameters through a binocular stereo matching algorithm.
3.1 In this step, the binocular camera images are first corrected using the distortion coefficients:
[u_rect_i, v_rect_i]^T = D · [u_i, v_i]^T    (3)
where D stands for the left and right camera distortion coefficients D_L, D_R; [u_i, v_i]^T are the pixel two-dimensional point coordinates of the left and right camera images; and [u_rect_i, v_rect_i]^T are the pixel two-dimensional point coordinates of the corrected left and right camera images.
3.2 Using the binocular calibration parameters, pixel points corresponding to the same target object in the left and right images are searched with the SGBM algorithm, and the disparity d_i = u_l - u_r is generated on the pixel points of the left camera image, where u_l and u_r are the abscissas of the pixel points of the same target object in the left and right images respectively.
3.3 Using the binocular disparity, the reprojection matrix Q_reprojection that maps an image pixel two-dimensional point [u_i, v_i]^T to a three-dimensional point [x_i, y_i, z_i]^T in the camera coordinate system is obtained based on the Bouguet algorithm, giving the reprojection relation:
[x_i, y_i, z_i]^T = Q_reprojection · [u_i, v_i, d_i]^T    (4)
where d_i is the disparity corresponding to the image pixel two-dimensional point [u_i, v_i]^T.
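As a sketch of steps 3.1-3.3, assuming already-rectified images and OpenCV's SGBM implementation; numDisparities, blockSize and the other parameters are illustrative, and in practice Q_reprojection is the Q matrix returned by cv2.stereoRectify rather than the identity placeholder used here.

```python
import cv2
import numpy as np

# Rectified grayscale image pair from step 3.1 (placeholder file names).
left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching (SGBM) as in step 3.2.
sgbm = cv2.StereoSGBM_create(minDisparity=0,
                             numDisparities=128,   # must be divisible by 16
                             blockSize=5,
                             P1=8 * 5 * 5,
                             P2=32 * 5 * 5,
                             uniquenessRatio=10,
                             speckleWindowSize=100,
                             speckleRange=2)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0  # fixed point to pixels

# Reprojection of equation (4); Q would come from cv2.stereoRectify.
Q_reprojection = np.eye(4)
points_3d = cv2.reprojectImageTo3D(disparity, Q_reprojection)    # per-pixel (x, y, z)
```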
4. Acquire the category and position information of targets in the left camera image using a visual target detection neural network based on YOLOv4.
4.1 In this step, a target image data set is collected in advance, the road target categories and positions in the images are labeled, and the labeled data set is input as a training set into the YOLOv4 visual target detection neural network to train the network prediction weights. The trained network is then used to acquire the category and position information of targets in the left camera image.
4.2 The image acquired by the left camera is input into the YOLOv4 visual target detection neural network to obtain the category information class_i of the i-th target in the image and its minimum two-dimensional bounding box 2d_bbox_img_i(u_i, v_i, w_i, h_i), where u_i is the abscissa of the upper-left corner point of 2d_bbox_img_i, v_i is the ordinate of the upper-left corner point, w_i is the width of the minimum bounding box, and h_i is the height of the minimum bounding box.
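A sketch of the inference side of step 4, assuming a Darknet-format YOLOv4 network trained on the labeled road data set of step 4.1 and an OpenCV build recent enough to parse it; the file names are placeholders.

```python
import cv2

# Placeholder paths to the trained YOLOv4 configuration and weights.
net = cv2.dnn.readNetFromDarknet("yolov4-road.cfg", "yolov4-road.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(608, 608), scale=1.0 / 255, swapRB=True)

left_image = cv2.imread("left.png")
class_ids, confidences, boxes = model.detect(left_image,
                                             confThreshold=0.5,
                                             nmsThreshold=0.4)

# Each box is (u_i, v_i, w_i, h_i): upper-left corner, width and height,
# i.e. the 2d_bbox_img_i convention of step 4.2.
for class_i, conf, (u_i, v_i, w_i, h_i) in zip(class_ids, confidences, boxes):
    print(int(class_i), float(conf), u_i, v_i, w_i, h_i)
```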
5. Combine the binocular depth map with the left camera visual detection information to obtain the front target visual detection result.
5.1 In this step, as shown in fig. 2, (a) is a schematic output of the visual target detection neural network; (b) is the disparity map obtained in step 3, with different colored areas representing different disparity values; (c) is the output visual detection result. The specific procedure is as follows: the upper-left corner point p_LT_i(u_i, v_i) and the lower-right corner point p_RD_i(u_i + w_i/2, v_i + h_i/2) of the 2d_bbox_img_i(u_i, v_i, w_i, h_i) from step 4 are extracted and reprojected into the camera coordinate system by formula (4), obtaining the upper-left corner point P_LT_i(x_lt_i, y_lt_i, z_lt_i) and the lower-right corner point P_RD_i(x_rd_i, y_rd_i, z_rd_i) of the target three-dimensional bounding box.
5.2 The upper-left corner point P_LT_i and the lower-right corner point P_RD_i form the target three-dimensional minimum bounding box bbox_cam_i of the i-th visual detection target in the camera coordinate system. The geometric center P_i of bbox_cam_i is taken as the target three-dimensional position; both, together with the target class_i, are output as the front target visual detection result.
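A sketch of the corner reprojection of steps 5.1-5.2, reusing the disparity map and Q_reprojection from the previous sketch; the corner convention (u_i + w_i/2, v_i + h_i/2) follows the patent text as written.

```python
import numpy as np

def bbox_2d_to_3d(u_i, v_i, w_i, h_i, disparity, Q_reprojection):
    """Reproject the two corner points of a 2D detection box into the camera
    coordinate system via equation (4) and return bbox_cam_i and its centre P_i."""
    corners_2d = [(u_i, v_i),                           # upper-left p_LT_i
                  (u_i + w_i / 2.0, v_i + h_i / 2.0)]   # lower-right p_RD_i
    corners_3d = []
    for (u, v) in corners_2d:
        d = float(disparity[int(v), int(u)])            # disparity at that pixel
        vec = Q_reprojection @ np.array([u, v, d, 1.0]) # homogeneous reprojection
        corners_3d.append(vec[:3] / vec[3])
    P_LT_i, P_RD_i = corners_3d
    P_i = (P_LT_i + P_RD_i) / 2.0                       # geometric centre
    return (P_LT_i, P_RD_i), P_i
```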
6. Perform point cloud segmentation and clustering on the raw point cloud obtained by the laser radar to obtain the front target radar detection result.
6.1 In this step, the ground point cloud and noise points are filtered out of the raw laser radar point cloud with the RANSAC algorithm, and the point clouds of all road targets are segmented out. All road target point clouds are grouped by a clustering algorithm into several point cloud clusters, each of which contains all the points of one road target: Pt_k{(X_k1, Y_k1, Z_k1), (X_k2, Y_k2, Z_k2), ..., (X_kj, Y_kj, Z_kj)}, where k denotes the k-th target and (X_kj, Y_kj, Z_kj) are the three-dimensional coordinates of the j-th point of the k-th target.
6.2 The corner point coordinates of the minimum bounding box of each target point cloud cluster are calculated:
X_LT_k = max(X_k1, X_k2, ..., X_kj)    (5)
X_RD_k = min(X_k1, X_k2, ..., X_kj)    (6)
and in the same way the upper-left corner point Q_LT_k(X_LT_k, Y_LT_k, Z_LT_k) and the lower-right corner point Q_RD_k(X_RD_k, Y_RD_k, Z_RD_k) of the minimum bounding box of the k-th target point cloud cluster are obtained.
6.3 The upper-left corner point Q_LT_k and the lower-right corner point Q_RD_k form the target three-dimensional minimum bounding box bbox_lidar_k of the k-th radar detection target in the radar coordinate system. The geometric center Q_k of bbox_lidar_k is taken as the target three-dimensional position, and both are output as the front target radar detection result.
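Steps 6.1-6.3 might be prototyped with Open3D as follows; the patent only names RANSAC and "a clustering algorithm", so the DBSCAN call and all numeric thresholds here are assumptions of this illustration.

```python
import numpy as np
import open3d as o3d

# Raw lidar scan as an N x 3 array of (X, Y, Z) points (placeholder file).
points = np.load("scan.npy")
pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(points))

# Step 6.1: RANSAC plane fit removes the ground; the remainder are road targets.
_, ground_idx = pcd.segment_plane(distance_threshold=0.2, ransac_n=3,
                                  num_iterations=200)
targets = pcd.select_by_index(ground_idx, invert=True)

# Cluster the remaining points into point cloud clusters Pt_k (DBSCAN here).
labels = np.array(targets.cluster_dbscan(eps=0.6, min_points=10))
target_pts = np.asarray(targets.points)

# Steps 6.2-6.3: axis-aligned minimum bounding box and centre Q_k per cluster.
for k in range(labels.max() + 1):
    cluster = target_pts[labels == k]
    corner_min, corner_max = cluster.min(axis=0), cluster.max(axis=0)
    bbox_lidar_k = (corner_min, corner_max)
    Q_k = (corner_min + corner_max) / 2.0
```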
7. Perform temporal and spatial registration of the vision and radar detection results, match and fuse the front target detection results obtained by each, and output the fused front target information.
7.1 In this step, temporal registration of the vision and radar detection results is performed first; a visual bbox_cam_i and a radar bbox_lidar_k whose timestamps differ by less than 0.1 second are regarded as information at the same moment and are fused subsequently.
7.2 Using equation (2), bbox_cam_i located in the camera coordinate system is transformed into the radar coordinate system, realizing the spatial registration of the vision and radar detection results.
7.3 After temporal and spatial registration, the following matching cost is calculated between the visual bbox_cam_i and the radar bbox_lidar_k:
3DIoU_ik = |bbox_cam_i ∩ bbox_lidar_k| / |bbox_cam_i ∪ bbox_lidar_k| - d_ik^2 / c_ik^2    (7)
as shown in fig. 3, wherein, 3DIOUikMatching costs for the three-dimensional target frame of the ith visual detection target result and the kth radar detection target result; bboxcam_i I bboxlidar_kIs bboxcam_iAnd bboxlidar_kThe three-dimensional box volume of the intersection; bboxcam_i I bboxlidar_kIs bboxcam_iAnd bboxlidar_kA three-dimensional frame volume of the union; c. CikIs bboxcam_iAnd bboxlidar_kThe minimum length of the diagonal line surrounding the cuboid; d is a radical ofikIs a geometric center PiAnd QkThe euclidean distance of (c). Using Hungarian algorithm to match cost 3DIOUikOptimizing and solving to obtain a matching result with minimum cost
Figure RE-GDA0003566105140000062
Wherein assignnIs the radar detection target result which is globally and optimally matched with the nth visual detection target.
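A sketch of the cost matrix and Hungarian assignment of step 7.3, assuming axis-aligned boxes given as (min_corner, max_corner) pairs of NumPy arrays and using SciPy's linear_sum_assignment as the Hungarian solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def diou_3d(box_a, box_b):
    """Score of equation (7) for two axis-aligned 3D boxes given as (min, max)."""
    a_min, a_max = box_a
    b_min, b_max = box_b
    overlap = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0, None)
    inter_vol = overlap.prod()
    union_vol = (a_max - a_min).prod() + (b_max - b_min).prod() - inter_vol
    # c_ik: diagonal of the smallest cuboid enclosing both boxes.
    c_ik = np.linalg.norm(np.maximum(a_max, b_max) - np.minimum(a_min, b_min))
    # d_ik: distance between the geometric centres P_i and Q_k.
    d_ik = np.linalg.norm((a_min + a_max) / 2 - (b_min + b_max) / 2)
    return inter_vol / union_vol - d_ik ** 2 / c_ik ** 2

def match(cam_boxes, lidar_boxes):
    """Globally optimal assignment over the 3DIoU_ik matrix via the Hungarian algorithm."""
    score = np.array([[diou_3d(c, l) for l in lidar_boxes] for c in cam_boxes])
    rows, cols = linear_sum_assignment(-score)   # maximize the total score
    return list(zip(rows, cols)), score
```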
7.4 After the matching result is obtained, the visual and laser radar detection targets are divided into three categories: detected only by vision, detected only by radar, and detected by both radar and vision; the classification judgment is based on Table 1 below. A successfully matched target is one detected by both radar and vision, and its vision and radar detection information is combined, supplemented, and corrected: the successfully matched target information is corrected with formulas (8) and (9), with correction coefficients α = 0.35 and β = 0.65; the matched visual class_i is assigned to the target, together with the corrected three-dimensional position F_i and the corrected three-dimensional minimum bounding box bbox_fusion_i. For a target that is not successfully matched, the visual or radar detection information is retained; the fusion strategy is shown in Table 1.
F_i = α·P_i + β·Q_k    (8)
bbox_fusion_i = α·bbox_cam_i + β·bbox_lidar_k    (9)
TABLE 1
Matching result | Sensor(s) detecting the target | Fusion strategy
Unsuccessful | Vision only | Retain the visual detection result (class_i, P_i, bbox_cam_i)
Unsuccessful | Radar only | Retain the radar detection result (Q_k, bbox_lidar_k)
Successful | Radar and vision | Take the visual class_i; correct the position and bounding box by formulas (8) and (9)
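Continuing the same sketch, the strategy of Table 1 together with equations (8) and (9) might be applied as below; the α = 0.35, β = 0.65 weights follow the embodiment, while the detection dictionaries and the score gate for rejecting weak Hungarian pairings are assumptions of this illustration.

```python
ALPHA, BETA = 0.35, 0.65   # vision / radar correction coefficients, ALPHA + BETA = 1

def fuse(matches, score, cam_dets, lidar_dets, min_score=0.0):
    """Table 1: keep unmatched single-sensor detections, and for matched pairs
    take the visual class and blend the geometry per equations (8) and (9)."""
    fused, used_cam, used_lidar = [], set(), set()
    for i, k in matches:
        if score[i, k] < min_score:               # assumed gate on weak pairings
            continue
        used_cam.add(i)
        used_lidar.add(k)
        cam, lid = cam_dets[i], lidar_dets[k]
        fused.append({
            "class": cam["class"],                                   # class_i from vision
            "center": ALPHA * cam["center"] + BETA * lid["center"],  # equation (8)
            "bbox": ALPHA * cam["bbox"] + BETA * lid["bbox"],        # equation (9)
        })
    # Rows 1-2 of Table 1: unmatched targets keep their single-sensor information.
    fused += [cam_dets[i] for i in range(len(cam_dets)) if i not in used_cam]
    fused += [lidar_dets[k] for k in range(len(lidar_dets)) if k not in used_lidar]
    return fused
```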
The invention also provides a programmable processor of any type (FPGA, ASIC or other integrated circuit) for running a program, wherein the program performs the steps of the above embodiments when running.
The invention also provides corresponding computer equipment, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the steps in the above embodiment are realized when the processor executes the program.
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the scope of the present invention should be determined by the following claims.

Claims (10)

1. A road target detection method based on binocular camera and laser radar fusion is characterized by comprising the following steps:
step 1, collecting front road target information by using a left camera, a right camera and a laser radar;
step 2, obtaining binocular parallax information through a binocular stereo matching algorithm;
step 3, utilizing a visual target detection neural network based on YOLOv4 to acquire the category and two-dimensional position information of a target in the left camera image as left camera visual detection information;
step 4, combining binocular parallax information and left camera vision detection information to obtain a three-dimensional vision detection result of the front target;
step 5, carrying out point cloud segmentation and clustering on the original point cloud obtained by the laser radar to obtain a front target radar detection result;
step 6, performing temporal and spatial registration on the visual detection result and the radar detection result, performing matching fusion on the front target detection results obtained by each, and outputting the fused front target information.
2. The method of claim 1, wherein step 1 comprises:
a binocular camera system is formed by a left camera and a right camera which are arranged in parallel, and left image information and right image information of a front road target are respectively collected; collecting point cloud information of a front road area through a laser radar;
the relative positions of the camera and the radar are kept unchanged, and before the camera and the laser radar are used for collecting road information, the left camera and the right camera are calibrated in a binocular mode, and the left camera and the laser radar are calibrated in a combined mode.
3. The method of claim 2, wherein the calibrating step specifically comprises:
according to the images collected by the left and right cameras, obtaining the intrinsic parameters M_L1, M_R1 and distortion coefficients D_L, D_R of the left and right cameras with the Zhang Zhengyou calibration method and the MATLAB calibration tool;
selecting 4 pairs of 3D points in the laser radar point cloud and their corresponding image pixel points, and using the 3D spatial point positions P_Lidar_i and the 2D image projection positions p_img_i of these points, obtaining the extrinsic parameter M_LR = [R_LR | T_LR] between the left camera and the laser radar with a PnP algorithm, completing the joint calibration.
4. The method of claim 1, wherein the step 2 comprises: performing stereo rectification on the images acquired by the left and right cameras using the intrinsic and extrinsic parameters from binocular calibration; searching pixel points corresponding to the same target object in the left and right images with an SGBM algorithm, and generating the parallax on the pixel points of the left camera image; and using the binocular parallax, obtaining, based on the Bouguet algorithm, the reprojection matrix Q_reprojection from an image pixel two-dimensional point [u_i, v_i]^T to a three-dimensional point [x_i, y_i, z_i]^T in the camera coordinate system.
5. The method of claim 1, wherein step 3 comprises:
acquiring a target image data set in advance, labeling the road target type and position in the image, inputting the labeled data set serving as a training set into a YOLOv4 visual target detection neural network, and training out a network prediction weight;
inputting the image acquired by the left camera into the YOLOv4 visual target detection neural network to obtain the category information class_i of the i-th target in the image and its minimum two-dimensional bounding box 2d_bbox_img_i(u_i, v_i, w_i, h_i), wherein u_i is the abscissa of the upper-left corner point of 2d_bbox_img_i, v_i is the ordinate of the upper-left corner point, w_i is the width of the minimum bounding box, and h_i is the height of the minimum bounding box.
6. The method of claim 4, wherein step 4 comprises: reprojecting the target visual two-dimensional information 2d_bbox_img_i obtained by the neural network in step 3 into the camera coordinate system using the reprojection matrix Q_reprojection of step 2, to obtain the three-dimensional minimum bounding box bbox_cam_i of the i-th visual detection target in the camera coordinate system; and taking the geometric center P_i of bbox_cam_i as the target three-dimensional position and outputting it, together with bbox_cam_i and class_i, as the front target visual detection result.
7. The method of claim 1, wherein the step 5 comprises: filtering out the ground point cloud and noise point cloud with the RANSAC algorithm, and segmenting out the point clouds of all road targets; clustering all road target point clouds into several point cloud clusters with a clustering algorithm, each point cloud cluster containing all the points of one road target; calculating the minimum three-dimensional bounding box bbox_lidar_k of each target point cloud cluster; and taking the geometric center Q_k of bbox_lidar_k as the target three-dimensional position and outputting it, together with bbox_lidar_k, as the front target radar detection result.
8. The method of claim 1, wherein the step 6 comprises:
performing temporal registration of the vision and radar detection results first, wherein a visual bbox_cam_i and a radar bbox_lidar_k whose timestamp difference is less than a preset time are regarded as information at the same moment and are fused subsequently;
transforming the visual detection result into the radar coordinate system using the extrinsic parameter M_LR = [R_LR | T_LR] obtained by the joint calibration of the camera and the laser radar, realizing the spatial registration of the visual and radar detection results;
and performing 3D IoU matching cost calculation between the temporally and spatially registered visual bbox_cam_i and radar bbox_lidar_k.
9. The method of claim 8, wherein the matching cost calculation formula is:
3DIoU_ik = |bbox_cam_i ∩ bbox_lidar_k| / |bbox_cam_i ∪ bbox_lidar_k| - d_ik^2 / c_ik^2
wherein bbox_cam_i ∩ bbox_lidar_k is the volume of the intersection of the three-dimensional boxes bbox_cam_i and bbox_lidar_k; bbox_cam_i ∪ bbox_lidar_k is the volume of their union; c_ik is the diagonal length of the minimum cuboid enclosing both boxes; and d_ik is the Euclidean distance between the geometric centers P_i and Q_k.
10. The method of claim 8, wherein the matching results are used to classify the visual and laser radar detection targets into three categories: detected only by vision, detected only by radar, and detected by both radar and vision; a target detected only by vision or only by radar is regarded as unsuccessfully matched, a target detected by both radar and vision is regarded as successfully matched, the visual or radar detection information is retained for an unsuccessfully matched target, and the information of a successfully matched target is further supplemented and corrected.
CN202210110972.0A 2022-01-29 2022-01-29 Road target detection method based on fusion of binocular camera and laser radar Pending CN114463303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210110972.0A CN114463303A (en) 2022-01-29 2022-01-29 Road target detection method based on fusion of binocular camera and laser radar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210110972.0A CN114463303A (en) 2022-01-29 2022-01-29 Road target detection method based on fusion of binocular camera and laser radar

Publications (1)

Publication Number Publication Date
CN114463303A true CN114463303A (en) 2022-05-10

Family

ID=81410690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210110972.0A Pending CN114463303A (en) 2022-01-29 2022-01-29 Road target detection method based on fusion of binocular camera and laser radar

Country Status (1)

Country Link
CN (1) CN114463303A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830079A (en) * 2023-02-15 2023-03-21 天翼交通科技有限公司 Method, device and medium for tracking trajectory of traffic participant
CN117372987A (en) * 2023-12-08 2024-01-09 山东高速工程检测有限公司 Road three-dimensional data processing method and device, storage medium and electronic equipment
CN117372987B (en) * 2023-12-08 2024-01-30 山东高速工程检测有限公司 Road three-dimensional data processing method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110942449B (en) Vehicle detection method based on laser and vision fusion
CN111951305B (en) Target detection and motion state estimation method based on vision and laser radar
CN112396650B (en) Target ranging system and method based on fusion of image and laser radar
CN110555407B (en) Pavement vehicle space identification method and electronic equipment
CN109410264B (en) Front vehicle distance measuring method based on laser point cloud and image fusion
CN110288659B (en) Depth imaging and information acquisition method based on binocular vision
CN109263637B (en) Collision prediction method and device
CN114463303A (en) Road target detection method based on fusion of binocular camera and laser radar
CN114359181B (en) Intelligent traffic target fusion detection method and system based on image and point cloud
CN108645375B (en) Rapid vehicle distance measurement optimization method for vehicle-mounted binocular system
CN107796373B (en) Distance measurement method based on monocular vision of front vehicle driven by lane plane geometric model
CN115032651A (en) Target detection method based on fusion of laser radar and machine vision
CN113205604A (en) Feasible region detection method based on camera and laser radar
WO2022183685A1 (en) Target detection method, electronic medium and computer storage medium
CN113743391A (en) Three-dimensional obstacle detection system and method applied to low-speed autonomous driving robot
CN111723778B (en) Vehicle distance measuring system and method based on MobileNet-SSD
CN113920183A (en) Monocular vision-based vehicle front obstacle distance measurement method
CN111323767B (en) System and method for detecting obstacle of unmanned vehicle at night
CN115546741A (en) Binocular vision and laser radar unmanned ship marine environment obstacle identification method
CN115187941A (en) Target detection positioning method, system, equipment and storage medium
CN113327296B (en) Laser radar and camera online combined calibration method based on depth weighting
CN113988197A (en) Multi-camera and multi-laser radar based combined calibration and target fusion detection method
CN110197104B (en) Distance measurement method and device based on vehicle
CN114298151A (en) 3D target detection method based on point cloud data and image data fusion
CN111951339A (en) Image processing method for performing parallax calculation by using heterogeneous binocular cameras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination