CN115032651B - Target detection method based on laser radar and machine vision fusion - Google Patents

Target detection method based on laser radar and machine vision fusion

Info

Publication number
CN115032651B
Authority
CN
China
Prior art keywords
frame
target
information
camera
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210630026.9A
Other languages
Chinese (zh)
Other versions
CN115032651A (en)
Inventor
张炳力
王怿昕
姜俊昭
徐雨强
王欣雨
王焱辉
杨程磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202210630026.9A priority Critical patent/CN115032651B/en
Publication of CN115032651A publication Critical patent/CN115032651A/en
Application granted granted Critical
Publication of CN115032651B publication Critical patent/CN115032651B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/66Tracking systems using electromagnetic waves other than radio waves
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/4802Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • G06T2207/10044Radar image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

The invention discloses a target detection method based on the fusion of laser radar and machine vision, which comprises the following steps: 1. mounting a laser radar and a camera at the corresponding positions on the vehicle; 2. processing the point cloud acquired by the laser radar to output radar detection frames; 3. processing the images acquired by the camera to output visual detection frames; 4. performing space-time synchronization on the information processed by the laser radar and the camera; 5. performing data association on the synchronized information to obtain association pairs; 6. fusing the obtained association pairs, tracking the fused targets, and outputting the final fusion result by integrating target information over consecutive frames. The invention avoids the large numbers of false detections and missed detections that arise during data association and fusion in target detection based on multi-sensor fusion, thereby ensuring accurate assessment of the perceived environment and accurate execution of planning and control.

Description

Target detection method based on laser radar and machine vision fusion
Technical Field
The invention relates to the technical field of environment sensing based on multi-sensor fusion, in particular to a target detection method based on laser radar and machine vision fusion.
Background
Perception is the most fundamental and also the most important part of autonomous driving technology: the accuracy and timeliness with which the targets around the vehicle are understood directly determine the overall capability of the autonomous driving system. Because each sensor is limited by its own working principle when executing perception tasks, a single sensor cannot acquire accurate and comprehensive obstacle information, which makes research on multi-sensor fusion technology necessary.
Commonly used data fusion methods can be divided into pre-fusion and post-fusion. Pre-fusion comprises data-level fusion and feature-level fusion, while post-fusion is mainly decision-level fusion.
If pre-fusion is selected, both data-level fusion and feature-level fusion depend on a deep learning framework, which makes the network architecture more complex and raises the requirements on the GPU. For post-fusion, a decision-level fusion method needs a comprehensive fusion strategy to handle target recognition in various scenes; most methods form the region of interest from the vision result, so unusual obstacles are easily missed, and the fused targets are not further processed to resolve the problems of missed and false detections.
Specifically, Park et al. use dense stereo disparities and point clouds in a two-stage convolutional neural network to generate a high-resolution dense disparity map: fused disparities are produced from the lidar and stereo disparities, combined with the image in feature space, and used to predict the final high-resolution disparity and reconstruct the 3D scene; the approach is limited by the need for large-scale labeled stereo-lidar data sets. Liang et al. realize point-wise fusion through a continuous convolution fusion layer that connects image features of different scales with point cloud features at multiple stages of the network: K nearest neighbors are first extracted for each pixel, the points are projected onto the image to look up the relevant image features, and the fused feature vectors are weighted by their geometric offset from the target pixel before being fed into the neural network. However, when the radar resolution is low or the distance is long, such point-wise fusion cannot make full use of the high-resolution image.
Disclosure of Invention
To address the problems of existing methods, the invention provides a target detection method based on the fusion of laser radar and machine vision, which fuses multi-sensor information during target detection, thereby ensuring accurate assessment of the perceived environment and accurate execution of planning and control.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention discloses a target detection method based on fusion of a laser radar and machine vision, which is characterized by comprising the following steps of:
A. A solid-state laser radar is arranged at the front bumper of the vehicle and a camera at the front windshield of the vehicle; the advancing direction of the vehicle is taken as the Z axis, the direction pointing to the driver's left as the X axis, and the vertically upward direction as the Y axis; the laser emission center of the laser radar is taken as the origin O_l to establish the laser radar coordinate system O_l-X_lY_lZ_l, and the optical center of the camera is taken as the origin O_c to establish the camera coordinate system O_c-X_cY_cZ_c; the O-XZ planes of the two coordinate systems are kept parallel to the ground;
B. Processing each frame of point cloud information acquired by the laser radar comprises: first, performing ground point cloud segmentation on the point cloud by a multi-plane fitting method, extracting road edge points from the segmentation result, and sequentially performing curve fitting, filtering and down-sampling on the extracted road edge points to obtain the region of interest of each frame; then clustering the point cloud within the region of interest to obtain the targets of each frame, and identifying each clustered target with a three-dimensional detection frame. The q-th clustered target of the p-th frame is marked with the q-th three-dimensional detection frame B_p^q = (x_p^q, y_p^q, z_p^q, w_p^q, l_p^q, h_p^q), where x_p^q, y_p^q and z_p^q denote the x-, y- and z-axis coordinates of the center point of the q-th three-dimensional detection frame in the p-th frame, and w_p^q, l_p^q and h_p^q denote its width, length and height; the face of the three-dimensional detection frame closest to the laser radar is selected as a two-dimensional detection frame to characterize the q-th clustered target of the p-th frame, thereby obtaining a point cloud data set with detection frames;
C. constructing a yolov5 model by adopting a convolution attention module, training the yolov5 model by utilizing a road vehicle image data set to obtain a trained yolov5 model, processing each frame of image information acquired by the camera by utilizing the trained yolov5 model, and outputting a detection frame of each target in each frame of image information and coordinate, size, category and confidence information thereof, thereby obtaining an image information set with the detection frame;
D. Performing space-time synchronization on the point cloud information set and the image information set, including: the laser radar signal is taken as the reference of the registration frequency, and the timestamps of the laser radar and the camera are aligned by interpolation, so that the point cloud information set of the laser radar and the image information set of the camera at the same moment are obtained; the camera is then calibrated to obtain its intrinsic parameters, and the camera and the laser radar are jointly calibrated to obtain the extrinsic parameters, so that the two-dimensional detection frame in the laser radar coordinate system is projected into the pixel coordinate system, obtaining the projected two-dimensional detection frame b^q = (x_b^q, y_b^q, w_b^q, h_b^q), where x_b^q and y_b^q denote the x- and y-axis coordinates of the center point of the q-th projected two-dimensional detection frame, and w_b^q and h_b^q denote its width and height;
E. carrying out data association on the information after time-space synchronization to obtain an association pair:
E1. Set the association threshold to r_th; define a variable i to denote the frame number of the laser radar after time synchronization with the camera, a variable j to denote the index of the current target contained in the laser radar point cloud observation of the i-th frame, and a variable k to denote the index of the current target contained in the camera image observation of the i-th frame; initialize i = 1;
E2. Initialize j = 1; take the coordinate and size information of the j-th projected two-dimensional detection frame in the point cloud data set of the i-th laser radar frame as the j-th radar target observation L_i^j of the i-th frame, and take the three-dimensional detection frame corresponding to L_i^j as the j-th clustered three-dimensional detection frame B_i^j of the i-th frame;
E3. Initialize k = 1; take the coordinates, size, category and confidence information of the k-th detection frame in the image information set of the i-th camera frame as the k-th camera target observation C_i^k of the i-th frame, where x_i^k and y_i^k denote the x- and y-axis coordinates of the center point of the k-th detection frame, w_i^k and h_i^k its width and height, cls_i^k the category of the detected target, and conf_i^k its confidence;
E4. Calculate the Euclidean distance d_jk between the j-th laser radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame;
E5. Judge whether d_jk ≤ r_th; if so, the detection target of the laser radar is successfully matched with the detection target of the camera, and the j-th radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame form an association pair; otherwise the matching fails;
E6. Assign k+1 to k and return to step E3 until all camera target observations of the i-th frame have been traversed; then assign j+1 to j and return to step E2 until all targets of the i-th frame have been traversed;
E7. Calculate the intersection-over-union IOU_jk of the association pair formed by the j-th radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame, and compare it with the set threshold IOU_th; if IOU_jk ≥ IOU_th, the corresponding association pair of the i-th frame is correct and is output, otherwise the association pair is discarded; return to E7 to calculate the next association pair of the i-th frame until all correct association pairs of the i-th frame have been output;
F. Data fusion is performed on all correct association pairs of the i-th frame to obtain the fused target detection information of the i-th frame, including: if the m-th radar target observation L_i^m and the n-th camera target observation C_i^n of the i-th frame form an association pair, the x-axis coordinate x_i^m, y-axis coordinate y_i^m, z-axis coordinate z_i^m, length l_i^m and width w_i^m of the three-dimensional detection frame corresponding to the m-th radar target observation of the i-th frame, together with the category cls_i^n and confidence conf_i^n of the n-th camera target observation, are taken directly as the fused partial target detection information of the association pair; the n-th camera target observation frame is then converted into the radar coordinate system using the camera intrinsic and extrinsic parameters obtained in step D, so that the projection h_i^n of the n-th camera target observation height in the radar coordinate system is obtained and used as the fused target detection height compensation information of the association pair; the fused partial target detection information and the target detection height compensation information together constitute the fused target detection information;
G. tracking each target in the target detection information fused in the ith frame and outputting a target detection result.
2. The target detection method based on laser radar and machine vision fusion according to claim 1, wherein in E5, if the Euclidean distance d_jk between the j-th radar target observation L_i^j of the i-th frame and every camera target observation C_i^k of the i-th frame is greater than r_th, the j-th radar target observation L_i^j of the i-th frame is output and tracked;
if the corresponding radar target observation L_{i+1}^j is detected in the (i+1)-th frame and the Euclidean distance between L_{i+1}^j and the k-th camera target observation C_{i+1}^k of the (i+1)-th frame satisfies d_jk ≤ r_th, the target of the j-th radar target observation L_i^j is considered to have been successfully detected.
Compared with the prior art, the invention has the beneficial effects that:
1. Aiming at the large numbers of false detections and missed detections that occur during data association and fusion in target detection based on multi-sensor fusion, the method takes an accurate fusion of laser radar and image information as its goal. Multi-target point cloud data are first collected with the laser radar, and radar detection frames are generated after ground point cloud segmentation, region-of-interest extraction and clustering; machine vision detection frames are then generated with a yolov5 algorithm improved by a convolutional attention module, and the laser radar and machine vision detections are associated through a reasonably chosen threshold to obtain association pairs. Compared with the NN (nearest-neighbor) algorithm, whose weak anti-interference capability makes association errors likely, the method computes the intersection-over-union (IOU) of each association pair and checks it against a threshold: if the threshold is met the pair is output, otherwise a sub-optimal association is selected and the IOU is recomputed until the threshold is met, yielding accurate association pairs. Using the IOU effectively avoids the erroneous associations produced by the NN algorithm during data association, which improves the accuracy of target detection based on multi-sensor fusion and ensures accurate execution of planning and control.
2. The invention provides a decision method for the case in which the laser radar and machine vision detections cannot be matched during data fusion, which further screens targets that fail to match during data association and thereby reduces the probability of missed targets in the fusion process.
3. The invention provides a target fusion method based on laser radar and machine vision. Information that only a single sensor can output is added directly to the fused target; the position and width of the object are obtained directly from the laser radar, and the height is dynamically compensated by converting the pixel frame into the radar coordinate system, with the depth provided by the laser radar serving as the basis for projecting the detection frame from the pixel coordinate system into the camera coordinate system. Compared with the method of M. Liang et al., the method compensates the laser radar's height information with image information, solving the problem that point-wise fusion cannot make full use of the high-resolution image.
Drawings
FIG. 1 is an overall flow chart of a target detection method based on laser radar and machine vision fusion in accordance with the present invention;
FIG. 2a is a view of a laser radar detection scene of the present invention;
FIG. 2b is a diagram showing the detection effect of the laser radar according to the present invention;
FIG. 3 is a diagram showing the machine vision inspection effect of the present invention;
FIG. 4 is a time synchronization schematic diagram of the present invention;
FIG. 5 is a graph showing the combined calibration effect of the laser radar and the camera of the invention;
FIG. 6 is a diagram of the possible association situations when targets remain unmatched in the present invention;
FIG. 7 is a diagram of the decision method in a scene where target matching is unsuccessful in the present invention;
FIG. 8 is a diagram of a data fusion method according to the present invention.
Detailed Description
In this embodiment, a target detection method based on fusion of laser radar and machine vision, as shown in fig. 1, includes the following steps:
A. A solid-state laser radar is arranged at the front bumper of the vehicle and a camera at the front windshield of the vehicle; the advancing direction of the vehicle is taken as the Z axis, the direction pointing to the driver's left as the X axis, and the vertically upward direction as the Y axis; the laser emission center of the laser radar is taken as the origin O_l to establish the laser radar coordinate system O_l-X_lY_lZ_l, and the optical center of the camera is taken as the origin O_c to establish the camera coordinate system O_c-X_cY_cZ_c; the O-XZ planes of the two coordinate systems are kept parallel to the ground;
B. processing the point cloud information acquired by the laser radar, including:
B1. Ground point cloud segmentation is performed on the point cloud by a multi-plane fitting method: each frame of laser point cloud is divided into several regions along the driving direction of the vehicle, and the average value RPA (region point average) of the lowest points in each region is calculated to suppress the influence of noise points. A height threshold h_th is set, and the points whose height is within h_th of the RPA are taken as the seed point set. A plane is then fitted to the seed points using a simple linear plane model, as shown in formula (1):
Ax+By+Cz+D=0 (1)
in the formula (1), (A, B, C) is a normal vector of the plane, and D is a distance required for translating the plane to the origin of coordinates;
This yields an initial plane model. A distance threshold D_th = 0.2 m is set, and the distance d between each point in the region and the plane is calculated from the point-to-plane distance formula of solid geometry, as shown in formula (2):
d = |Ax + By + Cz + D| / √(A² + B² + C²) (2)
In formula (2), x, y and z are the three-dimensional coordinates of the point. If d < D_th, the point is added to the ground point set; otherwise it is regarded as a non-ground point. The obtained ground points are used as the initial set for the next iteration, and the segmentation of the ground point cloud is completed after 3 optimization iterations;
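For illustration, a minimal Python sketch of the iterative plane fitting in B1 is given below; the least-squares plane fit, the per-frame (rather than per-region) seeding, and every parameter value except D_th = 0.2 m and the 3 iterations are assumptions not fixed by the text.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane Ax + By + Cz + D = 0 through an (N, 3) point set."""
    centroid = points.mean(axis=0)
    _, _, vh = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vh[-1]                              # (A, B, C): direction of least variance
    d = -normal.dot(centroid)                    # D
    return normal, d

def segment_ground(points, h_th=0.3, d_th=0.2, n_iter=3, n_lowest=50):
    """Return a boolean mask marking ground points in one lidar frame (N, 3)."""
    height = points[:, 1]                        # Y is vertical: the O-XZ plane is parallel to the ground
    rpa = np.sort(height)[:n_lowest].mean()      # average of the lowest points (RPA)
    seeds = points[height < rpa + h_th]          # seed set within h_th of the RPA

    ground_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):                      # 3 optimization iterations, as in B1
        normal, d = fit_plane(seeds)
        # point-to-plane distance, formula (2)
        dist = np.abs(points @ normal + d) / np.linalg.norm(normal)
        ground_mask = dist < d_th                # D_th = 0.2 m
        seeds = points[ground_mask]              # ground points seed the next iteration
    return ground_mask
```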
B2. Road edge points are extracted from the segmentation result and sequentially subjected to curve fitting, filtering and down-sampling to obtain the region of interest of each frame. The region of interest is extracted because, among all invalid target information, the invalid point cloud targets that occupy the largest share and most affect target detection are pedestrians on the sidewalk and trees and buildings on both sides of the road in the y-axis direction; on a structured urban road, the road edge separates the drivable region from the non-drivable region, and the dense point cloud of the laser radar is well suited to identifying the road edge so as to obtain the region of interest (ROI). Road edge candidate points are then extracted by exploiting the abrupt change between two adjacent points on the same scan line at the road edge, and are classified into left and right road edges according to the sign of their y coordinates: a point with a positive value is added to the left road edge points, and a point with a negative value to the right road edge points. Finally, the left and right road edges are fitted with the linear model of RANSAC using the extracted road edge points, completing the extraction of the region of interest;
B3. The point cloud within the region of interest is clustered to obtain the targets of each frame, and each clustered target is identified with a three-dimensional detection frame. The q-th clustered target of the p-th frame is marked with the q-th three-dimensional detection frame B_p^q = (x_p^q, y_p^q, z_p^q, w_p^q, l_p^q, h_p^q), where x_p^q, y_p^q and z_p^q are the coordinates of the center point of the q-th three-dimensional detection frame in the p-th frame and w_p^q, l_p^q and h_p^q are its width, length and height; the face of the three-dimensional detection frame closest to the laser radar is selected as the two-dimensional detection frame characterizing the q-th clustered target of the p-th frame, yielding the point cloud data set with detection frames. The clustering in step B3 is completed with the DBSCAN algorithm. With a single fixed threshold, distant targets may fail to cluster, while a large threshold merges two closely spaced objects into one cluster; therefore different epsilon thresholds are set to improve the clustering effect. Considering that the horizontal angular resolution of the laser radar is generally higher than its vertical angular resolution, a distance-adaptive threshold ε_th is set from the vertical angular resolution and determined by formula (3):
ε_th = k·h (3)
In formula (3), k = 1.1 is an amplification factor and h is the vertical spacing between two scan lines at the given distance from the laser radar. The clustered targets are obtained and each is framed by the detection frame nearest to the radar to represent the target information; FIG. 2a shows a detection scene and FIG. 2b the corresponding detection effect of the processing output;
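A clustering sketch in the spirit of formula (3) follows; the band-by-band use of scikit-learn's DBSCAN and the scan-line-spacing estimate h ≈ r·tan(Δθ_v) are assumptions, and V_RES is a placeholder vertical angular resolution.

```python
import numpy as np
from sklearn.cluster import DBSCAN

V_RES = np.deg2rad(1.0)    # assumed vertical angular resolution of the laser radar
K_AMP = 1.1                # amplification factor k from formula (3)

def adaptive_cluster(points, band_width=10.0, min_samples=5):
    """Cluster ROI points band by band with a range-adaptive epsilon (formula (3))."""
    ranges = np.linalg.norm(points[:, [0, 2]], axis=1)   # range in the horizontal O-XZ plane
    labels = np.full(len(points), -1)
    next_label = 0
    for r0 in np.arange(0.0, ranges.max() + band_width, band_width):
        band = (ranges >= r0) & (ranges < r0 + band_width)
        if band.sum() < min_samples:
            continue
        h = (r0 + band_width) * np.tan(V_RES)            # scan-line spacing at this range
        eps = K_AMP * h                                  # epsilon_th = k * h
        band_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points[band])
        band_labels[band_labels >= 0] += next_label      # keep cluster ids unique across bands
        labels[band] = band_labels
        next_label = labels.max() + 1
    return labels
```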
C. A yolov5 model is constructed with a convolutional attention module and trained on a road vehicle image data set; the trained yolov5 model processes each frame of image acquired by the camera and outputs the detection frame of each target together with its coordinates, size, category and confidence, yielding the image information set with detection frames. The convolutional attention module consists of a channel attention module and a spatial attention module: the channel attention module computes an attention map over the channel dimension, which is multiplied with the feature map before being fed into the spatial attention module; the spatial attention module then computes an attention map over the height and width dimensions, which is multiplied with its input before being output, guiding the network to focus correctly on learning the important features. For the data set, a portion of pictures close to the target scenes is selected from public data sets, the categories are modified and unneeded targets are removed, and the remaining part of the data set is collected and labeled by the authors; the data set contains 6000 images in total, with a training-to-validation ratio of 5:1. FIG. 3 shows the detection effect of the improved yolov5 output.
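The patent does not specify an inference interface; assuming the CBAM-modified network can still be loaded through the standard ultralytics/yolov5 hub entry point (the weight file name below is a placeholder), per-frame detection could look roughly like this:

```python
import torch

# 'cbam_yolov5.pt' is a placeholder for the weights trained on the 6000-image data set.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='cbam_yolov5.pt')
model.conf = 0.25                                  # confidence threshold (assumed value)

def detect_frame(image_bgr):
    """Return a list of (xc, yc, w, h, cls, conf) tuples for one camera frame."""
    results = model(image_bgr[..., ::-1])          # hub models expect RGB input
    detections = []
    for xc, yc, w, h, conf, cls in results.xywh[0].tolist():
        detections.append((xc, yc, w, h, int(cls), conf))
    return detections
```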
D. Performing space-time synchronization on information processed by the laser radar and the camera, including:
D1. The laser radar signal is used as the reference of the registration frequency, and the timestamps of the laser radar and the camera are aligned by interpolation, obtaining the point cloud of the laser radar and the image of the camera at the same moment. For example, to obtain the camera target information at the 100 ms instant, the corresponding data are interpolated from the information acquired by the camera at 67 ms and 133 ms through formula (4), as shown in FIG. 4:
x_j = x_i + (x_{i+1} − x_i)·(t_j − t_i)/(t_{i+1} − t_i) (4)
In formula (4), t_i is the time before interpolation, t_{i+1} is the time after interpolation, t_j is the interpolation time, x_i is the x-axis coordinate at the time before interpolation, x_{i+1} is the x-axis coordinate at the time after interpolation, and x_j is the x-axis coordinate at the interpolation time. When interpolation is used, the interval between the chosen interpolation time and the data frames before and after it must not exceed the camera sampling period of 67 ms; interpolation instants exceeding the camera sampling period are considered invalid and removed;
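A minimal sketch of formula (4) with the 67 ms validity check from D1 (function and parameter names are illustrative):

```python
def interpolate_to_lidar_time(t_i, x_i, t_i1, x_i1, t_j, max_gap=0.067):
    """Linearly interpolate one camera coordinate to the lidar timestamp t_j (formula (4))."""
    # The interpolation instant must lie within one camera sampling period (~67 ms)
    # of both neighbouring camera frames; otherwise it is treated as invalid.
    if (t_j - t_i) > max_gap or (t_i1 - t_j) > max_gap:
        return None
    return x_i + (x_i1 - x_i) * (t_j - t_i) / (t_i1 - t_i)
```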
D2. With the laser radar signal as the reference of the registration frequency and the timestamps aligned by interpolation as above, the point cloud information set of the laser radar and the image information set of the camera at the same moment are obtained. The camera intrinsic parameters are then calibrated with an automatic calibration method based on Zhang Zhengyou's method, and the extrinsic matrix between the radar and the camera is acquired with the Calibration Toolkit derived from that method; FIG. 5 shows the joint calibration effect of the laser radar and the camera. The two-dimensional detection frame in the laser radar coordinate system is thereby projected into the pixel coordinate system, yielding the projected two-dimensional detection frame b^q = (x_b^q, y_b^q, w_b^q, h_b^q), where x_b^q and y_b^q are the x- and y-axis coordinates of the center point of the q-th projected two-dimensional detection frame and w_b^q and h_b^q are its width and height.
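A sketch of projecting laser radar points into the pixel frame with the calibrated intrinsics K and extrinsics (R, t); the row-vector convention and the corner-based box projection are assumptions, not the patent's exact procedure:

```python
import numpy as np

def lidar_to_pixels(points_lidar, K, R, t):
    """Project (N, 3) lidar-frame points into pixel coordinates; returns (uv, depth)."""
    pts_cam = points_lidar @ R.T + t              # lidar frame -> camera frame
    uvw = pts_cam @ K.T                           # camera frame -> homogeneous pixels
    uv = uvw[:, :2] / uvw[:, 2:3]                 # perspective division
    return uv, pts_cam[:, 2]

def project_box(near_face_corners_lidar, K, R, t):
    """Project the 4 corners of a cluster's near face and return (xc, yc, w, h) in pixels."""
    uv, _ = lidar_to_pixels(near_face_corners_lidar, K, R, t)
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    return ((u_min + u_max) / 2, (v_min + v_max) / 2, u_max - u_min, v_max - v_min)
```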
E. Carrying out data association on the information after time-space synchronization to obtain an association pair:
E1. Set the association threshold r_th. Considering that too large a threshold leads to complicated matching situations and reduces algorithm accuracy, while too small a threshold leads to matching failure, a circular gating threshold of r_th = 0.5 m is set. Define a variable i to denote the frame number of the laser radar after time synchronization with the camera, a variable j to denote the index of the current target contained in the laser radar point cloud observation of the i-th frame, and a variable k to denote the index of the current target contained in the camera image observation of the i-th frame; initialize i = 1;
E2. Initialize j = 1; take the coordinate and size information of the j-th projected two-dimensional detection frame in the point cloud data set of the i-th laser radar frame as the j-th radar target observation L_i^j of the i-th frame, and take the three-dimensional detection frame corresponding to L_i^j as the j-th clustered three-dimensional detection frame B_i^j of the i-th frame;
E3. Initialize k = 1; take the coordinates, size, category and confidence information of the k-th detection frame in the image information set of the i-th camera frame as the k-th camera target observation C_i^k of the i-th frame, where x_i^k and y_i^k denote the x- and y-axis coordinates of the center point of the k-th detection frame, w_i^k and h_i^k its width and height, cls_i^k the category of the detected target, and conf_i^k its confidence;
E4. Calculate the Euclidean distance d_jk between the j-th laser radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame. E5. Judge whether d_jk ≤ r_th; if so, the detection target of the laser radar is successfully matched with the detection target of the camera, and L_i^j and C_i^k form an association pair; otherwise the matching fails. FIG. 6 shows the association situations that may occur when targets remain unmatched, and the decision method in that case is as follows. For targets detected by the radar but not by vision: because the region of interest has already been extracted from the laser radar data, differences in viewing angle can be ignored; the likely causes are object categories the vision model was not trained on (such as animals or traffic cones on the road) or poor lighting conditions such as dusk. Since such objects may affect the safe driving of the vehicle, the radar result is retained: if the Euclidean distance d_jk between the j-th radar target observation L_i^j of the i-th frame and every camera target observation C_i^k of the i-th frame is greater than r_th, the j-th radar target observation L_i^j is output and tracked; if the corresponding radar target observation L_{i+1}^j is detected in the (i+1)-th frame and its Euclidean distance to the k-th camera target observation C_{i+1}^k of the (i+1)-th frame satisfies d_jk ≤ r_th, the target of the j-th radar target observation L_i^j is considered to have been successfully detected. For targets detected by vision but not by the radar: the likely cause is that the target is too far away for the clustering accuracy of the laser radar, and in these cases the visually recognized target is discarded; in addition, the camera field of view is wider than the radar region of interest and may contain targets such as pedestrians on the road edge that do not affect the safe driving of the vehicle and are ignored. For targets detected by both vision and radar in which the radar algorithm cannot separate targets that are too close together (typically a pedestrian standing close to a vehicle), the radar detection result is retained. As shown in FIG. 7, L denotes the laser radar detection results and C the camera detection results: L1 and C1 are successfully paired, L2 is retained, and C2 is ignored.
E6. Assign k+1 to k and return to step E4 until all camera target observations of the i-th frame have been traversed; then assign j+1 to j and return to step E3 until all targets of the i-th frame have been traversed;
E7. Calculate the intersection-over-union IOU_jk of the association pair formed by the j-th radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame, and compare it with the set threshold IOU_th; IOU_th = 0.7 is selected through testing on examples. If IOU_jk ≥ IOU_th, the corresponding association pair of the i-th frame is correct and is output, otherwise the association pair is discarded; return to E7 to calculate the next association pair of the i-th frame until all correct association pairs of the i-th frame have been output;
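A simplified sketch of the association in E1–E7 follows: greedy nearest-center gating followed by the IOU check. The coordinate frame used for the distance test and the handling of unmatched radar targets are assumptions, and the sub-optimal re-association described in the beneficial effects is omitted for brevity.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (xc, yc, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def associate(radar_boxes, camera_boxes, r_th=0.5, iou_th=0.7):
    """Pair projected radar boxes with camera boxes by center distance, then verify with IoU."""
    pairs, unmatched_radar = [], []
    for j, rb in enumerate(radar_boxes):
        best_k, best_d = None, float("inf")
        for k, cb in enumerate(camera_boxes):
            d = ((rb[0] - cb[0]) ** 2 + (rb[1] - cb[1]) ** 2) ** 0.5
            if d < r_th and d < best_d:
                best_k, best_d = k, d
        if best_k is not None and iou(rb, camera_boxes[best_k]) > iou_th:
            pairs.append((j, best_k))
        else:
            unmatched_radar.append(j)       # kept and tracked per the E5 decision rule
    return pairs, unmatched_radar
```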
F. according to the characteristics of the output data of different sensors, data fusion is performed on all correct association pairs in the ith frame to obtain target detection information after the fusion of the ith frame, as shown in fig. 8, including:
F1. Since the laser radar can output the depth information of the target and the camera can output the category and confidence of the object, if the m-th radar target observation L_i^m and the n-th camera target observation C_i^n of the i-th frame form an association pair, the x-axis coordinate x_i^m, y-axis coordinate y_i^m, z-axis coordinate z_i^m, length l_i^m and width w_i^m of the three-dimensional detection frame corresponding to the m-th radar target observation of the i-th frame, together with the category cls_i^n and confidence conf_i^n of the n-th camera target observation, are taken directly as the fused partial target detection information of the association pair;
F2. When the laser radar detects a target, the laser scan lines across the target's height become sparse as the target gets farther away, so height information is lost. The n-th camera target observation frame is therefore converted into the radar coordinate system using the camera intrinsic and extrinsic parameters calibrated in step D, so that the projection h_i^n of the n-th camera target observation height in the radar coordinate system is obtained and output as the fused target detection height compensation information of the association pair, completing the fused target detection data;
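A sketch of assembling one fused target per step F; the dictionary layout is illustrative, and the pinhole relation height = h_pixels·depth/f_y is an assumed concrete form of the height compensation described in F2.

```python
import numpy as np

def fuse_pair(radar_obs, camera_obs, K, R, t):
    """Build one fused target from an associated (radar, camera) pair.

    radar_obs:  dict with x, y, z, length, width from the clustered 3-D detection frame
    camera_obs: dict with the pixel box (xc, yc, w, h) plus cls and conf
    """
    fused = {
        "x": radar_obs["x"], "y": radar_obs["y"], "z": radar_obs["z"],
        "length": radar_obs["length"], "width": radar_obs["width"],
        "cls": camera_obs["cls"], "conf": camera_obs["conf"],
    }
    # Height compensation: back-project the camera box height into the radar frame,
    # using the lidar-provided depth of the target as the projection scale.
    center_cam = np.array([radar_obs["x"], radar_obs["y"], radar_obs["z"]]) @ R.T + t
    depth = center_cam[2]                       # camera-frame depth of the target
    fy = K[1, 1]                                # vertical focal length in pixels
    fused["height"] = camera_obs["h"] * depth / fy
    return fused
```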
G. The method of the invention selects an extended Kalman filter (EKF) to track the fused targets.
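The patent names an EKF but does not give its motion or measurement models; a minimal constant-velocity Kalman sketch (to which an EKF reduces when the models are linear) is shown below, with all noise settings assumed.

```python
import numpy as np

class TargetTracker:
    """Constant-velocity tracker for one fused target; state is (x, z, vx, vz)."""

    def __init__(self, x0, z0, dt=0.1):
        self.s = np.array([x0, z0, 0.0, 0.0])            # initial state
        self.P = np.eye(4)                               # state covariance
        self.F = np.eye(4)                               # constant-velocity motion model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                            # we measure position only
        self.Q = np.eye(4) * 0.01                        # process noise (assumed)
        self.R = np.eye(2) * 0.1                         # measurement noise (assumed)

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]

    def update(self, measured_xz):
        y = np.asarray(measured_xz) - self.H @ self.s    # innovation
        S = self.H @ self.P @ self.H.T + self.R
        gain = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.s = self.s + gain @ y
        self.P = (np.eye(4) - gain @ self.H) @ self.P
        return self.s
```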

Claims (2)

1. The target detection method based on the fusion of the laser radar and the machine vision is characterized by comprising the following steps of:
A. A solid-state laser radar is arranged at the front bumper of the vehicle and a camera at the front windshield of the vehicle; the advancing direction of the vehicle is taken as the Z axis, the direction pointing to the driver's left as the X axis, and the vertically upward direction as the Y axis; the laser emission center of the laser radar is taken as the origin O_l to establish the laser radar coordinate system O_l-X_lY_lZ_l, and the optical center of the camera is taken as the origin O_c to establish the camera coordinate system O_c-X_cY_cZ_c; the O-XZ planes of the two coordinate systems are kept parallel to the ground;
B. Processing each frame of point cloud information acquired by the laser radar comprises: first, performing ground point cloud segmentation on the point cloud by a multi-plane fitting method, extracting road edge points from the segmentation result, and sequentially performing curve fitting, filtering and down-sampling on the extracted road edge points to obtain the region of interest of each frame; then clustering the point cloud within the region of interest to obtain the targets of each frame, and identifying each clustered target with a three-dimensional detection frame. The q-th clustered target of the p-th frame is marked with the q-th three-dimensional detection frame B_p^q = (x_p^q, y_p^q, z_p^q, w_p^q, l_p^q, h_p^q), where x_p^q, y_p^q and z_p^q denote the x-, y- and z-axis coordinates of the center point of the q-th three-dimensional detection frame in the p-th frame, and w_p^q, l_p^q and h_p^q denote its width, length and height; the face of the three-dimensional detection frame closest to the laser radar is selected as a two-dimensional detection frame to characterize the q-th clustered target of the p-th frame, thereby obtaining a point cloud data set with detection frames;
C. constructing a yolov5 model by adopting a convolution attention module, training the yolov5 model by utilizing a road vehicle image data set to obtain a trained yolov5 model, processing each frame of image information acquired by the camera by utilizing the trained yolov5 model, and outputting a detection frame of each target in each frame of image information and coordinate, size, category and confidence information thereof, thereby obtaining an image information set with the detection frame;
D. Performing space-time synchronization on the point cloud data set and the image information set, including: the laser radar signal is taken as the reference of the registration frequency, and the timestamps of the laser radar and the camera are aligned by interpolation, so that the point cloud information set of the laser radar and the image information set of the camera at the same moment are obtained; the camera is then calibrated to obtain its intrinsic parameters, and the camera and the laser radar are jointly calibrated to obtain the extrinsic parameters, so that the two-dimensional detection frame in the laser radar coordinate system is projected into the pixel coordinate system, obtaining the projected two-dimensional detection frame b^q = (x_b^q, y_b^q, w_b^q, h_b^q), where x_b^q and y_b^q denote the x- and y-axis coordinates of the center point of the q-th projected two-dimensional detection frame, and w_b^q and h_b^q denote its width and height;
E. carrying out data association on the information after time-space synchronization to obtain an association pair:
E1. Set the association threshold to r_th; define a variable i to denote the frame number of the laser radar after time synchronization with the camera, a variable j to denote the index of the current target contained in the laser radar point cloud observation of the i-th frame, and a variable k to denote the index of the current target contained in the camera image observation of the i-th frame; initialize i = 1;
E2. Initialize j = 1; take the coordinate and size information of the j-th projected two-dimensional detection frame in the point cloud data set of the i-th laser radar frame as the j-th radar target observation L_i^j of the i-th frame, and take the three-dimensional detection frame corresponding to L_i^j as the j-th clustered three-dimensional detection frame B_i^j of the i-th frame;
E3. Initialize k = 1; take the coordinates, size, category and confidence information of the k-th detection frame in the image information set of the i-th camera frame as the k-th camera target observation C_i^k of the i-th frame, where x_i^k and y_i^k denote the x- and y-axis coordinates of the center point of the k-th detection frame, w_i^k and h_i^k its width and height, cls_i^k the category of the detected target, and conf_i^k its confidence;
E4. Calculate the Euclidean distance d_jk between the j-th laser radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame;
E5. Judge whether d_jk ≤ r_th; if so, the detection target of the laser radar is successfully matched with the detection target of the camera, and the j-th radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame form an association pair; otherwise the matching fails;
E6. Assign k+1 to k and return to step E3 until all camera target observations of the i-th frame have been traversed; then assign j+1 to j and return to step E2 until all targets of the i-th frame have been traversed;
E7. Calculate the intersection-over-union IOU_jk of the association pair formed by the j-th radar target observation L_i^j and the k-th camera target observation C_i^k of the i-th frame, and compare it with the set threshold IOU_th; if IOU_jk ≥ IOU_th, the corresponding association pair of the i-th frame is correct and is output, otherwise the association pair is discarded; return to E7 to calculate the next association pair of the i-th frame until all correct association pairs of the i-th frame have been output;
F. Data fusion is performed on all correct association pairs of the i-th frame to obtain the fused target detection information of the i-th frame, including: if the m-th radar target observation L_i^m and the n-th camera target observation C_i^n of the i-th frame form an association pair, the x-axis coordinate x_i^m, y-axis coordinate y_i^m, z-axis coordinate z_i^m, length l_i^m and width w_i^m of the three-dimensional detection frame corresponding to the m-th radar target observation of the i-th frame, together with the category cls_i^n and confidence conf_i^n of the n-th camera target observation, are taken directly as the fused partial target detection information of the association pair; the n-th camera target observation frame is then converted into the radar coordinate system using the camera intrinsic and extrinsic parameters obtained in step D, so that the projection h_i^n of the n-th camera target observation height in the radar coordinate system is obtained and used as the fused target detection height compensation information of the association pair; the fused partial target detection information and the target detection height compensation information together constitute the fused target detection information;
G. tracking each target in the target detection information fused in the ith frame and outputting a target detection result.
2. The target detection method based on laser radar and machine vision fusion according to claim 1, characterized in that in E5, if the Euclidean distance d_jk between the j-th radar target observation L_i^j of the i-th frame and every camera target observation C_i^k of the i-th frame is greater than r_th, the j-th radar target observation L_i^j of the i-th frame is output and tracked;
if the corresponding radar target observation L_{i+1}^j is detected in the (i+1)-th frame and the Euclidean distance between L_{i+1}^j and the k-th camera target observation C_{i+1}^k of the (i+1)-th frame satisfies d_jk ≤ r_th, the target of the j-th radar target observation L_i^j is considered to have been successfully detected.
CN202210630026.9A 2022-06-06 2022-06-06 Target detection method based on laser radar and machine vision fusion Active CN115032651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210630026.9A CN115032651B (en) 2022-06-06 2022-06-06 Target detection method based on laser radar and machine vision fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210630026.9A CN115032651B (en) 2022-06-06 2022-06-06 Target detection method based on laser radar and machine vision fusion

Publications (2)

Publication Number Publication Date
CN115032651A CN115032651A (en) 2022-09-09
CN115032651B true CN115032651B (en) 2024-04-09

Family

ID=83123484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210630026.9A Active CN115032651B (en) 2022-06-06 2022-06-06 Target detection method based on laser radar and machine vision fusion

Country Status (1)

Country Link
CN (1) CN115032651B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114137562B (en) * 2021-11-30 2024-04-12 合肥工业大学智能制造技术研究院 Multi-target tracking method based on improved global nearest neighbor
CN115184917B (en) * 2022-09-13 2023-03-10 湖南华诺星空电子技术有限公司 Regional target tracking method integrating millimeter wave radar and camera
CN115236656B (en) * 2022-09-22 2022-12-06 中国电子科技集团公司第十研究所 Multi-source sensor target association method, equipment and medium for airplane obstacle avoidance
CN115571290B (en) * 2022-11-09 2023-06-13 传仁信息科技(南京)有限公司 Automatic ship draft detection system and method
CN115598656B (en) * 2022-12-14 2023-06-09 成都运达科技股份有限公司 Obstacle detection method, device and system based on suspension track
CN116363623B (en) * 2023-01-28 2023-10-20 苏州飞搜科技有限公司 Vehicle detection method based on millimeter wave radar and vision fusion
CN116030200B (en) * 2023-03-27 2023-06-13 武汉零点视觉数字科技有限公司 Scene reconstruction method and device based on visual fusion
CN116304992A (en) * 2023-05-22 2023-06-23 智道网联科技(北京)有限公司 Sensor time difference determining method, device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111352112A (en) * 2020-05-08 2020-06-30 泉州装备制造研究所 Target detection method based on vision, laser radar and millimeter wave radar
CN114137562A (en) * 2021-11-30 2022-03-04 合肥工业大学智能制造技术研究院 Multi-target tracking method based on improved global nearest neighbor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11393097B2 (en) * 2019-01-08 2022-07-19 Qualcomm Incorporated Using light detection and ranging (LIDAR) to train camera and imaging radar deep learning networks

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111352112A (en) * 2020-05-08 2020-06-30 泉州装备制造研究所 Target detection method based on vision, laser radar and millimeter wave radar
CN114137562A (en) * 2021-11-30 2022-03-04 合肥工业大学智能制造技术研究院 Multi-target tracking method based on improved global nearest neighbor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on Unmanned Surface Vehicles Environment Perception Based on the Fusion of Vision and Lidar; Wei Zhang et al.; IEEE Access; 2021-05-03; Vol. 9; 63107-63121 *
Real-time Target Recognition of Urban Autonomous Vehicles Based on Information Fusion; 薛培林; 吴愿; 殷国栋; 刘帅鹏; 林乙蘅; 黄文涵; 张云; 机械工程学报 (Journal of Mechanical Engineering); 2020-12-31 (12); 183-191 *
Research on Vehicle Detection Technology Based on the Fusion of Millimeter-Wave Radar and Machine Vision; 宋伟杰; China Master's Theses Full-text Database, Engineering Science and Technology II (monthly); 2021-02-15; 26-66 *

Also Published As

Publication number Publication date
CN115032651A (en) 2022-09-09

Similar Documents

Publication Publication Date Title
CN115032651B (en) Target detection method based on laser radar and machine vision fusion
CN110942449B (en) Vehicle detection method based on laser and vision fusion
CN110032949B (en) Target detection and positioning method based on lightweight convolutional neural network
CN111882612B (en) Vehicle multi-scale positioning method based on three-dimensional laser detection lane line
CN110859044B (en) Integrated sensor calibration in natural scenes
CN110531376B (en) Obstacle detection and tracking method for port unmanned vehicle
CN111060924B (en) SLAM and target tracking method
CN110569704A (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN102194239B (en) For the treatment of the method and system of view data
CN110197173B (en) Road edge detection method based on binocular vision
CN113516664A (en) Visual SLAM method based on semantic segmentation dynamic points
CN110992424B (en) Positioning method and system based on binocular vision
Huang et al. Tightly-coupled LIDAR and computer vision integration for vehicle detection
CN115468567A (en) Cross-country environment-oriented laser vision fusion SLAM method
CN111723778B (en) Vehicle distance measuring system and method based on MobileNet-SSD
CN113920183A (en) Monocular vision-based vehicle front obstacle distance measurement method
CN112150448A (en) Image processing method, device and equipment and storage medium
CN114200442A (en) Road target detection and correlation method based on millimeter wave radar and vision
Ortigosa et al. Obstacle-free pathway detection by means of depth maps
CN110864670B (en) Method and system for acquiring position of target obstacle
CN111539278A (en) Detection method and system for target vehicle
WO2020113425A1 (en) Systems and methods for constructing high-definition map
CN113706599B (en) Binocular depth estimation method based on pseudo label fusion
Pfeiffer et al. Ground truth evaluation of the Stixel representation using laser scanners
Huang et al. A coarse-to-fine LiDar-based SLAM with dynamic object removal in dense urban areas

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant