CN115909274A - Automatic driving-oriented dynamic obstacle detection method


Info

Publication number
CN115909274A
Authority
CN
China
Prior art keywords
obstacle
coordinates
image
dimensional
data
Prior art date
Legal status
Pending
Application number
CN202211502567.XA
Other languages
Chinese (zh)
Inventor
邓天民
吴勇军
杨令
张曦月
Current Assignee
Chongqing Jiaotong University
Original Assignee
Chongqing Jiaotong University
Priority date
Filing date
Publication date
Application filed by Chongqing Jiaotong University
Priority to CN202211502567.XA
Publication of CN115909274A
Legal status: Pending (current)

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Optical Radar Systems And Details Thereof (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a dynamic obstacle detection method for automatic driving, and belongs to the technical field of automatic driving. The method combines two-dimensional image data and three-dimensional point cloud data and uses a deep learning algorithm to compare and fuse the detection results of the two kinds of data, thereby realizing three-dimensional obstacle detection. A semantic segmentation network processes the two-dimensional image acquired by the camera, extracts lane and obstacle data, and extracts the relative position of each obstacle in the image; the image coordinates are then converted into world coordinates. In parallel, a three-dimensional (3D) target detection network detects obstacles in the point cloud data, extracts the relative position of each obstacle with respect to the laser radar, and calculates the position of each obstacle in laser scanner coordinates. Finally, the laser scanner coordinates are rotated and translated, compared with the world coordinates, and the detection results of the two kinds of data are matched. The invention can reduce redundant data, shorten detection time and improve detection precision.

Description

Automatic driving-oriented dynamic obstacle detection method
Technical Field
The invention belongs to the technical field of automatic driving, and relates to a dynamic obstacle detection method for automatic driving.
Background
In the field of automatic driving technology, road obstacle detection is a very important technology. An intelligent vehicle generally senses the environment around the vehicle through vehicle-mounted sensors such as a camera, radar (for example laser radar or millimeter-wave radar), a GPS (global positioning system) and an IMU (inertial measurement unit), and then detects obstacles on the driving path with a corresponding obstacle detection algorithm, so that the vehicle can be controlled to take measures to avoid them. When a laser radar is used for obstacle detection, the laser radar emits laser beams and constructs a point cloud from the beams reflected off object surfaces, and obstacles in the environment are identified from the point cloud; when a camera is used, the camera collects an environment image and obstacles in it are identified through image recognition technology.
However, obstacle detection is limited by the operating principles of the laser radar and the camera: the laser radar can only cover a limited detection range, while the camera produces large errors in the estimated distance to an obstacle, both of which affect detection accuracy.
Regarding detection from the acquired data, the three-dimensional point cloud data acquired by the laser radar is highly accurate, but the point cloud is sparse, and it is difficult to identify three-dimensional targets from a sparse point cloud. Two-dimensional image data acquired by a monocular camera, combined with a deep learning algorithm, can achieve accurate target recognition, but because a two-dimensional image lacks three-dimensional position information, targets cannot be localized in three-dimensional space. Images acquired by a binocular camera can be used to compute a dense three-dimensional point cloud from parallax and thereby identify and localize targets in three-dimensional space, but because of the complexity of the parallax calculation, existing parallax algorithms are either too slow or insufficiently accurate. Even after the point cloud is classified and clustered, the regressed bounding box remains inaccurate because of the sparsity of the point cloud. Therefore, combining two-dimensional image information with three-dimensional point cloud information for obstacle detection is beneficial for realizing automatic obstacle avoidance by intelligent vehicles.
Disclosure of Invention
In view of this, the present invention provides a dynamic obstacle detection method that matches the detection results obtained from the two-dimensional image data of a camera with those obtained from the three-dimensional point cloud data of a laser radar, so as to reduce redundant data, shorten detection time and improve detection accuracy.
In order to achieve the purpose, the invention provides the following technical scheme:
a dynamic obstacle detection method for automatic driving combines two-dimensional image data and three-dimensional point cloud data, and utilizes a deep learning algorithm to compare and fuse detection results of the two-dimensional image data and the three-dimensional point cloud data so as to realize obstacle detection, and the method specifically comprises the following steps:
s1, collecting a two-dimensional image and a three-dimensional point cloud image of a road;
S2, processing the two-dimensional image with a semantic segmentation network, extracting lane data, obstacle data and the relative position of the obstacle in the image, and converting the image coordinates into world coordinates;
S3, detecting obstacles in the three-dimensional point cloud image with a 3D target detection network, and calculating the position of each obstacle in laser scanner coordinates to obtain its coordinate values;
S4, mapping the world coordinates obtained in step S2 and the coordinate values obtained in step S3 onto YOZ plane coordinates and matching them; if matching is successful, outputting the three-dimensional information of the obstacle.
Further, in step S2, the image processing procedure is: inputting the two-dimensional image into a convolutional neural network, extracting features through a convolutional layer and a pooling layer, and outputting a segmentation image with the same size as the original image through upsampling.
In step S2, the coordinate transformation process is specifically as follows:
1) The obstacle position is given in pixel coordinates on the image, so it is first converted into image coordinates, as shown in the following equation:

x = (u - u_0)·dx, y = (v - v_0)·dy

where u and v represent the coordinate values of the obstacle in pixel coordinates, u_0 and v_0 respectively represent the coordinate values of the picture center point in pixel coordinates, and dx and dy respectively represent the physical pixel sizes, in millimeters, in the horizontal and vertical directions;
2) The image coordinates are converted to camera coordinates, as shown in the following equation:

x = f·X_c / Z_c, y = f·Y_c / Z_c

where X_c, Y_c and Z_c respectively represent the coordinate values of the obstacle in camera coordinates, and f represents the camera focal length;
3) The camera coordinates are rotated and translated into world coordinates, as shown in the following equation:

[X_w, Y_w, Z_w]^T = R·[X_c, Y_c, Z_c]^T + T

where X_w, Y_w and Z_w respectively represent the coordinate values of the obstacle in world coordinates, R represents the rotation matrix, and T represents the translation matrix.
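For illustration only (not part of the original disclosure), the following Python/NumPy sketch chains the three conversions above. The calibration values, the assumed depth z_c along the optical axis, and the direction of the extrinsic transform (world = R·camera + T) are placeholder assumptions.

```python
import numpy as np

# Minimal sketch of the pixel -> image -> camera -> world conversion chain.
# Calibration values, the assumed depth z_c and the extrinsics are placeholders.

def pixel_to_image(u, v, u0, v0, dx, dy):
    """Pixel coordinates (u, v) -> image-plane coordinates (x, y), in millimeters."""
    return (u - u0) * dx, (v - v0) * dy

def image_to_camera(x, y, f, z_c):
    """Image-plane coordinates -> camera coordinates, given the depth z_c along the optical axis."""
    return np.array([x * z_c / f, y * z_c / f, z_c])

def camera_to_world(p_cam, R, T):
    """Rotate and translate a camera-frame point into world coordinates."""
    return R @ p_cam + T

# Placeholder calibration: image centre, pixel pitch (mm/pixel) and focal length (mm)
u0, v0, dx, dy, f = 640.0, 360.0, 0.005, 0.005, 4.0
R = np.eye(3)       # placeholder rotation between camera and world frames
T = np.zeros(3)     # placeholder translation

x, y = pixel_to_image(800, 400, u0, v0, dx, dy)
p_cam = image_to_camera(x, y, f, z_c=10000.0)   # assumed depth of 10 m, in mm
p_world = camera_to_world(p_cam, R, T)
print(p_world)
```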
Further, in step S3, the process of calculating the obstacle coordinate value is as follows:
1) Extracting the distance ρ between the obstacle and the laser radar and the included angles θ and φ, where ρ is the range of the obstacle and θ and φ are its polar and azimuth angles relative to the laser radar;
2) Calculating the coordinate values of the obstacle from the distance and the included angles, as shown in the following formula:

X_socs = ρ·sinθ·cosφ, Y_socs = ρ·sinθ·sinφ, Z_socs = ρ·cosθ

where X_socs, Y_socs and Z_socs respectively represent the coordinate values of the obstacle in the laser scanner coordinate system.
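As an illustrative sketch only, the range/angle-to-Cartesian conversion of step S3 could be written as below. The angle convention used here (θ measured from the Z axis, φ measured in the XY plane) is an assumption; the patent only names the three quantities.

```python
import numpy as np

# Sketch of converting the extracted range and angles into laser scanner coordinates.
def scanner_coordinates(rho, theta, phi):
    """Range rho and angles theta, phi -> Cartesian laser-scanner coordinates."""
    x = rho * np.sin(theta) * np.cos(phi)
    y = rho * np.sin(theta) * np.sin(phi)
    z = rho * np.cos(theta)
    return np.array([x, y, z])

# Example: an obstacle 15 m away, 80 degrees from the vertical, 10 degrees of azimuth
p_scanner = scanner_coordinates(15.0, np.deg2rad(80.0), np.deg2rad(10.0))
print(p_scanner)
```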
Further, in step S4, the three-dimensional obstacle information that is finally output is the three-dimensional obstacle information obtained by the 3D target detection network.
The invention has the beneficial effects that: according to the invention, the detection results of the two-dimensional image data and the three-dimensional point cloud data are compared and fused by adopting a deep learning algorithm, so that the three-dimensional obstacle detection is realized, the redundant data can be reduced, the detection time is shortened, and the detection precision is improved.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof.
Drawings
For a better understanding of the objects, aspects and advantages of the present invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a dynamic obstacle detection method;
FIG. 2 is a schematic diagram of coordinate transformation and contrast fusion.
Detailed Description
The following embodiments of the present invention are provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
As shown in FIG. 1 and FIG. 2, the dynamic obstacle detection method for automatic driving combines two-dimensional image data and three-dimensional point cloud data, and uses a deep learning algorithm to compare and fuse the detection results of the two kinds of data, thereby realizing three-dimensional obstacle detection. A semantic segmentation network processes the two-dimensional images collected by a camera, extracts lane data and obstacle data, and extracts the relative position of each obstacle in the image; the image coordinates are then converted into world coordinates. A 3D target detection network detects obstacles in the three-dimensional point cloud data acquired by the laser radar, extracts the relative position of each obstacle with respect to the laser radar, and calculates the position of each obstacle in laser scanner coordinates. Finally, the laser scanner coordinates are rotated and translated and compared with the world coordinates, the detection results of the two-dimensional image data and the three-dimensional point cloud data are matched, and if the matching is successful, the three-dimensional information of the obstacle obtained by the 3D target detection network is output.
For the acquisition and processing of two-dimensional image data, the specific process is as follows:
the method comprises the steps of firstly, acquiring an image right in front of a vehicle by using a camera, taking the image as input, and utilizing a semantic segmentation network based on deep learning to segment lanes and barriers. The input image passes through a convolutional neural network, the features of the input image are extracted through a convolutional layer and a pooling layer, then the up-sampling is carried out, a segmentation image with the same size as the original image is obtained, and the segmentation image is output.
When an obstacle is detected, its position in the image is extracted and coordinate conversion is then performed. The obstacle position is given in pixel coordinates on the image and needs to be converted into image coordinates, as shown in the following formula:

x = (u - u_0)·dx, y = (v - v_0)·dy

where u and v respectively represent the coordinate values of the obstacle in pixel coordinates, u_0 and v_0 respectively represent the coordinate values of the picture center point in pixel coordinates, and dx and dy respectively represent the physical pixel sizes, in millimeters, in the horizontal and vertical directions.
After the position of the obstacle in image coordinates is obtained, the image coordinates are converted into camera coordinates, as shown in the following formula:

x = f·X_c / Z_c, y = f·Y_c / Z_c

where X_c, Y_c and Z_c respectively represent the coordinate values of the obstacle in camera coordinates, and f represents the camera focal length. On this basis, the coordinate values of the obstacle in world coordinates are obtained after the camera coordinates are rotated and translated, as shown in the following formula:

[X_w, Y_w, Z_w, 1]^T = [R T; 0^T 1]·[X_c, Y_c, Z_c, 1]^T

where X_w, Y_w and Z_w respectively represent the coordinate values of the obstacle in world coordinates; R represents the 3 × 3 rotation matrix, obtained by multiplying the rotation matrices R1, R2 and R3 about the X, Y and Z axes; T represents the 3 × 1 translation matrix; and 0^T represents the 1 × 3 zero vector.
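For illustration only, the sketch below builds R as the product of rotations about the X, Y and Z axes and applies the rigid transform. The multiplication order (Rz·Ry·Rx) and the example angles and translation are assumptions; the patent only states that R is the product of R1, R2 and R3.

```python
import numpy as np

# Sketch of composing R from axis rotations and applying the camera -> world transform.
def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def camera_to_world(p_cam, angles_xyz, T):
    """Rotate and translate a camera-frame point into world coordinates."""
    ax, ay, az = angles_xyz
    R = rot_z(az) @ rot_y(ay) @ rot_x(ax)   # 3x3, product of the three axis rotations
    return R @ p_cam + T

# Example with placeholder extrinsics (angles in radians, translation in mm):
p_world = camera_to_world(np.array([500.0, -200.0, 10000.0]),
                          (0.0, 0.02, 0.0), np.array([0.0, 1500.0, 0.0]))
print(p_world)
```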
The acquisition and processing of the three-dimensional point cloud data are as follows:
A laser radar acquires point cloud data of the area directly in front of the vehicle, and a 3D target detection network processes the point cloud data and detects obstacles. For each detected obstacle, the distance ρ to the laser radar and the included angles θ and φ are extracted, where ρ is the range of the obstacle and θ and φ are its polar and azimuth angles relative to the laser radar.
After the distance and angle values are obtained, the coordinate values of the obstacle in laser scanner coordinates are calculated according to the following formula:

X_socs = ρ·sinθ·cosφ, Y_socs = ρ·sinθ·sinφ, Z_socs = ρ·cosθ

where X_socs, Y_socs and Z_socs respectively represent the coordinate values of the obstacle in the laser scanner coordinate system.
After the coordinate values of the obstacle in laser scanner coordinates are obtained, the obstacle information in laser scanner coordinates is mapped to YOZ plane coordinates, the obstacle information in world coordinates is likewise mapped to YOZ plane coordinates, and the detection results of the two kinds of data are matched. If matching is successful, the three-dimensional obstacle information obtained by the 3D target detection network is output.
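For illustration only, the following sketch shows one way the YOZ-plane comparison could be carried out: both detection sets are projected onto the (Y, Z) plane and paired by nearest neighbour under a distance threshold. The threshold value and the greedy pairing strategy are assumptions; the patent only states that the two results are mapped to YOZ coordinates and matched.

```python
import numpy as np

# Sketch of matching camera-branch and lidar-branch detections on the YOZ plane.
def match_on_yoz(world_pts, scanner_pts_in_world, threshold=0.5):
    """Return index pairs (i, j) of camera/lidar detections that agree on the YOZ plane."""
    matches = []
    used = set()
    for i, pw in enumerate(world_pts):
        yz_w = np.asarray(pw)[1:3]                 # keep only the Y and Z components
        best_j, best_d = None, threshold
        for j, ps in enumerate(scanner_pts_in_world):
            if j in used:
                continue
            d = np.linalg.norm(yz_w - np.asarray(ps)[1:3])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
            used.add(best_j)
    return matches

# Example with two camera detections and two lidar detections (metres):
cam = [np.array([12.0, 1.8, 0.4]), np.array([30.0, -2.1, 0.5])]
lidar = [np.array([29.5, -2.0, 0.45]), np.array([12.2, 1.75, 0.42])]
print(match_on_yoz(cam, lidar))   # -> [(0, 1), (1, 0)]
```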
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (5)

1. A dynamic obstacle detection method for automatic driving is characterized in that: the method combines two-dimensional image data and three-dimensional point cloud data, and utilizes a deep learning algorithm to compare and fuse detection results of the two-dimensional image data and the three-dimensional point cloud data so as to realize obstacle detection, and the method specifically comprises the following steps:
s1, collecting a two-dimensional image and a three-dimensional point cloud image of a road;
S2, processing the two-dimensional image with a semantic segmentation network, extracting lane data, obstacle data and the relative position of the obstacle in the image, and converting the image coordinates into world coordinates;
S3, detecting obstacles in the three-dimensional point cloud image with a 3D target detection network, and calculating the position of each obstacle in laser scanner coordinates to obtain its coordinate values;
S4, mapping the world coordinates obtained in step S2 and the coordinate values obtained in step S3 onto YOZ plane coordinates and matching them; if matching is successful, outputting the three-dimensional information of the obstacle.
2. The obstacle detection method according to claim 1, characterized in that: in step S2, the image processing process includes: inputting the two-dimensional image into a convolutional neural network, extracting features through convolutional and pooling layers, performing upsampling, and finally outputting a segmentation image of the same size as the original image.
3. The obstacle detection method according to claim 1, characterized in that: in step S2, the coordinate transformation process is specifically as follows:
1) The obstacle position is given in pixel coordinates on the image, so it is first converted into image coordinates, as shown in the following equation:

x = (u - u_0)·dx, y = (v - v_0)·dy

wherein u and v respectively represent the coordinate values of the obstacle in pixel coordinates, u_0 and v_0 respectively represent the coordinate values of the picture center point in pixel coordinates, and dx and dy respectively represent the physical pixel sizes, in millimeters, in the horizontal and vertical directions;
2) The image coordinates are converted to camera coordinates, as shown in the following equation:

x = f·X_c / Z_c, y = f·Y_c / Z_c

wherein X_c, Y_c and Z_c respectively represent the coordinate values of the obstacle in camera coordinates, and f represents the camera focal length;
3) The camera coordinates are rotated and translated into world coordinates as shown in the following equation:
[X_w, Y_w, Z_w, 1]^T = [R T; 0^T 1]·[X_c, Y_c, Z_c, 1]^T

wherein X_w, Y_w and Z_w respectively represent the coordinate values of the obstacle in world coordinates, R represents the rotation matrix, T represents the translation matrix, and 0^T represents the zero vector.
4. The obstacle detection method according to claim 1, characterized in that: in step S3, the process of calculating the obstacle coordinate values is as follows:
1) Extracting the distance ρ between the obstacle and the laser radar and the included angles θ and φ, where ρ is the range of the obstacle and θ and φ are its polar and azimuth angles relative to the laser radar;
2) Calculating the coordinate values of the obstacle from the distance and the included angles, as shown in the following formula:

X_socs = ρ·sinθ·cosφ, Y_socs = ρ·sinθ·sinφ, Z_socs = ρ·cosθ

wherein X_socs, Y_socs and Z_socs respectively represent the coordinate values of the obstacle in the laser scanner coordinate system.
5. The obstacle detection method according to claim 1, characterized in that: in step S4, the three-dimensional information of the obstacle is a distance and an included angle between the obstacle and the laser radar obtained by the 3D target detection network.
CN202211502567.XA 2022-11-28 2022-11-28 Automatic driving-oriented dynamic obstacle detection method Pending CN115909274A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211502567.XA CN115909274A (en) 2022-11-28 2022-11-28 Automatic driving-oriented dynamic obstacle detection method


Publications (1)

Publication Number Publication Date
CN115909274A true CN115909274A (en) 2023-04-04

Family

ID=86480235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211502567.XA Pending CN115909274A (en) 2022-11-28 2022-11-28 Automatic driving-oriented dynamic obstacle detection method

Country Status (1)

Country Link
CN (1) CN115909274A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination