CN111292369B - False point cloud data generation method of laser radar - Google Patents

False point cloud data generation method of laser radar

Info

Publication number
CN111292369B
Authority
CN
China
Prior art keywords
image
depth image
intermediate frame
pixel
color image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010163524.8A
Other languages
Chinese (zh)
Other versions
CN111292369A (en)
Inventor
薛松
王曙
李德祥
李文正
李乾
林春雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CRRC Qingdao Sifang Rolling Stock Research Institute Co Ltd
Original Assignee
CRRC Qingdao Sifang Rolling Stock Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CRRC Qingdao Sifang Rolling Stock Research Institute Co Ltd filed Critical CRRC Qingdao Sifang Rolling Stock Research Institute Co Ltd
Priority to CN202010163524.8A priority Critical patent/CN111292369B/en
Publication of CN111292369A publication Critical patent/CN111292369A/en
Application granted granted Critical
Publication of CN111292369B publication Critical patent/CN111292369B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/579Depth or shape recovery from multiple images from motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention relates to a false point cloud data generation method for a laser radar, which comprises the following steps: the processor selects a first color image, a second color image and a third color image; based on the time data, the two frames of sparse depth images preceding and following the time data of the second color image are acquired, namely a first sparse depth image and a second sparse depth image; an optical flow is determined according to the offset displacement of a selected first pixel point; an intermediate frame sparse depth image is obtained according to the optical flow, the first sparse depth image and the second sparse depth image; the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image are taken as inputs of a convolutional neural network, which infers and outputs an intermediate frame dense depth image; the second color image and the intermediate frame dense depth image are taken as inputs of a U-net network, which infers and outputs an intermediate frame fine depth image; and coordinate conversion is performed on the pixel points in the intermediate frame fine depth image to obtain pseudo point cloud data of the intermediate frame fine depth image.

Description

False point cloud data generation method of laser radar
Technical Field
The invention relates to the technical field of laser radars, in particular to a false point cloud data generation method of a laser radar.
Background
In recent years, the driverless car market has developed rapidly. Major autonomous driving research and development teams such as Google, Baidu and Uber all use lidar as one of their most important sensors, combining the lidar with a camera to accurately judge obstacles and other information during driverless operation.
However, lidar has the following drawbacks: (1) The data acquired by a lidar are very sparse; only about 4% of the pixels in the 2D depth map obtained by projecting the acquired point cloud data have pixel values. (2) The scanning frequency of lidar is low; the frequency of a typical lidar sensor is about 10 Hz. Even the 64-line three-dimensional lidar from Velodyne, which has the highest scanning frequency on the market, only reaches 15 Hz.
In contrast, the frequency of the camera typically combined with a lidar is above 120 Hz, so the camera frequency has to be reduced in order to synchronize the lidar with the camera. Reducing the camera frequency not only wastes a large amount of video information, but also makes it difficult to reach the minimum frequency of 25 Hz required for real-time monitoring in a driverless car. Therefore, increasing the point cloud data obtained by lidar scanning through interpolation, and thereby "raising" the effective frequency of the lidar, is the software-based solution provided by the invention.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a false point cloud data generation method for a laser radar, which solves the problem that the point cloud data produced by a lidar with too low a scanning frequency is coarse-grained and inaccurate in time and space.
In order to achieve the above object, the present invention provides a method for generating pseudo point cloud data of a laser radar, the method comprising:
the processor selects a first color image, a second color image and a third color image from the database, wherein the first, second and third color images are color images of three consecutive frames; based on the time data, the two frames of sparse depth images preceding and following the time data corresponding to the second color image are acquired, the two frames being respectively a first sparse depth image and a second sparse depth image;
establishing a corresponding relation between the first color image and the first sparse depth image and a corresponding relation between the third color image and the second sparse depth image;
scanning pixel points in the first color image and the third color image respectively, and determining the optical flow from the first color image to the second color image according to the offset displacement of the first pixel point in the selected first color image in the third color image; the first color image and the second color image comprise a plurality of pixel points; the pixel points have position information;
the processor performs intermediate frame prediction processing according to the optical flow, the pixel value of each pixel point in the first sparse depth image and the second sparse depth image to obtain an intermediate frame sparse depth image of an intermediate frame of the first sparse depth image and the second sparse depth image;
the processor takes the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image as inputs of a convolutional neural network, and derives and outputs an intermediate frame dense depth image of an intermediate frame through the convolutional neural network; the intermediate frame dense depth image has temporal data;
the processor takes the second color image and the intermediate frame dense depth image as the input of a U-net network, and deduces and outputs an intermediate frame fine depth image of an intermediate frame through the U-net network; the intermediate frame fine depth image comprises a plurality of pixel points;
and the processor performs coordinate conversion processing according to the pixel points in the intermediate frame fine depth image to obtain pseudo point cloud data of the intermediate frame fine depth image.
Preferably, the processor performs coordinate conversion processing according to pixel points in the intermediate frame fine depth image, and the obtaining of the point cloud data of the intermediate frame fine depth image specifically includes:
the processor obtains a coordinate corresponding relation according to a preset camera parameter and a world coordinate system, and obtains point cloud data of the intermediate frame fine depth image according to the coordinate corresponding relation and position information of pixel points in the intermediate frame fine depth image.
Preferably, before the processor selects the first color image, the second color image and the third color image from the database, the pseudo point cloud data generating method further includes:
the laser radar scans the region to be detected according to a preset scanning frequency to obtain three-dimensional radar echo original data, and the radar echo original data are sent to the processor; the radar echo raw data has a time sequence; the color camera performs image acquisition on the region to be detected according to a preset sampling frequency to obtain image data of multi-frame color images, and the image data are stored in the database; the sampling frequency is greater than the scanning frequency;
the processor sorts the three-dimensional radar echo original data according to the time sequence to obtain point cloud data, and stores the point cloud data into the database;
the processor performs mapping processing on the point cloud data to obtain two-dimensional image data;
the processor forms a frame of sparse depth image according to the two-dimensional image data obtained by single scanning and stores the sparse depth image into the database; the sparse depth image has corresponding temporal data.
Preferably, the scanning the pixel points in the first color image and the third color image respectively, and determining the optical flow from the first color image to the second color image according to the offset displacement of the first pixel point in the selected first color image in the third color image specifically includes:
the processor determines surrounding pixel points around the first pixel point according to the position information of the first pixel point and the preset maximum displacement, and forms a first pixel block according to the first pixel point and the surrounding pixel points; the first pixel block has size data;
the processor searches a second pixel point which is the same as the position information of a first pixel point in the first color image in the third color image, and performs pixel block matching processing according to the second pixel point and the size data to obtain a characteristic pixel block matched with the first pixel block;
the processor scans pixel points in the characteristic pixel block one by one, and obtains characteristic pixel points after the first pixel points are deviated according to the pixel values of the first pixel points;
the processor obtains an optical flow from the first color image to a second color image according to the position information of the first pixel point and the characteristic pixel point.
Preferably, the processor performs an intermediate frame prediction process according to the optical flow, the pixel value of each pixel point in the first sparse depth image and the second sparse depth image, and the intermediate frame sparse depth image of the intermediate frame of the first sparse depth image and the second sparse depth image is specifically obtained as follows:
the processor performs average calculation processing according to the pixel values of two pixel points with the same position information in the first sparse depth image and the second sparse depth image to obtain an average pixel value of the pixel points of the intermediate frame;
and the processor corrects the average pixel value according to the deviation value of each pixel point provided by the optical flow, and generates an intermediate frame sparse depth image of the intermediate frame according to a correction result.
According to the false point cloud data generation method for a laser radar provided by the embodiment of the invention, the optical flow of the color images corresponding to two consecutive frames of sparse depth maps is used in place of the optical flow of the sparse depth maps themselves, and the depth map of the intermediate frame between the two consecutive frames is obtained by interpolation. Interpolation can further be performed on the basis of an obtained intermediate frame and its preceding or following frame to obtain the depth map of an intermediate frame between intermediate frames. In this way, data equivalent to scanning at a higher frequency are obtained without changing the actual lidar scanning frequency, and the temporal and spatial granularity and precision of the point cloud data are improved.
Drawings
Fig. 1 is a schematic flow chart of a pseudo point cloud data generating method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a pseudo point cloud data generating method illustrated by an image according to an embodiment of the present invention;
FIG. 3 is a graph showing the comparison of the results of the first viewing angle provided by the embodiment of the invention;
FIG. 4 is a graph showing the comparison of the results of the second viewing angle provided by the embodiment of the present invention;
FIG. 5 is a schematic diagram showing a comparison of (a) a true intermediate frame color image, (b) a sparse depth image, (c) a dense depth image, and (d) a predicted intermediate frame accurate depth image, provided by an embodiment of the present invention;
fig. 6 is a schematic diagram of comparison of (a) a color image of a real intermediate frame, (b) an image of point cloud data of a real lidar, and (c) an image of predicted pseudo point cloud data, provided by an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
According to the false point cloud data generation method for a laser radar provided by the invention, the optical flow of the color images corresponding to two consecutive frames of sparse depth maps is used in place of the optical flow of the sparse depth maps themselves, and the depth map of the intermediate frame between the two consecutive frames is obtained by interpolation. Interpolation can further be performed on the basis of an obtained intermediate frame and its preceding or following frame to obtain the depth map of an intermediate frame between intermediate frames. In this way, data equivalent to scanning at a higher frequency are obtained without changing the actual lidar scanning frequency, and the temporal and spatial granularity and precision of the point cloud data are improved.
Fig. 1 is a schematic flow chart of a method for generating pseudo point cloud data according to an embodiment of the present invention, which illustrates a flow of generating pseudo point cloud data of a laser radar by processing point cloud data scanned by the laser radar and color images acquired by a camera. The following describes the technical scheme of the present invention in detail with reference to fig. 1.
To facilitate an understanding of the present invention, the depth image is first explained: a depth image is an image or image channel containing information about the distance from a viewpoint to the surfaces of objects in the scene. A depth map is similar to a grayscale image, except that each of its pixel values represents the actual distance from the sensor to the object. In this invention, "sparse" in the sparse depth map and "dense" in the dense depth map refer to the proportion of pixels in the depth image that have pixel values. Since the data acquired by the lidar are very sparse and unordered, the depth image generated by the processor from the lidar point cloud is sparse.
Step 101, a laser radar scans a region to be detected according to a preset scanning frequency to obtain three-dimensional radar echo original data, and the radar echo original data is sent to a processor; the color camera performs image acquisition on the region to be detected according to a preset sampling frequency to obtain image data of multi-frame color images, and the image data are stored in a database;
specifically, the radar echo raw data has a time sequence, and the sampling frequency is greater than the scanning frequency. The laser radar and the color camera perform scanning work and sampling work simultaneously, and the scanned areas are the same.
Step 102, the processor sorts the three-dimensional radar echo original data according to the time sequence to obtain point cloud data, and stores the point cloud data in a database;
specifically, the radar echo raw data may be regarded as three-dimensional point data, and the point cloud data may be regarded as a matrix sequence composed of a plurality of three-dimensional point data.
Step 103, the processor performs mapping processing on the point cloud data to obtain two-dimensional image data;
specifically, the mapping process is to convert the three-dimensional coordinates of the three-dimensional point cloud data into two-dimensional image coordinates.
Step 104, the processor forms a frame of sparse depth image according to the two-dimensional image data obtained by a single scan, and stores the sparse depth image into a database;
specifically, the sparse depth image has corresponding temporal data. The two-dimensional image data obtained by single scanning form a frame of sparse depth image, and the sparse depth image of each frame has corresponding time data.
To facilitate an understanding of the sparse depth image generation process, the following example is given. For example, 32 fixed laser transmitters are used to measure the target area, each fixed at a particular vertical angle, and the 32 laser transmitters fire once every interval Δt. The radar scans the target area by rotating at high speed, and each full 360° sweep of the lidar is referred to as one frame. The input data at time t are a series of three-dimensional radar echo raw data, and the points at time t are numbered 1-32 according to the vertical angle of the laser transmitter, from smallest to largest. Thus, for the nth point P_{n,t} at time t, its 4 adjacent points are defined as P_{n+1,t}, P_{n,t-1}, P_{n,t+1} and P_{n-1,t}. According to this principle, the three-dimensional point cloud data of one frame are mapped into two-dimensional image data, and the sparse depth map at time t is formed by taking the horizontal depth of the three-dimensional point cloud data as the pixel value.
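As an illustrative sketch of the projection described above (the vertical field of view, the image width and the function name are assumptions made for illustration, not values taken from the patent), one frame of (x, y, z) points can be mapped to a sparse two-dimensional depth image whose pixel values are the horizontal depth:

```python
import numpy as np

def point_cloud_to_sparse_depth(points, n_rows=32, n_cols=1024,
                                v_fov=(-24.8, 2.0)):
    """Project one frame of (N, 3) lidar points (x, y, z) onto a
    sparse 2D depth image.  Rows follow the vertical firing angle,
    columns follow the azimuth; empty cells stay 0 (no return)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    horiz_depth = np.sqrt(x ** 2 + y ** 2)      # horizontal depth used as pixel value
    azimuth = np.arctan2(y, x)                   # [-pi, pi]
    elevation = np.arctan2(z, horiz_depth)       # vertical angle

    v_min, v_max = np.radians(v_fov[0]), np.radians(v_fov[1])
    rows = ((elevation - v_min) / (v_max - v_min) * (n_rows - 1)).round().astype(int)
    cols = ((azimuth + np.pi) / (2 * np.pi) * (n_cols - 1)).round().astype(int)

    valid = (rows >= 0) & (rows < n_rows)
    depth_image = np.zeros((n_rows, n_cols), dtype=np.float32)
    depth_image[rows[valid], cols[valid]] = horiz_depth[valid]
    return depth_image
```

Cells that receive no laser return keep the value 0, which is precisely what makes the resulting depth image sparse.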
Step 105, the processor selects a first color image, a second color image and a third color image from the database;
specifically, the first color image, the second color image, and the third color image are color images of three consecutive frames.
Step 106, based on the time data, acquiring two frames of sparse depth images before and after the time data corresponding to the second color image, wherein the two frames of sparse depth images are a first sparse depth image and a second sparse depth image respectively;
specifically, because the frequencies of the laser radar and the camera are not consistent, the sparse depth image converted from the point cloud data generated by the laser radar and the time of the color image acquired by the camera are also not consistent. Here, a sparse depth image of a frame corresponding thereto is acquired based on time data of a color image, instead of a sparse depth image of time data corresponding thereto.
Step 107, establishing a corresponding relation between the first color image and the first sparse depth image, and a corresponding relation between the third color image and the second sparse depth image;
specifically, the establishment of the correspondence between the color images and the sparse depth images is equivalent to adjusting the time data of the sparse depth images according to the time data of the color images, so that each color image has a sparse depth image uniquely corresponding to the color image at the corresponding time.
Step 108, scanning the pixel points in the first color image and the third color image respectively, and determining the optical flow from the first color image to the second color image according to the offset displacement of the first pixel point in the selected first color image in the third color image;
specifically, the optical flow is the instantaneous speed of the pixel motion of a spatially moving object on the observation imaging plane, and can also be understood as the displacement of each pixel point (x, y) on the image during the image change process.
For example, if the position of a point A in the color image of frame t is (x1, y1), and the same point A is found in the color image of frame t+1 at position (x2, y2), then the motion of point A can be determined, and the optical flow is (Δu, Δv) = (x2, y2) − (x1, y1), where u and v are the velocity components of the optical flow along the X-axis and the Y-axis, respectively. Since there is currently no optical flow calculation method for depth maps, the optical flow obtained from the color images replaces the optical flow of the depth images in this embodiment.
The specific optical flow calculation method comprises the following steps:
the processor determines surrounding pixel points around the first pixel point according to the position information of the first pixel point and the preset maximum displacement, and forms a first pixel block according to the first pixel point and the surrounding pixel points; the processor searches a second pixel point which is the same as the position information of the first pixel point in the first color image in the third color image, and performs pixel block matching processing according to the second pixel point and the size data to obtain a characteristic pixel block matched with the first pixel block; the processor scans the pixel points in the characteristic pixel block one by one, and obtains characteristic pixel points after the first pixel point is deviated according to the pixel value of the first pixel point; the processor obtains an optical flow from the first color image to the second color image according to the position information of the first pixel point and the characteristic pixel point.
Wherein the first color image and the second color image comprise a plurality of pixel points; the pixel points have position information; the first pixel block has size data.
The above is a method in which pixel blocks are first matched and the pixel values of the pixel points in the feature pixel block are then traversed to obtain the optical flow. In practice, the method for obtaining the optical flow is not limited to this one.
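An illustrative sketch of the block-matching idea described above (the block size, the search radius and the sum-of-absolute-differences cost are assumptions made for illustration; the pixel is assumed to lie far enough from the image border):

```python
import numpy as np

def block_matching_flow(img_a, img_b, point, half_block=4, max_disp=8):
    """Estimate the displacement of one pixel from image A to image B.
    A block around the pixel in A is compared (sum of absolute
    differences) against candidate blocks around the same location
    in B within +/- max_disp pixels; the best match gives the flow."""
    y0, x0 = point
    block_a = img_a[y0 - half_block:y0 + half_block + 1,
                    x0 - half_block:x0 + half_block + 1].astype(np.float32)

    best_cost, best_flow = np.inf, (0, 0)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            y1, x1 = y0 + dy, x0 + dx
            block_b = img_b[y1 - half_block:y1 + half_block + 1,
                            x1 - half_block:x1 + half_block + 1].astype(np.float32)
            if block_b.shape != block_a.shape:
                continue                      # candidate falls outside the image
            cost = np.abs(block_a - block_b).sum()
            if cost < best_cost:
                best_cost, best_flow = cost, (dx, dy)
    return best_flow                          # (Δu, Δv) for this pixel
```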
Step 109, the processor performs an intermediate frame prediction process according to the optical flow, the pixel value of each pixel point in the first sparse depth image and the second sparse depth image, so as to obtain an intermediate frame sparse depth image of the intermediate frame of the first sparse depth image and the second sparse depth image;
specifically, a pixel refers to a smallest unit in an image represented by a sequence of numbers, typically displayed as a stain. The pixel value is a value given by a computer when the image is digitized and represents the average luminance information of a small square of the image.
The processor performs offset processing on the first sparse depth image and the second sparse depth image according to the offset value of each pixel point provided by the optical flow, and generates an intermediate frame sparse depth image of the intermediate frame according to the offset result.
In a preferred scheme, the processor performs average calculation processing according to the pixel values of two pixel points with the same position information in the first sparse depth image and the second sparse depth image to obtain an average pixel value of the pixel points of the intermediate frame; the processor corrects the average pixel value according to the deviation value of each pixel point provided by the optical flow, and generates an intermediate frame sparse depth image of the intermediate frame according to the correction result.
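An illustrative sketch of this preferred scheme (averaging the two sparse frames pixel by pixel and correcting the position by half of the optical flow; the half-flow shift and the handling of empty sparse pixels are assumptions made for illustration):

```python
import numpy as np

def predict_intermediate_sparse_depth(depth_prev, depth_next, flow):
    """Predict the sparse depth image of the intermediate frame.
    depth_prev/depth_next: (H, W) sparse depth maps (0 = no return).
    flow: (H, W, 2) per-pixel (du, dv) flow between the two color
    frames; each averaged pixel is shifted by half of that flow."""
    h, w = depth_prev.shape
    mid = np.zeros_like(depth_prev)

    for v in range(h):
        for u in range(w):
            a, b = depth_prev[v, u], depth_next[v, u]
            if a == 0 and b == 0:
                continue                      # no measurement at this pixel
            avg = (a + b) / 2 if (a > 0 and b > 0) else max(a, b)
            du, dv = flow[v, u]
            ut = int(round(u + du / 2))       # correction: move by half the flow
            vt = int(round(v + dv / 2))
            if 0 <= ut < w and 0 <= vt < h:
                mid[vt, ut] = avg
    return mid
```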
Step 110, the processor takes the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image as the input of a convolutional neural network, and deduces and outputs an intermediate frame dense depth image of an intermediate frame through the convolutional neural network;
specifically, the intermediate frame dense depth image has temporal data. The convolutional neural network (Convolutional Neural Networks, CNN) is a feed-forward neural network, and artificial neurons can respond to surrounding units and can perform large-scale image processing. The convolutional neural network includes a convolutional layer and a pooling layer. The basic structure includes two layers, one of which is a feature extraction layer, the input of each neuron is connected to a local receptive field of the previous layer, and the local features are extracted. Once the local feature is extracted, the positional relationship between the other features is also determined; and the second is a feature mapping layer, each calculation layer of the network consists of a plurality of feature maps, each feature map is a plane, weights of all neurons on the plane are equal, and the feature maps have displacement invariance.
The processor takes the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image as inputs of a convolutional neural network, the convolutional neural network respectively performs feature extraction and feature stitching on the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image, and finally outputs the intermediate frame dense depth image of the intermediate frame.
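An illustrative PyTorch-style sketch of this stage (the patent does not specify the network architecture; the layer sizes, the shared per-frame encoder and the use of PyTorch are assumptions made for illustration):

```python
import torch
import torch.nn as nn

class DenseDepthNet(nn.Module):
    """Fuse the previous, intermediate and next sparse depth maps
    into a dense depth map of the intermediate frame."""
    def __init__(self):
        super().__init__()
        # shared per-frame feature extractor
        self.encode = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        # fusion of the concatenated (stitched) per-frame features
        self.fuse = nn.Sequential(
            nn.Conv2d(96, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1))

    def forward(self, d_prev, d_mid, d_next):
        feats = [self.encode(d) for d in (d_prev, d_mid, d_next)]
        return self.fuse(torch.cat(feats, dim=1))   # dense intermediate depth

# usage: dense = DenseDepthNet()(sparse_prev, sparse_mid, sparse_next)
# where each input is a (B, 1, H, W) tensor
```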
Step 111, the processor uses the second color image and the intermediate frame dense depth image as the input of the U-net network, and deduces and outputs the intermediate frame fine depth image of the intermediate frame through the U-net network;
specifically, the intermediate frame fine depth image includes a plurality of pixel points. The U-net network is a variant of the full convolution network (Fully Convolutional Networks, FCN), in which deep information and shallow information are fused by corresponding pixel addition, and U-net is by stitching. FCNs differ from CNNs in that the last fully connected layer with CNNs is replaced by a convolutional layer. In order to combine local and global information, the U-net network in this embodiment contains a plurality of hopping connections. And obtaining an accurate depth map of the intermediate frame through the sparse depth map of the adjacent frame and the corresponding color image. The mid-frame accurate depth map is shown in the last column of fig. 3.
And step 112, the processor performs coordinate conversion processing according to the pixel points in the intermediate frame fine depth image to obtain pseudo point cloud data of the intermediate frame fine depth image.
Specifically, the processor obtains a coordinate corresponding relation according to a preset camera parameter and a world coordinate system, and obtains point cloud data of the intermediate frame fine depth image according to the coordinate corresponding relation and position information of pixel points in the intermediate frame fine depth image.
According to the pinhole camera model, the correspondence between a space point (x, y, z) and the pixel coordinates (u, v, d) of the depth image corresponding to it is obtained, where (u, v) is any coordinate point in the image coordinate system and d is the depth value. The correspondence is as follows:
z = d_t(u, v)
x = (u − c_u) · z / f_u
y = (v − c_v) · z / f_v
d_t(u, v) represents the depth value corresponding to the pixel point (u, v) in the depth map of the current frame. Because a single camera cannot determine world coordinates, two cameras are used in combination, and the space points, namely the pseudo point cloud data, are obtained from the intermediate frame accurate depth maps of the two cameras.
According to the preset camera parameters c_u, c_v, f_u and f_v of the two cameras, the processor converts the predicted accurate depth map into spatial point cloud coordinates, converting all pixel points in the accurate depth map into three-dimensional coordinates to obtain a group of points (x_i, y_i, z_i). This group of points is the pseudo point cloud data generated by the processing of the invention.
Here c_u and c_v are the principal point offsets and f_u and f_v are the focal lengths of the camera. The principal axis of the camera is the line that is perpendicular to the image plane and passes through the pinhole, and its intersection with the image plane is called the principal point. The principal point offset is the position of the principal point relative to the image plane.
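An illustrative sketch of the back-projection in step 112 for a single camera (the function name is an assumption made for illustration; the fusion of the two cameras' depth maps described above is omitted):

```python
import numpy as np

def depth_to_pseudo_point_cloud(depth, f_u, f_v, c_u, c_v):
    """Convert a (H, W) fine depth map into pseudo point cloud data
    using the pinhole model: z = d(u, v), x = (u - c_u) * z / f_u,
    y = (v - c_v) * z / f_v.  Pixels with depth 0 are skipped."""
    v_idx, u_idx = np.nonzero(depth)             # pixels that carry a depth value
    z = depth[v_idx, u_idx]
    x = (u_idx - c_u) * z / f_u
    y = (v_idx - c_v) * z / f_v
    return np.stack([x, y, z], axis=1)           # (N, 3) points (x_i, y_i, z_i)
```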
The pseudo point cloud data generation method will be described as an image with reference to fig. 2 to 6.
Fig. 2 is a schematic diagram of the pseudo point cloud data generating method illustrated with images, according to an embodiment of the present invention. The first row shows three consecutive frames of color images: the color image of frame t-1, frame t and frame t+1. The second row shows the sparse depth images of frames t-1 and t+1 corresponding to the color images of frames t-1 and t+1. The third row shows the sparse depth image of frame t obtained from the color image of frame t, the sparse depth images of frames t-1 and t+1, and the optical flow. The fourth row shows the pseudo point cloud data of frame t predicted by the pseudo point cloud data generation method of the invention.
According to the false point cloud data generation method of the laser radar, three color images of continuous frames and sparse depth images of two corresponding continuous frames are taken as inputs, an intermediate frame dense depth image is output, and then the processed intermediate frame dense depth image is converted into false point cloud data by using known parameters of a camera.
Fig. 3 is a comparison chart of the results from the first viewing angle provided by the embodiment of the invention. Fig. 4 is a comparison chart of the results from the second viewing angle provided by the embodiment of the invention. As is apparent from Figs. 3 and 4, the depth map corresponding to the pseudo point cloud data has clearer outlines and boundaries than the point cloud data of the real radar. The method uses an interpolation algorithm to at least double the point cloud frame rate of the lidar, greatly increasing the quantity of lidar point cloud points and the amount of information they carry.
Fig. 5 is a schematic diagram of comparison of (a) a color image of a real intermediate frame, (b) a sparse depth image, (c) a dense depth image, and (d) a predicted intermediate frame accurate depth image, provided by an embodiment of the present invention. The color image, the sparse depth image, the dense depth image and the precise depth image of the intermediate frame predicted by the method are respectively shown from left to right.
Fig. 6 is a schematic diagram of comparison of (a) a color image of a real intermediate frame, (b) an image of point cloud data of a real lidar, and (c) an image of predicted pseudo point cloud data, provided by an embodiment of the present invention. From left to right, are a color image of a real intermediate frame, an image of point cloud data of a real lidar, and an image of pseudo point cloud data we predict, respectively. In the figure, the box areas mark some vehicles and people, and by comparison, it is obvious that the automobiles, people and the like in the image processed by the method have clearer outlines and boundaries.
In the field of autopilot, the processor performs obstacle judgment and other decisions based on visual information composed of continuous pseudo-point cloud data. By adopting the pseudo point cloud data generation method, the pseudo point cloud data of the laser radar with the sampling frequency consistent with that of the camera can be generated, so that the real-time performance of obstacle detection in automatic driving is supported.
According to the false point cloud data generation method for a laser radar provided by the invention, the optical flow of the color images corresponding to two consecutive frames of sparse depth maps is used in place of the optical flow of the sparse depth maps themselves, and the depth map of the intermediate frame between the two consecutive frames is obtained by interpolation. Interpolation can further be performed on the basis of an obtained intermediate frame and its preceding or following frame to obtain the depth map of an intermediate frame between intermediate frames. In this way, data equivalent to scanning at a higher frequency are obtained without changing the actual lidar scanning frequency, and the temporal and spatial granularity and precision of the point cloud data are improved.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of function in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing description of the embodiments illustrates the general principles of the invention. It is merely a description of specific embodiments and is not intended to limit the scope of protection of the invention; any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (5)

1. The false point cloud data generation method of the laser radar is characterized by comprising the following steps of:
the processor selects a first color image, a second color image and a third color image from the database, wherein the first, second and third color images are color images of three consecutive frames; based on the time data, the two frames of sparse depth images preceding and following the time data corresponding to the second color image are acquired, the two frames being respectively a first sparse depth image and a second sparse depth image;
establishing a corresponding relation between the first color image and the first sparse depth image and a corresponding relation between the third color image and the second sparse depth image;
scanning pixel points in the first color image and the third color image respectively, and determining the optical flow from the first color image to the second color image according to the offset displacement of the first pixel point in the selected first color image in the third color image; the first color image and the second color image comprise a plurality of pixel points; the pixel points have position information;
the processor performs intermediate frame prediction processing according to the optical flow, the pixel value of each pixel point in the first sparse depth image and the second sparse depth image to obtain an intermediate frame sparse depth image of an intermediate frame of the first sparse depth image and the second sparse depth image;
the processor takes the first sparse depth image, the second sparse depth image and the intermediate frame sparse depth image as inputs of a convolutional neural network, and derives and outputs an intermediate frame dense depth image of an intermediate frame through the convolutional neural network; the intermediate frame dense depth image has temporal data;
the processor takes the second color image and the intermediate frame dense depth image as the input of a U-net network, and deduces and outputs an intermediate frame fine depth image of an intermediate frame through the U-net network; the intermediate frame fine depth image comprises a plurality of pixel points;
and the processor performs coordinate conversion processing according to the pixel points in the intermediate frame fine depth image to obtain pseudo point cloud data of the intermediate frame fine depth image.
2. The method for generating pseudo point cloud data of a lidar according to claim 1, wherein the processor performs coordinate conversion processing according to pixel points in the intermediate frame fine depth image, and the obtaining of the point cloud data of the intermediate frame fine depth image specifically includes:
the processor obtains a coordinate corresponding relation according to a preset camera parameter and a world coordinate system, and obtains point cloud data of the intermediate frame fine depth image according to the coordinate corresponding relation and position information of pixel points in the intermediate frame fine depth image.
3. The method for generating pseudo point cloud data for a lidar according to claim 1, wherein before the processor selects the first color image, the second color image, and the third color image from the database, the method for generating pseudo point cloud data further comprises:
the laser radar scans the region to be detected according to a preset scanning frequency to obtain three-dimensional radar echo original data, and the radar echo original data are sent to the processor; the radar echo raw data has a time sequence; the color camera performs image acquisition on the region to be detected according to a preset sampling frequency to obtain image data of multi-frame color images, and the image data are stored in the database; the sampling frequency is greater than the scanning frequency;
the processor sorts the three-dimensional radar echo original data according to the time sequence to obtain point cloud data, and stores the point cloud data into the database;
the processor performs mapping processing on the point cloud data to obtain two-dimensional image data;
the processor forms a frame of sparse depth image according to the two-dimensional image data obtained by single scanning and stores the sparse depth image into the database; the sparse depth image has corresponding temporal data.
4. The method for generating pseudo point cloud data of a lidar according to claim 1, wherein the scanning of the pixels in the first color image and the third color image respectively, and determining the optical flow from the first color image to the second color image based on the offset displacement of the first pixel in the selected first color image in the third color image, is specifically:
the processor determines surrounding pixel points around the first pixel point according to the position information of the first pixel point and the preset maximum displacement, and forms a first pixel block according to the first pixel point and the surrounding pixel points; the first pixel block has size data;
the processor searches a second pixel point which is the same as the position information of a first pixel point in the first color image in the third color image, and performs pixel block matching processing according to the second pixel point and the size data to obtain a characteristic pixel block matched with the first pixel block;
the processor scans pixel points in the characteristic pixel block one by one, and obtains characteristic pixel points after the first pixel points are deviated according to the pixel values of the first pixel points;
the processor obtains an optical flow from the first color image to a second color image according to the position information of the first pixel point and the characteristic pixel point.
5. The method for generating pseudo point cloud data of a lidar according to claim 1, wherein the processor performs an intermediate frame prediction process according to the optical flow, the pixel value of each pixel point in the first sparse depth image and the second sparse depth image, and the intermediate frame sparse depth image of the intermediate frame of the first sparse depth image and the second sparse depth image is specifically:
the processor performs average calculation processing according to the pixel values of two pixel points with the same position information in the first sparse depth image and the second sparse depth image to obtain an average pixel value of the pixel points of the intermediate frame;
and the processor corrects the average pixel value according to the deviation value of each pixel point provided by the optical flow, and generates an intermediate frame sparse depth image of the intermediate frame according to a correction result.
CN202010163524.8A 2020-03-10 2020-03-10 False point cloud data generation method of laser radar Active CN111292369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010163524.8A CN111292369B (en) 2020-03-10 2020-03-10 False point cloud data generation method of laser radar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010163524.8A CN111292369B (en) 2020-03-10 2020-03-10 False point cloud data generation method of laser radar

Publications (2)

Publication Number Publication Date
CN111292369A CN111292369A (en) 2020-06-16
CN111292369B true CN111292369B (en) 2023-04-28

Family

ID=71023820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010163524.8A Active CN111292369B (en) 2020-03-10 2020-03-10 False point cloud data generation method of laser radar

Country Status (1)

Country Link
CN (1) CN111292369B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022000266A1 (en) * 2020-06-30 2022-01-06 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for creating depth map for stereo moving image and electronic device
CN112305554B (en) * 2020-11-23 2021-05-28 中国科学院自动化研究所 Laser odometer method, system and device based on directed geometric points and sparse frames
CN113362444B (en) * 2021-05-21 2023-06-16 北京百度网讯科技有限公司 Point cloud data generation method and device, electronic equipment and storage medium
CN113538547A (en) * 2021-06-03 2021-10-22 苏州小蜂视觉科技有限公司 Depth processing method of 3D line laser sensor and dispensing equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103716615A (en) * 2014-01-09 2014-04-09 西安电子科技大学 2D video three-dimensional method based on sample learning and depth image transmission
CN108664841A (en) * 2017-03-27 2018-10-16 郑州宇通客车股份有限公司 A kind of sound state object recognition methods and device based on laser point cloud
CN109300190A (en) * 2018-09-06 2019-02-01 百度在线网络技术(北京)有限公司 Processing method, device, equipment and the storage medium of three-dimensional data
CN109785247A (en) * 2018-12-18 2019-05-21 歌尔股份有限公司 Modification method, device and the storage medium of laser radar exception point cloud data
CN110191299A (en) * 2019-04-15 2019-08-30 浙江大学 A kind of multiplex frame interpolation method based on convolutional neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150192668A1 (en) * 2014-01-06 2015-07-09 Honeywell International Inc. Mathematically combining remote sensing data with different resolution to create 3d maps

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103716615A (en) * 2014-01-09 2014-04-09 西安电子科技大学 2D video three-dimensional method based on sample learning and depth image transmission
CN108664841A (en) * 2017-03-27 2018-10-16 郑州宇通客车股份有限公司 A kind of sound state object recognition methods and device based on laser point cloud
CN109300190A (en) * 2018-09-06 2019-02-01 百度在线网络技术(北京)有限公司 Processing method, device, equipment and the storage medium of three-dimensional data
CN109785247A (en) * 2018-12-18 2019-05-21 歌尔股份有限公司 Modification method, device and the storage medium of laser radar exception point cloud data
CN110191299A (en) * 2019-04-15 2019-08-30 浙江大学 A kind of multiplex frame interpolation method based on convolutional neural networks

Also Published As

Publication number Publication date
CN111292369A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
CN111292369B (en) False point cloud data generation method of laser radar
CN110675418B (en) Target track optimization method based on DS evidence theory
CN110221603B (en) Remote obstacle detection method based on laser radar multi-frame point cloud fusion
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN110853075B (en) Visual tracking positioning method based on dense point cloud and synthetic view
EP2948927B1 (en) A method of detecting structural parts of a scene
CN110689562A (en) Trajectory loop detection optimization method based on generation of countermeasure network
CN112785702A (en) SLAM method based on tight coupling of 2D laser radar and binocular camera
CN113436260B (en) Mobile robot pose estimation method and system based on multi-sensor tight coupling
US5937079A (en) Method for stereo image object detection
CN110942449A (en) Vehicle detection method based on laser and vision fusion
CN112149550B (en) Automatic driving vehicle 3D target detection method based on multi-sensor fusion
CN109947097B (en) Robot positioning method based on vision and laser fusion and navigation application
CN115032651A (en) Target detection method based on fusion of laser radar and machine vision
CN113160327A (en) Method and system for realizing point cloud completion
CN108230242B (en) Method for converting panoramic laser point cloud into video stream
CN110827321B (en) Multi-camera collaborative active target tracking method based on three-dimensional information
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN112270694B (en) Method for detecting urban environment dynamic target based on laser radar scanning pattern
CN113593035A (en) Motion control decision generation method and device, electronic equipment and storage medium
CN113643345A (en) Multi-view road intelligent identification method based on double-light fusion
CN111998862A (en) Dense binocular SLAM method based on BNN
CN111337011A (en) Indoor positioning method based on laser and two-dimensional code fusion
Buck et al. Capturing uncertainty in monocular depth estimation: Towards fuzzy voxel maps
US20240153120A1 (en) Method to determine the depth from images by self-adaptive learning of a neural network and system thereof

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant