CN113269040A - Driving environment sensing method combining image recognition and laser radar point cloud segmentation - Google Patents

Driving environment sensing method combining image recognition and laser radar point cloud segmentation

Info

Publication number
CN113269040A
CN113269040A (application CN202110445391.8A)
Authority
CN
China
Prior art keywords
point cloud
image
laser radar
category
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110445391.8A
Other languages
Chinese (zh)
Inventor
俞扬
詹德川
周志华
余德丛
袁雷
余峰
黄军富
陈雄辉
张云天
庞竟成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202110445391.8A
Publication of CN113269040A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

The invention discloses a driving environment perception method combining image recognition and laser radar point cloud segmentation, which comprises the following steps: (1) collecting ground laser radar point cloud data and image data on a real road; (2) using the collected image data as a reference, calibrating the laser radar point cloud data against the image data and labelling the collected laser radar point cloud data; (3) initializing a point cloud segmentation network, training it on the labelled laser radar point cloud data and updating the network parameters; (4) transplanting the trained network to the industrial personal computer of the unmanned vehicle to obtain the category of the object to which each point belongs; (5) recognizing the image data; (6) fusing the segmented laser radar point cloud data with the recognized image data to obtain the accurate positions of the road and surrounding objects. The invention perceives the environment in real time and overcomes the poor recognition performance of image recognition in bad weather and poor light.

Description

Driving environment sensing method combining image recognition and laser radar point cloud segmentation
Technical Field
The invention relates to a driving environment perception method combining image recognition and laser radar point cloud segmentation, and belongs to the technical field of unmanned driving environment perception.
Background
As artificial intelligence has moved into applications such as speech recognition, recommendation systems and intelligent robots, the demand for applying it to unmanned driving has become increasingly urgent. A prerequisite of unmanned driving technology is the ability to perceive the surrounding environment accurately, identifying drivable areas and objects "like a person" does. Conventional methods rely mainly on image recognition to identify drivable areas and objects. Although image recognition is fast and generally effective for this task, it performs poorly on distant objects and under bad weather or poor lighting.
In recent years, both academia and industry have begun to use lidar for environment perception. A lidar emits laser beams into the surrounding environment; the beams return when they hit an obstacle, and the distance and reflection intensity of the target are computed from the time difference between emission and return. Lidar-based perception is therefore unaffected by weather and light and produces stable output, which has made lidar essential hardware in the environment perception modules of many unmanned-driving companies.
Existing laser radar point cloud segmentation algorithms have several disadvantages: 1. a lack of labelled data; 2. large network structures and excessive running time; 3. loss of the spatial structure information between points.
Disclosure of Invention
The purpose of the invention is as follows: to overcome the poor anti-interference capability of image recognition and its poor recognition of distant objects in unmanned-driving environment perception, the invention provides a driving environment perception method combining image recognition and laser radar point cloud segmentation. The algorithm has high recognition accuracy and a good, stable recognition effect.
In addition, an image recognition algorithm is combined with the RangeNet++ laser radar point cloud segmentation algorithm: the point cloud is segmented with RangeNet++, image recognition is performed on the camera images, and the two recognition results are fused, so that laser radar point cloud segmentation compensates for the shortcomings of image recognition.
The technical scheme is as follows: a driving environment perception method combining image recognition and laser radar point cloud segmentation collects laser radar point cloud data while the unmanned vehicle is driven manually, labels the point cloud by category, and trains a classification model with a deep learning network; the network judges the category of each point from its (x, y, z, remission) information. After training, the network is deployed on an unmanned vehicle equipped with a laser radar, a camera and an industrial personal computer. During unmanned driving the network is loaded, surrounding point cloud and image data are collected in real time, the point cloud categories are predicted by the network, the image data is recognized with an image recognition algorithm, and the results of the two modules are fused to perceive the environment accurately. The invention comprises the following steps:
(1) manually driving an unmanned vehicle to run on a real road, and collecting ground laser radar point cloud data and image data;
(2) using the collected image data as a reference, calibrating the laser radar point cloud data and the image data, and marking the collected laser radar point cloud data;
(3) initializing a point cloud segmentation network, training it on the labelled laser radar point cloud data with the goal of reducing the error between the true categories and the predicted categories, and updating the network parameters until the network converges or the maximum number of training iterations is reached;
(4) transplanting the trained network to the industrial personal computer of the unmanned vehicle, and segmenting the laser radar point cloud collected in real time with the point cloud segmentation network while the unmanned vehicle drives, to obtain the category of the object to which each point belongs;
(5) identifying the image data using an image recognition algorithm;
(6) fusing the segmented laser radar point cloud data with the recognized image data to obtain the positions of roads and objects, so as to perceive the environment accurately.
To segment the laser radar point cloud collected in real time, the three-dimensional point cloud is mapped to a two-dimensional plane image, and the two-dimensional plane image is segmented with a deep learning technique to obtain the category of each pixel in the two-dimensional plane image; the category of each pixel is then mapped back to the three-dimensional point cloud, and a clustering operation is performed on the mapped three-dimensional point cloud categories to eliminate the burrs and shadows generated during mapping, yielding the segmented laser radar point cloud.
In the data collection stage, the unmanned vehicle is equipped with a laser radar and a camera; the vehicle is driven manually and ground laser radar point cloud data and picture data are collected. Each frame of laser radar point cloud data is stored in a 4 × H × W format, where the first dimension of size 4 holds the (x, y, z, remission) information, H represents the vertical resolution of the laser radar and W represents its horizontal resolution.
The labelled point cloud data has shape 5 × H × W, where the first dimension holds the (x, y, z, remission, label) information.
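To make the storage layout concrete, the following NumPy sketch (an illustration, not code from the patent; the resolutions H = 64 and W = 2048 are assumed values for a 64-beam sensor) packs one sweep into the 4 × H × W format and appends the label channel to form the 5 × H × W labelled tensor.

```python
import numpy as np

H, W = 64, 2048                                   # assumed lidar vertical/horizontal resolution
frame = np.zeros((4, H, W), dtype=np.float32)     # channels: x, y, z, remission
labels = np.zeros((1, H, W), dtype=np.float32)    # per-point class id added during annotation

labelled_frame = np.concatenate([frame, labels], axis=0)   # (x, y, z, remission, label)
assert labelled_frame.shape == (5, H, W)
```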
The labelled three-dimensional point cloud data is mapped to a two-dimensional plane image. For a point with coordinates (x, y, z), its pixel coordinates (u, v) in the plane image are computed as

u = (1/2) · (1 − arctan2(y, x)/π) · w
v = (1 − (arcsin(z/r) + f_down)/f) · h

where u and v respectively represent the abscissa and ordinate of the point after mapping to the two-dimensional plane image, w and h respectively represent the width and height of the mapped image, r represents the range from the point to the origin, f_up and f_down represent the absolute values of the maximum and minimum laser radar ray pitch angles, and f represents their sum, so as to obtain a tensor of shape (w, h, 5).
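A NumPy sketch of this spherical projection is given below. It follows the RangeNet++-style mapping the patent builds on; the image size and the vertical field of view (3° up, 25° down) are assumptions for a 64-beam lidar, not values taken from the patent.

```python
import numpy as np

def spherical_projection(points, h=64, w=2048, fov_up_deg=3.0, fov_down_deg=25.0):
    """Project one lidar sweep (N, 4) of (x, y, z, remission) onto a 2D range image."""
    x, y, z, remission = points[:, 0], points[:, 1], points[:, 2], points[:, 3]
    r = np.linalg.norm(points[:, :3], axis=1) + 1e-8     # range to the origin
    f_up, f_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    f = f_up + f_down                                    # total vertical field of view

    # pixel coordinates from the projection formula above
    u = 0.5 * (1.0 - np.arctan2(y, x) / np.pi) * w
    v = (1.0 - (np.arcsin(z / r) + f_down) / f) * h
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int64)

    image = np.zeros((5, h, w), dtype=np.float32)        # channels: x, y, z, r, remission
    image[:, v, u] = np.stack([x, y, z, r, remission])   # later points overwrite earlier ones
    return image
```

Points that land on the same pixel simply overwrite one another in this sketch; in practice they are usually sorted by decreasing range first so that the closest point wins.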
The deep learning technique used for laser radar point cloud segmentation segments the point cloud data with a segmentation network whose main body consists of down-sampling coding blocks, up-sampling coding blocks and a category calculation. The down-sampling coding blocks down-sample the plane image, reducing the time required for processing; the up-sampling coding blocks up-sample the plane image and add the previously down-sampled feature maps of matching dimensions to restore detail, gradually recovering the input dimensions; the category calculation processes the recovered plane image to obtain the category of each pixel, the error between the output categories and the true categories is computed, and the network is updated until convergence or the maximum number of training iterations is reached.
The down-sampling coding block is realized by the following steps:
11) The two-dimensional plane image and the ranges of the corresponding points are concatenated into a tensor of 5 × h × w, where h and w respectively represent the height and width of the plane image; the 1st dimension holds the x, y, z, r and remission information of the point corresponding to each pixel, x, y and z respectively represent the three-dimensional coordinates of the point, r represents the distance from the point to the host vehicle, and remission represents the reflection intensity of the point.
12) The input image is down-sampled with a sampling step size of 2 × 1; h (the height) of the down-sampled image is unchanged and w (the width) becomes one half of the original value.
13) The input image is convolved: the down-sampled image is processed with a residual block consisting of two CONV + BN + ReLU layers.
14) Repeating the steps 12) to 13) for a preset number of times.
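A minimal PyTorch sketch of one such down-sampling stage is shown below. The use of a strided 3 × 3 convolution for the 2 × 1 down-sampling and the channel counts are assumptions; only the overall structure (halve the width, keep the height, then a two-layer CONV + BN + ReLU residual block) follows the steps above.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two CONV + BN + ReLU layers with a skip connection (step 13 above)."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.body(x)

class DownBlock(nn.Module):
    """One down-sampling stage: halve the width only, keep the height, then refine."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # stride (1, 2) in PyTorch's (height, width) order keeps h and halves w,
        # matching the 2 x 1 step with h unchanged described in step 12 above
        self.down = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=(1, 2), padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.res = ResidualBlock(out_ch)

    def forward(self, x):
        return self.res(self.down(x))

# Example: a (1, 5, 64, 2048) range image becomes (1, 32, 64, 1024)
features = DownBlock(5, 32)(torch.randn(1, 5, 64, 2048))
```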
The up-sampling coding block is realized by the following steps:
21) The input image is up-sampled with a sampling step size of 2 × 1; the height of the up-sampled image is unchanged and its width is twice that of the input. In addition, the data of the same dimensions saved during the down-sampling process (the corresponding level) is added back to recover the detail lost by down-sampling.
22) The input image is convolved: the up-sampled image is processed with a residual block consisting of two CONV + BN + ReLU layers.
23) Repeat 21) -22) until the data is restored to the dimensions of the original image.
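The corresponding up-sampling stage could look like the sketch below, which reuses the ResidualBlock from the previous sketch; nearest-neighbour interpolation for the 2 × 1 up-sampling and the addition of the skip tensor are assumptions consistent with steps 21) to 23), not code from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpBlock(nn.Module):
    """One up-sampling stage: double the width, add the skip tensor saved during
    down-sampling, then refine with the two-layer residual block."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.res = ResidualBlock(out_ch)   # ResidualBlock from the previous sketch

    def forward(self, x, skip):
        x = F.interpolate(x, scale_factor=(1, 2), mode="nearest")  # keep height, double width
        x = self.reduce(x)
        x = x + skip                       # restore detail lost during down-sampling
        return self.res(x)

# Example: (1, 32, 64, 1024) features plus a (1, 16, 64, 2048) skip tensor -> (1, 16, 64, 2048)
out = UpBlock(32, 16)(torch.randn(1, 32, 64, 1024), torch.randn(1, 16, 64, 2048))
```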
The category calculation comprises the following steps:
31) A 1 × 1 convolution is applied to the image restored to the original dimensions to generate an n × h × w image, where n represents the total number of categories.
32) The probability of each category is computed for the output image with the softmax function,

ŷ_c = exp(logit_c) / Σ_{c'} exp(logit_{c'}),

and the category with the highest probability is selected as the category of the pixel, where c represents a category, logit_c represents the value output by the convolution for category c at the corresponding pixel, and ŷ_c represents the probability of that category after the softmax calculation.
33) The network parameters are updated according to the loss function

L = − Σ_c w_c · y_c · log(ŷ_c),  with  w_c = 1 / log(f_c + ε),

where f_c indicates the frequency of occurrence of category c, ε is a very small number that keeps (f_c + ε) away from zero so that the logarithm is defined, y_c indicates whether the true category of the corresponding point is c (1 if yes, 0 if not), and ŷ_c represents the probability that the output category of the corresponding pixel is c. The loss function aims to reduce the error between the output category and the true category while addressing class imbalance.
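The weighted cross-entropy above maps directly onto PyTorch's CrossEntropyLoss, as in the sketch below; the class frequencies, the value of ε and the tensor shapes are made-up numbers for illustration, and f_c is taken here as the number of labelled points of class c so that log(f_c + ε) stays positive.

```python
import torch
import torch.nn as nn

def class_weights(counts, eps=1e-3):
    # w_c = 1 / log(f_c + eps), as in the loss formula above
    return 1.0 / torch.log(counts + eps)

counts = torch.tensor([7.0e6, 2.5e6, 5.0e5])                   # hypothetical per-class point counts f_c
criterion = nn.CrossEntropyLoss(weight=class_weights(counts))  # softmax + weighted negative log-likelihood

logits = torch.randn(2, 3, 64, 2048, requires_grad=True)       # network output: (batch, n classes, h, w)
labels = torch.randint(0, 3, (2, 64, 2048))                    # true class of every pixel
loss = criterion(logits, labels)                               # -sum_c w_c * y_c * log(softmax_c), averaged
predictions = logits.argmax(dim=1)                             # class with the highest probability per pixel
loss.backward()                                                # gradients used to update the network
```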
The neural network classification results are clustered with a KNN method, which removes the burrs and shadows generated when the two-dimensional plane is mapped back to the three-dimensional space: for each pixel of the two-dimensional plane, its category is determined from the categories of the surrounding pixels.
The two-dimensional plane is mapped back to the three-dimensional space to obtain the category of each point in the original three-dimensional space. The error between the true category and the output category of each point in the three-dimensional space is used as the loss function, and the network is optimized with minimizing this loss as the training objective until the network converges or the maximum number of training iterations is reached.
The laser radar and the camera are calibrated. The camera is calibrated with the Autoware Calibration Toolkit to obtain the camera parameters (f_u, f_v, u_0, v_0), where (f_u, f_v) are the scale factors along the X and Y axes of the image plane and (u_0, v_0) is the centre point of the plane. The camera and the laser radar are then jointly calibrated with the Calibration Toolkit to obtain a rotation matrix R and a translation vector t. The radar point cloud coordinates are converted to image coordinates by

z_c · [u, v, 1]^T = K · (R · [x, y, z]^T + t),  with  K = [[f_u, 0, u_0], [0, f_v, v_0], [0, 0, 1]],

where (x, y, z) are the three-dimensional coordinates of a point in the point cloud, (u, v) are its coordinates after conversion into the image, and z_c is the depth of the point in the camera coordinate system.
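A sketch of this conversion under the usual pinhole camera model is given below; after a point is transformed by R and t, the formula above reduces to u = f_u · x_c / z_c + u_0 and v = f_v · y_c / z_c + v_0. Variable names are illustrative, not taken from the patent.

```python
import numpy as np

def lidar_to_image(points_xyz, R, t, fu, fv, u0, v0):
    """Project lidar points (N, 3) into pixel coordinates using the calibration above."""
    cam = points_xyz @ R.T + t          # rotate and translate into the camera frame
    zc = cam[:, 2]
    in_front = zc > 1e-6                # keep only points in front of the image plane
    u = fu * cam[:, 0] / zc + u0
    v = fv * cam[:, 1] / zc + v0
    return np.stack([u, v], axis=1), in_front
```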
The trained radar point cloud segmentation model is used to segment the laser radar point cloud of the unmanned vehicle, image recognition is used to recognize the image data, and the two results are fused by converting the laser radar point cloud into the image according to the formula above, so that accurate environment perception for unmanned driving is achieved.
Compared with the prior art, the invention has the following advantages:
the method is not interfered by factors such as weather, light and the like, and the identification effect is stable. The method can still show a good recognition effect under the conditions of poor weather, light and the like.
The objects identified by the method carry three-dimensional spatial information: they have depth information as well as planar information, and the recognition quality for near and distant objects differs little.
The method can segment one frame within the single-frame interval of the laser radar, achieving real-time performance.
Drawings
Fig. 1 is an overall frame diagram of the present invention.
FIG. 2 is a schematic view of the lidar-camera fusion of the present invention.
FIG. 3 is a flow chart of network training according to the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.
The driving environment perception method combining image recognition and laser radar point cloud segmentation provides a laser radar point cloud segmentation algorithm: the laser radar point cloud is segmented with this algorithm, and the image data is recognized with a mature image recognition algorithm. The segmented point cloud and the recognized image are then fused to obtain accurate environment perception.
In the laser radar point cloud segmentation algorithm, a method is provided for mapping the three-dimensional point cloud to a two-dimensional plane image; the two-dimensional plane image is segmented with a deep learning technique to obtain the category of each pixel in the two-dimensional plane image; the category of each pixel is then mapped back to the three-dimensional point cloud, and a clustering operation is performed on the mapped three-dimensional point cloud categories to eliminate the burrs and shadows generated during mapping, yielding the segmented laser radar point cloud.
The method for mapping the three-dimensional point cloud to the two-dimensional plane image is as follows: for a point with coordinates (x, y, z), its coordinates in the two-dimensional plane are computed as

u = (1/2) · (1 − arctan2(y, x)/π) · w
v = (1 − (arcsin(z/r) + f_down)/f) · h

where u and v respectively represent the abscissa and ordinate of the point after mapping to the two-dimensional plane image, w and h respectively represent the width and height of the mapped image, r represents the distance from the point to the host vehicle, f_up and f_down respectively represent the absolute values of the maximum and minimum laser radar ray pitch angles, and f represents the sum of the two.
The deep learning technique used for laser radar point cloud segmentation segments the point cloud data with a segmentation network whose main body consists of down-sampling coding blocks, up-sampling coding blocks and a category calculation. The down-sampling coding blocks down-sample the plane image, reducing the time required for processing; the up-sampling coding blocks up-sample the plane image and add the previously down-sampled feature maps of matching dimensions to restore detail, gradually recovering the input dimensions; the category calculation processes the recovered plane image to obtain the category of each pixel, the error between the output categories and the true categories is computed, and the network is updated until convergence or the maximum number of training iterations is reached.
The implementation of the down-sampling coding block comprises the following steps:
(1) The obtained two-dimensional plane image and the ranges of the corresponding points are concatenated into a tensor of 5 × h × w, where h and w respectively represent the height and width of the plane image; the 1st dimension holds the x, y, z, r and remission information of the point corresponding to each pixel, x, y and z respectively represent the three-dimensional coordinates of the point, r represents the distance from the point to the host vehicle, and remission represents the reflection intensity of the point.
(2) The input image is down-sampled with a sampling step size of 1 × 2; the height of the down-sampled image is unchanged and its width becomes one half of the input.
(3) The input image is convolved: the down-sampled image is processed with a residual block using a two-layer CONV + BN + ReLU structure.
(4) Steps (2) and (3) are repeated a number of times chosen from training experience.
The implementation of the upsampling coding block comprises the following steps:
(1) The input image is up-sampled with a sampling step size of 1 × 2; the height of the up-sampled image is unchanged and its width is 2 times the input width. In addition, the feature map of the same dimensions from the down-sampling process is added back to restore the detail lost by down-sampling.
(2) The input image is convolved: the up-sampled image is processed with a residual block using a two-layer CONV + BN + ReLU structure.
(3) Steps (1) and (2) are repeated until the dimensions of the original image are restored.
The category calculation comprises the following steps:
(1) A 1 × 1 convolution is applied to the image restored to the original dimensions to generate an n × h × w image, where n represents the total number of categories.
(2) The probability of each category is computed for the output image with the softmax function,

ŷ_c = exp(logit_c) / Σ_{c'} exp(logit_{c'}),

where c represents a category, logit_c represents the value output by the convolution for category c at the corresponding pixel, and ŷ_c represents the probability of that category after the softmax calculation.
(3) The network parameters are updated according to the loss function

L = − Σ_c w_c · y_c · log(ŷ_c),  with  w_c = 1 / log(f_c + ε),

where f_c indicates the frequency of occurrence of category c, ε is a very small number that keeps (f_c + ε) away from zero so that the logarithm is defined, y_c represents the true category of the corresponding point, and ŷ_c represents the output probability of the corresponding pixel for category c. The loss function aims to reduce the error between the output category and the true category while addressing class imbalance.
For the clustering of the three-dimensional point cloud categories, each pixel in the segmented plane image is taken as the centre pixel of an S × S sliding window, the S² surrounding pixels are stored, and the absolute value of the difference between the range of each pixel and the range of the centre pixel is computed; the probability of each pixel being selected is computed from two standard normal distributions, the absolute range difference is multiplied by this probability to obtain a distance, the pixels are sorted by this distance, the categories of the first K points are counted, and the category with the largest number of occurrences is taken as the final category.
In the image recognition algorithm, an open-source Fast-RCNN framework and model are used to process the images received from the camera in real time and obtain detection results for objects, drivable areas and driving paths.
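As an example of this step, the sketch below runs an off-the-shelf detector from torchvision as a stand-in for the open-source Fast-RCNN framework mentioned above; the exact model, weights and confidence threshold are assumptions, not the patent's configuration.

```python
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)              # one RGB camera frame, values in [0, 1]
with torch.no_grad():
    detections = model([image])[0]           # dict with 'boxes', 'labels', 'scores'
keep = detections["scores"] > 0.5            # illustrative confidence threshold
boxes = detections["boxes"][keep]            # (M, 4) boxes as (x1, y1, x2, y2)
labels = detections["labels"][keep]          # class index of each kept box
```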
The image recognition results and the laser radar point cloud segmentation results are fused: the laser radar and the camera are calibrated to obtain the transformation parameters from the laser radar to the camera, and the point cloud segmentation results are converted into the camera coordinate system, realizing the fusion of the two sensors.
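A minimal fusion rule under these conventions might look as follows: each segmented lidar point is projected into the image with the lidar_to_image function from the earlier sketch, and points falling inside a detected 2D box adopt (or confirm) that box's class. This is an illustrative rule, not the patent's exact fusion procedure.

```python
import numpy as np

def fuse_labels(uv, in_front, point_labels, boxes, box_labels):
    """Overwrite point classes with the class of the 2D detection box they project into."""
    fused = point_labels.copy()
    for (x1, y1, x2, y2), cls in zip(boxes, box_labels):
        inside = in_front & (uv[:, 0] >= x1) & (uv[:, 0] <= x2) \
                          & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
        fused[inside] = cls
    return fused
```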
Combining image recognition and lidar point cloud segmentation for unmanned environment perception also requires the following hardware: (1) a laser radar, for collecting point cloud information of the surrounding environment; (2) a monocular camera, for collecting picture data of the surrounding environment; (3) an industrial personal computer, for radar point cloud segmentation, image recognition and fusion.
FIG. 1 is the overall framework diagram of the present invention: point cloud data is collected with the laser radar and preprocessed, the preprocessed data with categories is used to train the point cloud segmentation network, the real-time laser radar point cloud is segmented with the trained network, the image is recognized with an image recognition algorithm, and the two recognition results are fused.
Fig. 2 is a schematic diagram of the integration of a laser radar and a camera in the present invention.
FIG. 3 is the network training flowchart of the present invention, which is divided into three steps: the first is the down-sampling encoding process, the second is the up-sampling encoding process, and the third is the category calculation. In the down-sampling encoding process, the preprocessed image is first convolved (with a two-layer residual block), then down-sampled, then convolved again, and down-sampling is repeated an appropriate number of times. In the up-sampling encoding process, the encoding result is up-sampled, the image data of matching dimensions from the corresponding down-sampling step is added, a convolution is applied, and so on until the original dimensions are restored; a final 1 × 1 convolution produces an output of dimensions C × h × w, where C is the total number of labels, the probability of each label is computed with softmax, and the category with the highest probability is selected for each pixel.
The following is the pseudocode of the KNN algorithm of the present invention: an S × S sliding window collects the S² pixels around each pixel of the segmentation plane; for each of the S² pixels, its selection probability is multiplied by the absolute difference between its range and that of the centre pixel to obtain a value; the values are sorted, and the label with the highest number of occurrences among the first K elements is selected as the label of the pixel.
(KNN pseudocode given as figures in the original.)
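A plain NumPy sketch of that post-processing is given below for reference; the window size, the value of K, and the interpretation of the "two standard normal distributions" as a separable Gaussian weight over the window offsets are assumptions, not details taken from the patent figures.

```python
import numpy as np

def knn_postprocess(range_image, label_image, S=5, K=5):
    """Re-label each pixel by a majority vote over its K range-nearest neighbours.

    range_image: (h, w) float ranges; label_image: (h, w) non-negative integer class ids.
    """
    h, w = range_image.shape
    half = S // 2
    offs = np.arange(-half, half + 1)
    g = np.exp(-0.5 * offs ** 2) / np.sqrt(2.0 * np.pi)     # standard normal pdf over offsets
    prob = np.outer(g, g)
    prob /= prob.sum()                                      # selection probability of each window pixel

    out = label_image.copy()
    pad_r = np.pad(range_image, half, mode="edge")
    pad_l = np.pad(label_image, half, mode="edge")
    for i in range(h):
        for j in range(w):
            win_r = pad_r[i:i + S, j:j + S]
            win_l = pad_l[i:i + S, j:j + S]
            dist = np.abs(win_r - range_image[i, j]) * prob  # weighted range difference, as described
            nearest = np.argsort(dist, axis=None)[:K]        # indices of the K smallest values
            votes = win_l.reshape(-1)[nearest].astype(np.int64)
            out[i, j] = np.bincount(votes).argmax()          # most frequent class wins
    return out
```

A practical implementation vectorizes this window loop (or runs it on the GPU) so that the clean-up fits within the lidar frame interval mentioned above.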

Claims (10)

1. A driving environment perception method combining image recognition and laser radar point cloud segmentation, characterized by comprising the following steps:
(1) manually driving an unmanned vehicle to run on a real road, and collecting ground laser radar point cloud data and image data;
(2) using the collected image data as a reference, calibrating the laser radar point cloud data and the image data, and marking the collected laser radar point cloud data;
(3) initializing a point cloud segmentation network, and training marked laser radar point cloud data until the network converges or runs to the maximum training times;
(4) transplanting the trained network to the industrial personal computer of the unmanned vehicle to obtain the category of the object to which each point of the point cloud belongs;
(5) identifying the image data by using an image identification algorithm;
(6) fusing the segmented laser radar point cloud data with the image data after image recognition to obtain the positions of the road and objects.
2. The driving environment sensing method combining image recognition and laser radar point cloud segmentation as claimed in claim 1, wherein the laser radar point cloud collected in real time is segmented by mapping the three-dimensional point cloud to a two-dimensional plane image and segmenting the two-dimensional plane image with a deep learning technique to obtain the category of each pixel in the two-dimensional plane image; the category of each pixel is then mapped back to the three-dimensional point cloud, and a clustering operation is performed on the mapped three-dimensional point cloud categories to eliminate the burrs and shadows generated during mapping, yielding the segmented laser radar point cloud.
3. The method as claimed in claim 1, wherein the step of collecting data includes installing a laser radar and a camera on the unmanned vehicle, manually driving the unmanned vehicle, and collecting ground laser radar point cloud data and picture data, each frame of laser radar point cloud data being stored in a 4 × H × W format, wherein the first dimension of size 4 holds the (x, y, z, remission) information, H represents the vertical resolution of the laser radar, and W represents the horizontal resolution of the laser radar.
4. The method of claim 1, wherein the labeled three-dimensional point cloud data is mapped to a two-dimensional plane image; for a point with coordinates (x, y, z), its pixel coordinates (u, v) in the plane image are computed as

u = (1/2) · (1 − arctan2(y, x)/π) · w
v = (1 − (arcsin(z/r) + f_down)/f) · h

where u and v respectively represent the abscissa and ordinate of the point after mapping to the two-dimensional plane image, w and h respectively represent the width and height of the mapped image, r represents the range from the point to the origin, f_up and f_down represent the absolute values of the maximum and minimum laser radar ray pitch angles, and f represents their sum, so as to obtain a tensor of shape (w, h, 5).
5. The method as claimed in claim 2, wherein the deep learning technique segments the point cloud data with a segmentation network whose main part comprises a down-sampling coding block, an up-sampling coding block and a category calculation; the down-sampling coding block down-samples the plane image, reducing the time required for processing; the up-sampling coding block up-samples the plane image, adds the previously down-sampled feature map of matching dimensions to restore detail, and gradually recovers the input dimensions; and the category calculation processes the recovered plane image to obtain the category of each pixel of the plane, the error between the output category and the true category is computed, and the network is updated until convergence or the maximum number of training iterations is reached.
6. The method for sensing driving environment in combination with image recognition and lidar point cloud segmentation as claimed in claim 5, wherein the down-sampling encoding block is implemented by:
11) the two-dimensional plane image and the ranges of the corresponding points are concatenated into a tensor of 5 × h × w; h and w respectively represent the height and width of the plane image, the 1st dimension holds the x, y, z, r and remission information of the point corresponding to each pixel, x, y and z respectively represent the three-dimensional coordinates of the point, r represents the distance from the point to the host vehicle, and remission represents the reflection intensity of the point;
12) down-sampling the input image, wherein the sampling step length is 2 multiplied by 1, h of the down-sampled image is unchanged, and w is changed to be one half of the original value;
13) performing convolution on an input image, and performing convolution on a downsampled image by using a residual block, wherein the residual block uses two layers of CONV + BN + ReLU networks;
14) repeating the steps 12) to 13) for a preset number of times.
7. The method for sensing driving environment in combination with image recognition and lidar point cloud segmentation as claimed in claim 5, wherein the upsampling coding block is implemented by:
21) the input image is up-sampled with a sampling step size of 2 × 1, the height of the up-sampled image being unchanged and its width being twice that of the input; in addition, the data of the same dimensions saved during the down-sampling process is added back, reducing the loss of detail caused by down-sampling;
22) performing convolution on an input image, and performing convolution on an up-sampled image by using a residual block, wherein the residual block uses a two-layer CONV + BN + ReLU network;
23) repeat 21) -22) until the data is restored to the dimensions of the original image.
8. The method of claim 5, wherein the class calculation comprises the steps of:
31) performing 1 × 1 convolution on the image restored to the original dimension to generate an n × h × w image, wherein n represents the total number of categories;
32) the probability of each category is computed for the output image with the softmax function,

ŷ_c = exp(logit_c) / Σ_{c'} exp(logit_{c'}),

and the category with the highest probability is selected as the category of the pixel, where c represents a category, logit_c represents the value output by the convolution for category c at the corresponding pixel, and ŷ_c represents the probability of that category after the softmax calculation;
33) the network parameters are updated according to the loss function

L = − Σ_c w_c · y_c · log(ŷ_c),  with  w_c = 1 / log(f_c + ε),

where f_c indicates the frequency of occurrence of category c, ε is a very small number that keeps (f_c + ε) away from zero so that the logarithm is defined, y_c indicates whether the true category of the corresponding point is c, and ŷ_c represents the probability that the output category of the corresponding pixel is c.
9. The method as claimed in claim 2, wherein the three-dimensional point cloud categories are clustered as follows: each pixel in the segmented plane image is taken as the centre pixel of an S × S sliding window, the S² surrounding pixels are stored, and the absolute value of the difference between the range of each pixel and the range of the centre pixel is computed; the probability of each pixel being selected is computed from two standard normal distributions, the absolute range difference is multiplied by this probability to obtain a distance, the pixels are sorted by this distance, the categories of the first K points are counted, and the category with the largest number of occurrences is taken as the final category.
10. The method of claim 1, wherein the method requires the following hardware for operation: (1) a laser radar, for collecting point cloud information of the surrounding environment; (2) a monocular camera, for collecting picture data of the surrounding environment; (3) an industrial personal computer, for radar point cloud segmentation, image recognition and fusion.
CN202110445391.8A 2021-04-25 2021-04-25 Driving environment sensing method combining image recognition and laser radar point cloud segmentation Pending CN113269040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110445391.8A CN113269040A (en) 2021-04-25 2021-04-25 Driving environment sensing method combining image recognition and laser radar point cloud segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110445391.8A CN113269040A (en) 2021-04-25 2021-04-25 Driving environment sensing method combining image recognition and laser radar point cloud segmentation

Publications (1)

Publication Number Publication Date
CN113269040A true CN113269040A (en) 2021-08-17

Family

ID=77229378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110445391.8A Pending CN113269040A (en) 2021-04-25 2021-04-25 Driving environment sensing method combining image recognition and laser radar point cloud segmentation

Country Status (1)

Country Link
CN (1) CN113269040A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705655A (en) * 2021-08-24 2021-11-26 北京建筑大学 Full-automatic classification method for three-dimensional point cloud and deep neural network model
CN113762413A (en) * 2021-09-30 2021-12-07 智道网联科技(北京)有限公司 Point cloud data and image data fusion method and storage medium
CN113838030A (en) * 2021-09-24 2021-12-24 北京杰迈科技股份有限公司 Turnout state detection method
CN115148040A (en) * 2022-06-28 2022-10-04 东莞中科云计算研究院 Unmanned vehicle control method and system for closed road environment
CN115145272A (en) * 2022-06-21 2022-10-04 大连华锐智能化科技有限公司 Coke oven vehicle environment sensing system and method
CN115240093A (en) * 2022-09-22 2022-10-25 山东大学 Automatic power transmission channel inspection method based on visible light and laser radar point cloud fusion
WO2023045044A1 (en) * 2021-09-27 2023-03-30 北京大学深圳研究生院 Point cloud coding method and apparatus, electronic device, medium, and program product
WO2024044887A1 (en) * 2022-08-29 2024-03-07 Huawei Technologies Co., Ltd. Vision-based perception system
CN113838030B (en) * 2021-09-24 2024-05-14 北京杰迈科技股份有限公司 Switch state detection method

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596860A (en) * 2018-05-10 2018-09-28 芜湖航飞科技股份有限公司 A kind of ground point cloud dividing method based on three-dimensional laser radar
CN109670411A (en) * 2018-11-30 2019-04-23 武汉理工大学 Based on the inland navigation craft point cloud data depth image processing method and system for generating confrontation network
CN109934230A (en) * 2018-09-05 2019-06-25 浙江大学 A kind of radar points cloud dividing method of view-based access control model auxiliary
CN110738200A (en) * 2019-12-23 2020-01-31 广州赛特智能科技有限公司 Lane line 3D point cloud map construction method, electronic device and storage medium
CN110853037A (en) * 2019-09-26 2020-02-28 西安交通大学 Lightweight color point cloud segmentation method based on spherical projection
CN111026127A (en) * 2019-12-27 2020-04-17 南京大学 Automatic driving decision method and system based on partially observable transfer reinforcement learning
CN111274976A (en) * 2020-01-22 2020-06-12 清华大学 Lane detection method and system based on multi-level fusion of vision and laser radar
CN112396650A (en) * 2020-03-30 2021-02-23 青岛慧拓智能机器有限公司 Target ranging system and method based on fusion of image and laser radar
WO2021041854A1 (en) * 2019-08-30 2021-03-04 Nvidia Corporation Object detection and classification using lidar range images for autonomous machine applications
DE102019127282A1 (en) * 2019-10-10 2021-04-15 Valeo Schalter Und Sensoren Gmbh System and method for analyzing a three-dimensional environment through deep learning

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596860A (en) * 2018-05-10 2018-09-28 芜湖航飞科技股份有限公司 A kind of ground point cloud dividing method based on three-dimensional laser radar
CN109934230A (en) * 2018-09-05 2019-06-25 浙江大学 A kind of radar points cloud dividing method of view-based access control model auxiliary
CN109670411A (en) * 2018-11-30 2019-04-23 武汉理工大学 Based on the inland navigation craft point cloud data depth image processing method and system for generating confrontation network
WO2021041854A1 (en) * 2019-08-30 2021-03-04 Nvidia Corporation Object detection and classification using lidar range images for autonomous machine applications
CN110853037A (en) * 2019-09-26 2020-02-28 西安交通大学 Lightweight color point cloud segmentation method based on spherical projection
DE102019127282A1 (en) * 2019-10-10 2021-04-15 Valeo Schalter Und Sensoren Gmbh System and method for analyzing a three-dimensional environment through deep learning
CN110738200A (en) * 2019-12-23 2020-01-31 广州赛特智能科技有限公司 Lane line 3D point cloud map construction method, electronic device and storage medium
CN111026127A (en) * 2019-12-27 2020-04-17 南京大学 Automatic driving decision method and system based on partially observable transfer reinforcement learning
CN111274976A (en) * 2020-01-22 2020-06-12 清华大学 Lane detection method and system based on multi-level fusion of vision and laser radar
CN112396650A (en) * 2020-03-30 2021-02-23 青岛慧拓智能机器有限公司 Target ranging system and method based on fusion of image and laser radar

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
STEVEN K. FILIPPELLI;MICHAEL A. LEFSKY;MONIQUE E. ROCCA: "Comparison and integration of lidar and photogrammetric point clouds for mapping pre-fire forest structure", REMOTE SENSING OF ENVIRONMENT, vol. 224 *
YANG, ZHANG;ZHEN, LIU;XIANG, LI;YU, ZANG: "Data-Driven Point Cloud Objects Completion.", SENSORS (BASEL, SWITZERLAND), vol. 19, no. 7 *
孔德明; 张娜; 黄紫双; 陈晓玉; 沈阅: "Research on a method for measuring cargo volume in train carriages based on lidar detection technology", Journal of Yanshan University, no. 02
谢波; 赵亚男; 高利; 高峰: "An enhanced semantic segmentation method for small targets based on lidar point clouds", Laser Journal, no. 04
钱煜; 俞扬; 周志华: "A reward shaping method based on self-generated sample learning", Journal of Software, no. 11

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705655A (en) * 2021-08-24 2021-11-26 北京建筑大学 Full-automatic classification method for three-dimensional point cloud and deep neural network model
CN113705655B (en) * 2021-08-24 2023-07-18 北京建筑大学 Three-dimensional point cloud full-automatic classification method and deep neural network model
CN113838030A (en) * 2021-09-24 2021-12-24 北京杰迈科技股份有限公司 Turnout state detection method
CN113838030B (en) * 2021-09-24 2024-05-14 北京杰迈科技股份有限公司 Switch state detection method
WO2023045044A1 (en) * 2021-09-27 2023-03-30 北京大学深圳研究生院 Point cloud coding method and apparatus, electronic device, medium, and program product
CN113762413B (en) * 2021-09-30 2023-12-26 智道网联科技(北京)有限公司 Point cloud data and image data fusion method and storage medium
CN113762413A (en) * 2021-09-30 2021-12-07 智道网联科技(北京)有限公司 Point cloud data and image data fusion method and storage medium
CN115145272A (en) * 2022-06-21 2022-10-04 大连华锐智能化科技有限公司 Coke oven vehicle environment sensing system and method
CN115145272B (en) * 2022-06-21 2024-03-29 大连华锐智能化科技有限公司 Coke oven vehicle environment sensing system and method
CN115148040A (en) * 2022-06-28 2022-10-04 东莞中科云计算研究院 Unmanned vehicle control method and system for closed road environment
WO2024044887A1 (en) * 2022-08-29 2024-03-07 Huawei Technologies Co., Ltd. Vision-based perception system
CN115240093B (en) * 2022-09-22 2022-12-23 山东大学 Automatic power transmission channel inspection method based on visible light and laser radar point cloud fusion
CN115240093A (en) * 2022-09-22 2022-10-25 山东大学 Automatic power transmission channel inspection method based on visible light and laser radar point cloud fusion

Similar Documents

Publication Publication Date Title
CN113269040A (en) Driving environment sensing method combining image recognition and laser radar point cloud segmentation
CN111798475B (en) Indoor environment 3D semantic map construction method based on point cloud deep learning
CN110675418B (en) Target track optimization method based on DS evidence theory
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN110222626B (en) Unmanned scene point cloud target labeling method based on deep learning algorithm
CN110570429B (en) Lightweight real-time semantic segmentation method based on three-dimensional point cloud
CN111626217A (en) Target detection and tracking method based on two-dimensional picture and three-dimensional point cloud fusion
CN111598098B (en) Water gauge water line detection and effectiveness identification method based on full convolution neural network
CN112949633B (en) Improved YOLOv 3-based infrared target detection method
CN111046781B (en) Robust three-dimensional target detection method based on ternary attention mechanism
CN114359181B (en) Intelligent traffic target fusion detection method and system based on image and point cloud
CN109919026B (en) Surface unmanned ship local path planning method
CN112329623A (en) Early warning method for visibility detection and visibility safety grade division in foggy days
CN111967373B (en) Self-adaptive enhanced fusion real-time instance segmentation method based on camera and laser radar
CN113095152B (en) Regression-based lane line detection method and system
CN111339830A (en) Target classification method based on multi-modal data features
CN109492700A (en) A kind of Target under Complicated Background recognition methods based on multidimensional information fusion
CN112861700A (en) DeepLabv3+ based lane line network identification model establishment and vehicle speed detection method
CN116385958A (en) Edge intelligent detection method for power grid inspection and monitoring
CN115861619A (en) Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network
CN115359474A (en) Lightweight three-dimensional target detection method, device and medium suitable for mobile terminal
CN112613392A (en) Lane line detection method, device and system based on semantic segmentation and storage medium
CN112288667A (en) Three-dimensional target detection method based on fusion of laser radar and camera
CN115100741A (en) Point cloud pedestrian distance risk detection method, system, equipment and medium
CN114266947A (en) Classification method and device based on fusion of laser point cloud and visible light image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination