
Laser radar-based road detection method, device, medium and computer equipment

Info

Publication number
CN115223129A
Authority
CN
China
Prior art keywords
road detection
input image
grid
points
point cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211056479.1A
Other languages
Chinese (zh)
Inventor
谢修祥
钱炜
吕悦川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhilianan Technology Co., Ltd.
Original Assignee
Beijing Zhilianan Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhilianan Technology Co., Ltd.
Priority to CN202211056479.1A
Publication of CN115223129A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88 Lidar systems specially adapted for specific applications
    • G01S 17/89 Lidar systems specially adapted for specific applications for mapping or imaging
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 7/00 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S 7/48 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Evolutionary Computation (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electromagnetism (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a laser radar-based road detection method, apparatus, medium, and computer device, relating to the technical field of road detection and intended to solve the problems of complex road detection pipelines and slow processing. The laser radar-based road detection method comprises the following steps: acquiring a point cloud captured by a laser radar for a target scene; generating a two-dimensional grid plane based on the point cloud; projecting the point cloud onto the grid plane to obtain a projection result, the projection result comprising, for each grid in the grid plane, the points contained in that grid and statistical information of those points; generating an input image based on the projection result, where pixel points in the input image correspond one-to-one to grids in the grid plane and each pixel value is determined from the statistical information of the points in the corresponding grid; and inputting the input image into a pre-trained semantic segmentation model to predict the road area in the target scene. Projecting onto the plane reduces the dimensionality of the classification problem, simplifying the road detection process and improving road detection efficiency.

Description

Laser radar-based road detection method, device, medium and computer equipment
Technical Field
The present application relates to the field of road detection technologies, and in particular, to a method, an apparatus, a medium, and a computer device for road detection based on a laser radar.
Background
Road detection technology is widely used in fields such as autonomous driving and intelligent vehicle control. Typically, a monocular camera captures road images and road detection is performed by analyzing those images; the detection process is complex and the processing speed is slow.
Disclosure of Invention
In order to solve this technical problem, the present application provides a laser radar-based road detection method, apparatus, medium, and computer device.
In a first aspect of the present application, a lidar-based road detection method is provided, comprising:
acquiring a point cloud of a laser radar for a target scene;
generating a two-dimensional grid plane based on the point cloud of the target scene;
projecting the point cloud onto the grid plane to obtain a projection result, the projection result comprising, for each grid in the grid plane, the points contained in that grid and statistical information of those points;
generating an input image based on the projection result, where pixel points in the input image correspond one-to-one to grids in the grid plane and the pixel value of each pixel point is determined based on the statistical information of the points contained in the corresponding grid;
and inputting the input image into a pre-trained semantic segmentation model to predict the road area in the target scene.
In some embodiments of the present application, the two-dimensional grid plane is a top-view (bird's-eye-view) plane.
In some embodiments of the present application, the statistical information corresponding to a grid includes the number of points contained in the grid, the average reflectivity, a mean and a standard deviation (of neighbor distances, as described below), and the minimum and maximum elevation angles.
In some embodiments of the present application, each grid corresponds to a plurality of different statistics, the input image comprises a plurality of single-channel images, and each single-channel image corresponds to one kind of statistical information.
In some embodiments of the present application, the semantic segmentation model is a full convolution neural network model.
In some embodiments of the present application, inputting the input image into a pre-trained semantic segmentation model, and predicting a road region in the target scene, includes:
inputting the plurality of single-channel images into a multi-channel input layer of the full convolution neural network model, wherein each channel corresponds to one single-channel image;
a hidden layer of the full convolution neural network model performs downsampling on the plurality of input single-channel images to obtain a downsampled feature map;
the hidden layer performs multiple dilated convolutions with different dilation rates on the downsampled feature map to obtain a dilated feature map;
the hidden layer performs upsampling on the dilated feature map to obtain an upsampled feature map, wherein the size of the upsampled feature map is the same as that of the single-channel image;
and an output layer of the full convolution neural network model performs feature mapping on the upsampled feature map and outputs a confidence map, the confidence map representing the probability that each pixel point belongs to the road area.
In some embodiments of the present application, the hidden layer performing multiple dilated convolutions with different dilation rates on the downsampled feature map to obtain a dilated feature map includes:
the hidden layer performing N dilated convolutions on the downsampled feature map, wherein N is a positive integer greater than 2;
wherein the dilation rate of the first dilated convolution is (1, 1) and the dilation rate of the nth dilated convolution is (2^(n-1), 2^n), where 1 < n ≤ N and n is a positive integer.
In a second aspect of the present application, a lidar-based road detection apparatus is provided, which includes:
the acquisition module is configured to acquire a point cloud of the laser radar for a target scene;
a mesh plane generation module configured to generate a two-dimensional mesh plane based on the point cloud of the target scene;
the projection module is configured to project the point cloud to the grid plane to obtain a projection result, and the projection result comprises points contained in each grid in the grid plane and statistical information of the points;
an input image generation module configured to generate an input image based on the projection result, where pixel points in the input image correspond to grids in the grid plane one to one, and pixel values of the pixel points are determined based on statistical information of points included in the corresponding grids;
and the processing module is configured to input the input image into a pre-trained semantic segmentation model and predict a road area in the target scene.
In a third aspect of the present application, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed, carries out the steps of the lidar based road detection method as described above.
In a fourth aspect of the present application, a computer device is provided, comprising a processor, a memory and a computer program stored on the memory, the processor implementing the steps of the lidar based road detection method as described above when executing the computer program.
In the present application, the point cloud of the target scene is projected onto a two-dimensional grid plane, an input image is generated from the projected points and their statistics, and the input image is fed into a semantic segmentation model to predict the road area in the target scene. Projecting the point cloud onto the two-dimensional grid plane reduces the dimensionality of the classification problem, so the road area can be predicted quickly by a simple semantic segmentation model, which simplifies the road detection process and improves road detection efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a flow chart of a lidar-based road detection method according to an exemplary embodiment of the present application;
FIG. 2 is a schematic diagram of a projection of a point cloud onto a mesh plane as shown in an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a method of lidar-based road detection in accordance with another exemplary embodiment of the present application;
FIG. 4 is a schematic structural diagram of an encoder in a lidar-based road detection method according to an exemplary embodiment of the present application;
FIG. 5 is a schematic structural diagram illustrating a decoder in a lidar-based road detection method according to an exemplary embodiment of the present application;
FIG. 6 is a schematic structural diagram of the context module in a laser radar-based road detection method according to an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram illustrating a configuration of a lidar-based road detection apparatus according to an exemplary embodiment of the present application;
fig. 8 is a schematic structural diagram of a lidar-based road detection device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
Road detection is usually performed by combining images captured by a monocular camera with a deep neural network. On the one hand, this approach involves a large amount of data processing, a complex pipeline, and low detection efficiency; on the other hand, camera-based image acquisition is strongly affected by ambient light, so results are often poor at night or under lighting conditions different from those present during training.
Laser radar (LIDAR) senses the environment with laser light that it emits itself, so it is insensitive to ambient light and highly accurate. In other related art, road detection is implemented by fusing a camera with a LIDAR; this suffers from the same problems of complex processing and low detection efficiency, and also has certain limitations.
To solve the above technical problems, the present application provides a laser radar-based road detection method: the point cloud of a target scene is projected onto a two-dimensional grid plane, an input image is generated from the points projected onto the grid plane and their statistics, and the input image is fed into a semantic segmentation model to predict the road area in the target scene.
An exemplary embodiment of the present application provides a laser radar-based road detection method. As shown in fig. 1, the laser radar-based road detection method includes:
s100, point cloud of the laser radar for a target scene is obtained.
The target scene is, for example, the scene through which a vehicle is traveling, and the point cloud may be acquired by an acquisition device provided on the vehicle. As will be appreciated, a point cloud is a data set comprising the geometric coordinates, reflectivity, and other data of a plurality of points in the target scene. Illustratively, the point cloud is collected by a laser radar on the vehicle; since the laser radar is insensitive to ambient light, the accuracy of the road detection result can be guaranteed, avoiding the limitations of camera-based road detection.
S200, generating a two-dimensional grid plane based on the point cloud of the target scene.
As shown in fig. 2, a two-dimensional grid plane 100 is generated that covers the point cloud 200 of the target scene when viewed along the direction perpendicular to the grid plane 100. The grid plane 100 is divided into a plurality of grids, and the grid size may be set according to the desired resolution, which is not limited in this application.
S300, projecting the point cloud onto the grid plane to obtain a projection result, wherein the projection result comprises, for each grid in the grid plane, the points contained in that grid and statistical information of those points.
As shown in fig. 2, each point of the point cloud 200 is projected onto the grid plane 100. For each grid in the grid plane 100, the points projected into that grid are determined, and statistics are computed over those points, such as their count and aggregates of their data attributes. Illustratively, the statistics for a grid include the number of points it contains, the average reflectivity, the mean, the standard deviation, the minimum elevation angle, and the maximum elevation angle. The projection of the point cloud 200 may be performed by coordinate transformation.
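As a concrete illustration of this step, the following is a minimal NumPy sketch of projecting points onto the grid and computing two of the per-grid statistics (point count and average reflectivity). The grid extent, cell size, and the (x, y, z, reflectivity) point layout are assumptions for illustration, not values fixed by this application.

```python
import numpy as np

def grid_statistics(points, x_range=(0.0, 40.0), y_range=(-10.0, 10.0), cell=0.1):
    """Project an (N, 4) array of (x, y, z, reflectivity) lidar points onto a
    2-D grid and compute per-grid statistics (count and average reflectivity)."""
    w = int((x_range[1] - x_range[0]) / cell)   # number of cells along x
    h = int((y_range[1] - y_range[0]) / cell)   # number of cells along y
    xs, ys, refl = points[:, 0], points[:, 1], points[:, 3]

    # Keep only points that fall inside the grid extent.
    keep = (xs >= x_range[0]) & (xs < x_range[1]) & (ys >= y_range[0]) & (ys < y_range[1])
    xs, ys, refl = xs[keep], ys[keep], refl[keep]

    # Coordinate transformation: world (x, y) -> flat cell index.
    ix = ((xs - x_range[0]) / cell).astype(int)
    iy = ((ys - y_range[0]) / cell).astype(int)
    flat = ix * h + iy

    count = np.bincount(flat, minlength=w * h).reshape(w, h)
    refl_sum = np.bincount(flat, weights=refl, minlength=w * h).reshape(w, h)
    avg_refl = np.divide(refl_sum, count, out=np.zeros_like(refl_sum), where=count > 0)
    return count, avg_refl
```

The minimum and maximum elevation angles, and the neighbor-distance statistics described below, can be accumulated per grid in the same fashion, e.g. with np.minimum.at and np.maximum.at.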
S400, generating an input image based on the projection result, wherein pixel points in the input image correspond one-to-one to grids in the grid plane, and the pixel value of each pixel point is determined based on the statistical information of the points contained in the corresponding grid.
In this step, an input image is generated with each mesh in the mesh plane 100 as one pixel, so as to perform the subsequent road region prediction.
Considering that road planning and vehicle control are performed in a two-dimensional space, projecting the point cloud 200 onto the two-dimensional grid plane 100 is more advantageous than a perspective representation, and it reduces the dimensionality of the classification problem.
S500, inputting the input image into a pre-trained semantic segmentation model to predict the road area in the target scene.
As described above, the point cloud 200 is projected onto the two-dimensional grid plane 100 and an input image is generated from the projection result, so the input image can be processed quickly by a simple semantic segmentation model. This realizes road area prediction while simplifying the road detection process and improving road detection efficiency.
In an exemplary embodiment of the present application, the two-dimensional grid plane 100 is a top-view (bird's-eye-view) plane, which better reflects the environment around the vehicle and simplifies the projection of the point cloud 200. Illustratively, as shown in fig. 2, a point of the point cloud 200 has coordinates (x, y, z), the two-dimensional grid plane 100 is the x-y plane, and (x, y) are the coordinates of that point projected onto the two-dimensional grid plane 100.
When determining the statistical information of the points contained in each grid, a single kind of statistic may be used to generate one input image, or several kinds of statistics may be used, with one single-channel image generated per kind of statistic. Here, the mean and the standard deviation refer to the mean and standard deviation of the distances from a point to its k nearest neighboring points, where k may be set according to specific requirements, for example 10, 20, or 50.
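A sketch of how that neighbor-distance statistic can be computed, assuming SciPy is available; the value of k is illustrative, per the examples above:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_distance_stats(points_xyz, k=10):
    """For each point, the mean and standard deviation of the distances to its
    k nearest neighbors (excluding the point itself)."""
    tree = cKDTree(points_xyz)
    # Query k + 1 neighbors because each point's nearest neighbor is itself.
    dists, _ = tree.query(points_xyz, k=k + 1)
    neighbor_dists = dists[:, 1:]
    return neighbor_dists.mean(axis=1), neighbor_dists.std(axis=1)
```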
In this embodiment, a plurality of single-channel images generated from multiple kinds of statistical information are used for road detection, which further improves the detection accuracy of the road area.
In an exemplary embodiment of the present application, the semantic segmentation model is a full convolution neural network (FCN) model. An FCN supports fast processing of the input data, improving detection efficiency while preserving the accuracy of the road detection result.
In an embodiment, as shown in fig. 3, step S500 specifically includes:
and S510, inputting a plurality of single-channel images into a multi-channel input layer of the full convolution neural network model, wherein each channel corresponds to a single-channel image.
As described above, to improve road detection accuracy, a plurality of single-channel images are generally generated from multiple kinds of statistical information; in this step, these single-channel images are input into the multi-channel input layer of the full convolution neural network model, one single-channel image per channel. Illustratively, a first single-channel image is generated from the number of points, a second from the average reflectivity, a third from the mean, a fourth from the standard deviation, a fifth from the minimum elevation angle, and a sixth from the maximum elevation angle. The input layer is correspondingly configured with six channels, and the first through sixth single-channel images are input into the first through sixth channels, respectively, as sketched below.
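A minimal sketch of assembling the six-channel input; the per-statistic map names are hypothetical stand-ins for (W, H) arrays computed during the projection step above:

```python
import numpy as np

# Hypothetical stand-ins: six (W, H) statistic maps, one per channel, computed
# during the projection step (random placeholders here, for illustration only).
w, h = 200, 400
count_map, avg_refl_map, mean_dist_map, std_dist_map, min_elev_map, max_elev_map = (
    np.random.rand(w, h).astype(np.float32) for _ in range(6)
)

# One single-channel image per statistic, stacked as a (6, W, H) input image.
input_image = np.stack(
    [count_map, avg_refl_map, mean_dist_map, std_dist_map, min_elev_map, max_elev_map],
    axis=0,
)
```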
S520, the hidden layer of the full convolution neural network model performs downsampling on the input single-channel images to obtain a downsampled feature map.
In this step, downsampling the plurality of input single-channel images effectively reduces the storage requirement of the full convolution neural network model. Illustratively, an encoder is used to downsample each single-channel image. As shown in fig. 4, the encoder comprises a first, a second, and a third encoding unit: the first and second encoding units each output 32 feature maps of pixel size W × H = 200 × 400, and the third encoding unit outputs 32 feature maps of pixel size W × H = 100 × 200, so the encoder yields 32 feature maps of size 100 × 200, as sketched below.
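A PyTorch sketch consistent with those shapes; only the unit count and output sizes come from the description, while the kernel sizes, the ELU activations, and the use of max pooling in the third unit are assumptions:

```python
import torch.nn as nn

class Encoder(nn.Module):
    """Three encoding units: 32 maps at full resolution twice, then 32 maps
    at half resolution (e.g. a 200 x 400 input becomes 100 x 200)."""
    def __init__(self, in_channels=6):
        super().__init__()
        self.unit1 = nn.Sequential(nn.Conv2d(in_channels, 32, 3, padding=1), nn.ELU())
        self.unit2 = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ELU())
        # Max pooling halves both spatial dimensions.
        self.unit3 = nn.Sequential(nn.MaxPool2d(2),
                                   nn.Conv2d(32, 32, 3, padding=1), nn.ELU())

    def forward(self, x):  # x: (B, 6, H, W) multi-channel input image
        return self.unit3(self.unit2(self.unit1(x)))  # (B, 32, H/2, W/2)
```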
S530, the hidden layer performs multiple dilated convolutions with different dilation rates on the downsampled feature map to obtain a dilated feature map.
Illustratively, the hidden layer includes a context module that uses dilated convolutions to aggregate multi-scale context information. The context module applies multiple dilated convolutions with different dilation rates to the downsampled feature map, so the receptive field grows substantially while no resolution is lost, and the receptive field can be expanded efficiently while keeping the parameter count and number of layers small, effectively reducing the storage requirement of the full convolution neural network model. This matters particularly in the present application, where the feature maps have high resolution: the context module reduces the storage requirement without sacrificing resolution, ensuring both the efficiency and the accuracy of road prediction.
And S540, the hidden layer performs upsampling on the dilated feature map to obtain an upsampled feature map, the size of which is the same as that of a single-channel image.
The dilated feature map obtained in step S530 is upsampled to restore the feature map to the input size. Illustratively, the hidden layer includes a decoder comprising, for example, a max-pooling layer and two convolutional layers. In one embodiment, as shown in fig. 5, the decoder comprises a first, a second, a third, and a fourth decoding unit: the first and second decoding units each output 32 feature maps of pixel size W × H = 200 × 400, and the third and fourth decoding units each output 2 feature maps of pixel size W × H = 200 × 400, so the decoder yields 2 feature maps of size 200 × 400.
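A matching PyTorch sketch of the decoder; the unit count and output shapes follow the description, while nearest-neighbor upsampling and the 3 × 3 kernels are assumptions:

```python
import torch.nn as nn

class Decoder(nn.Module):
    """Four decoding units restoring the input resolution: 32, 32, 2, and 2
    feature maps, all at full size (e.g. 100 x 200 back up to 200 x 400)."""
    def __init__(self):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.unit1 = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ELU())
        self.unit2 = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ELU())
        self.unit3 = nn.Sequential(nn.Conv2d(32, 2, 3, padding=1), nn.ELU())
        self.unit4 = nn.Conv2d(2, 2, 3, padding=1)

    def forward(self, x):  # x: (B, 32, H/2, W/2) from the hidden layers
        x = self.up(x)     # back to (B, 32, H, W)
        return self.unit4(self.unit3(self.unit2(self.unit1(x))))  # (B, 2, H, W)
```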
And S550, an output layer of the full convolution neural network model performs feature mapping on the upsampled feature map and outputs a confidence map, the confidence map representing the probability that each pixel point belongs to the road area.
Through this feature mapping, a confidence map giving the probability that each pixel point belongs to the road area is obtained, from which the road area in the target scene is predicted; for example, the region formed by the pixel points whose probability of belonging to the road area exceeds 70% is determined to be the road area.
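A sketch of that final step, assuming the two output feature maps are class scores (non-road, road) so that a softmax over channels yields the confidence map:

```python
import torch

def predict_road_mask(logits: torch.Tensor, threshold: float = 0.70) -> torch.Tensor:
    """logits: (B, 2, H, W) output of the output layer, channels assumed to be
    (non-road, road). Returns a boolean road mask at the 70% example threshold."""
    probs = torch.softmax(logits, dim=1)   # per-pixel class probabilities
    road_confidence = probs[:, 1]          # confidence map: P(pixel is road)
    return road_confidence > threshold
```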
In an exemplary embodiment, step S530 specifically includes:
the hidden layer performs N times of expansion convolution processing on the downsampled feature map, wherein N is a positive integer greater than 2, the expansion rate of the first time of expansion convolution processing is (1,1), and the expansion rate of the nth time of expansion convolution processing is (2) n-1 ,2 n ) N is more than 1 and less than or equal to N, and N is a positive integer.
In this embodiment, the dilation rates grow exponentially, so the receptive field also grows exponentially without any loss of coverage, and the feature maps are zero-padded so that no resolution is lost. The width and height of the receptive field grow at different rates to match the aspect ratio of the input image.
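As a check on the receptive-field sizes quoted below: for stacked 3 × 3 convolutions, the receptive field r grows by twice the dilation rate d at each layer, applied separately to height and width:

```latex
r_n = r_{n-1} + 2 d_n, \qquad r_1 = 3
```

With the second components of the dilation rates below (1, 2, 4, 8, 16, 32, 64), this gives 3, 7, 15, 31, 63, 127, 255 along that axis; the first components (1, 1, 2, 4, 8, 16, 32) give 3, 5, 9, 17, 33, 65, 129 along the other, matching the table that follows.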
Illustratively, the structure of the context module in the hidden layer is shown in fig. 6 and summarized in the table below. The context module comprises dilated convolutional layers 1 through 8, each applying one dilated convolution to the downsampled feature map. Layers 1 through 7 all use 3 × 3 filters and output 128 feature maps of pixel size W × H = 100 × 200; layer 8 uses a 1 × 1 filter and outputs 32 feature maps of pixel size 100 × 200. Every layer uses the ELU (Exponential Linear Unit) activation function. The dilation rates and receptive fields are: layer 1, dilation (1, 1), receptive field 3 × 3; layer 2, dilation (1, 2), receptive field 5 × 7; layer 3, dilation (2, 4), receptive field 9 × 15; layer 4, dilation (4, 8), receptive field 17 × 31; layer 5, dilation (8, 16), receptive field 33 × 63; layer 6, dilation (16, 32), receptive field 65 × 127; and layer 7, dilation (32, 64), receptive field 129 × 255.
Layer | Filter | Dilation rate | Receptive field | Output size (W × H) | Feature maps | Activation
1 | 3 × 3 | (1, 1) | 3 × 3 | 100 × 200 | 128 | ELU
2 | 3 × 3 | (1, 2) | 5 × 7 | 100 × 200 | 128 | ELU
3 | 3 × 3 | (2, 4) | 9 × 15 | 100 × 200 | 128 | ELU
4 | 3 × 3 | (4, 8) | 17 × 31 | 100 × 200 | 128 | ELU
5 | 3 × 3 | (8, 16) | 33 × 63 | 100 × 200 | 128 | ELU
6 | 3 × 3 | (16, 32) | 65 × 127 | 100 × 200 | 128 | ELU
7 | 3 × 3 | (32, 64) | 129 × 255 | 100 × 200 | 128 | ELU
8 | 1 × 1 | - | - | 100 × 200 | 32 | ELU
As can be seen from the above table in conjunction with fig. 6, after the stacked dilated convolutional layers of the context module, the receptive field of the final dilated layer (129 × 255) exceeds the 100 × 200 pixel size of the input feature map, so the full convolution neural network model can draw on a very large context window when deciding whether each pixel point belongs to the road area.
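A PyTorch sketch of the context module matching the table; the zero-padding values, which the text implies but does not state, are set equal to the dilation rates so that each 3 × 3 dilated convolution preserves the feature-map size:

```python
import torch.nn as nn

class ContextModule(nn.Module):
    """Eight dilated convolutional layers per the table above: seven 3 x 3
    layers with 128 maps and exponentially growing dilation rates, then a
    1 x 1 layer reducing to 32 maps; every layer uses ELU."""
    def __init__(self, in_channels=32):
        super().__init__()
        # (height, width) dilation rates of layers 1-7, as in the table.
        rates = [(1, 1), (1, 2), (2, 4), (4, 8), (8, 16), (16, 32), (32, 64)]
        layers, ch = [], in_channels
        for dh, dw in rates:
            # padding == dilation keeps a 3 x 3 dilated conv size-preserving.
            layers += [nn.Conv2d(ch, 128, 3, padding=(dh, dw), dilation=(dh, dw)),
                       nn.ELU()]
            ch = 128
        layers += [nn.Conv2d(128, 32, 1), nn.ELU()]  # layer 8: 1 x 1, 32 maps
        self.net = nn.Sequential(*layers)

    def forward(self, x):   # x: (B, 32, H/2, W/2) from the encoder
        return self.net(x)  # (B, 32, H/2, W/2), resolution preserved
```

Chained with the encoder and decoder sketched earlier, the pieces compose into one model, e.g. model = nn.Sequential(Encoder(), ContextModule(), Decoder()), taking a (B, 6, H, W) input image to a (B, 2, H, W) score map.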
In the present application, the point cloud 200 of the target scene is projected onto the two-dimensional grid plane 100, an input image is generated from the points projected onto the grid plane 100 and their statistics, and the input image is fed into the semantic segmentation model to predict the road area in the target scene. Projecting the point cloud 200 onto the two-dimensional grid plane 100 reduces the dimensionality of the classification problem, so the road area can be predicted quickly by a simple semantic segmentation model, which simplifies the road detection process and improves road detection efficiency.
The present application further provides a lidar-based road detection apparatus, as shown in fig. 7, the lidar-based road detection apparatus includes an acquisition module 301, a mesh plane generation module 302, a projection module 303, an input image generation module 304, and a processing module 305. The acquisition module 301 is configured to acquire a point cloud of the laser radar for a target scene, and the grid plane generation module 302 is configured to generate a two-dimensional grid plane based on the point cloud of the target scene; the projection module 303 is configured to project the point cloud to a grid plane, so as to obtain a projection result, where the projection result includes points included in each grid in the grid plane and statistical information thereof; the input image generation module 304 is configured to generate an input image based on the projection result, pixel points in the input image correspond to grids in the grid plane one to one, and pixel values of the pixel points are determined based on statistical information of points included in the corresponding grids; the processing module 305 is configured to input the input image into a pre-trained semantic segmentation model to predict road regions in the target scene.
Fig. 8 is a block diagram illustrating a lidar-based road detection device, namely a computer device 400, according to an exemplary embodiment. Referring to fig. 8, the computer device 400 includes a processor 401; the number of processors may be set to one or more as necessary. The computer device 400 further includes a memory 402 for storing instructions, e.g., application programs, executable by the processor 401; the number of memories may likewise be set to one or more as necessary, and the memory may store one or more application programs. The processor 401 is configured to execute the instructions to perform the above-described method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data, including, but not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer. In addition, communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media as known to those skilled in the art.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 402 comprising instructions, executable by the processor 401 of the computer device 400 to perform the above-described method is provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A computer-readable storage medium in which instructions, when executed by a processor 401 of a lidar-based road detection device, cause the lidar-based road detection device to perform:
acquiring a point cloud of a laser radar for a target scene;
generating a two-dimensional grid plane based on the point cloud of the target scene;
projecting the point cloud to a grid plane to obtain a projection result, wherein the projection result comprises points contained in each grid in the grid plane and statistical information of the points;
generating an input image based on the projection result, wherein pixel points in the input image correspond one-to-one to grids in the grid plane, and the pixel values of the pixel points are determined based on the statistical information of the points contained in the corresponding grids;
and inputting the input image into a pre-trained semantic segmentation model to predict the road area in the target scene.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In this application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that an article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of another like element in the article or device comprising that element.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, it is intended that the present application also include such modifications and variations as come within the scope of the appended claims and their equivalents.

Claims (10)

1. A road detection method based on laser radar is characterized by comprising the following steps:
acquiring a point cloud of a laser radar for a target scene;
generating a two-dimensional grid plane based on the point cloud of the target scene;
projecting the point cloud onto the grid plane to obtain a projection result, wherein the projection result comprises, for each grid in the grid plane, the points contained in that grid and statistical information of those points;
generating an input image based on the projection result, wherein pixel points in the input image correspond one-to-one to grids in the grid plane, and the pixel values of the pixel points are determined based on the statistical information of the points contained in the corresponding grids;
and inputting the input image into a pre-trained semantic segmentation model, and predicting a road area in the target scene.
2. The lidar-based road detection method of claim 1, wherein the two-dimensional grid plane is a top-view perspective plane.
3. The lidar-based road detection method of claim 1, wherein the statistical information corresponding to the grid comprises a number of points included in the grid, an average reflectivity, a mean, a standard deviation, a minimum elevation angle, and a maximum elevation angle.
4. The lidar-based road detection method according to any one of claims 1 to 3, wherein each grid corresponds to a plurality of different statistics, the input image comprises a plurality of single-channel images, and each single-channel image corresponds to one kind of statistical information.
5. The lidar-based road detection method of claim 4, wherein the semantic segmentation model is a full convolution neural network model.
6. The lidar-based road detection method according to claim 5, wherein the inputting the input image into a pre-trained semantic segmentation model to predict the road region in the target scene comprises:
inputting the plurality of single-channel images into a multi-channel input layer of the full convolution neural network model, wherein each channel corresponds to one single-channel image;
a hidden layer of the full convolution neural network model performs downsampling on the plurality of input single-channel images to obtain a downsampled feature map;
the hidden layer performs multiple dilated convolutions with different dilation rates on the downsampled feature map to obtain a dilated feature map;
the hidden layer performs upsampling on the dilated feature map to obtain an upsampled feature map, wherein the size of the upsampled feature map is the same as that of the single-channel image;
and an output layer of the full convolution neural network model performs feature mapping on the upsampled feature map and outputs a confidence map, wherein the confidence map represents the probability that each pixel point belongs to the road area.
7. The lidar-based road detection method according to claim 6, wherein the hidden layer performing multiple dilated convolutions with different dilation rates on the downsampled feature map to obtain a dilated feature map comprises:
the hidden layer performing N dilated convolutions on the downsampled feature map, wherein N is a positive integer greater than 2;
wherein the dilation rate of the first dilated convolution is (1, 1), and the dilation rate of the nth dilated convolution is (2^(n-1), 2^n), where 1 < n ≤ N and n is a positive integer.
8. A lidar-based road detection apparatus, comprising:
the acquisition module is configured to acquire a point cloud of the laser radar for a target scene;
a mesh plane generation module configured to generate a two-dimensional mesh plane based on the point cloud of the target scene;
the projection module is configured to project the point cloud to the grid plane to obtain a projection result, and the projection result comprises points contained in each grid in the grid plane and statistical information of the points;
an input image generation module configured to generate an input image based on the projection result, where pixel points in the input image correspond to grids in the grid plane one to one, and pixel values of the pixel points are determined based on statistical information of points included in the corresponding grids;
and the processing module is configured to input the input image into a pre-trained semantic segmentation model and predict a road area in the target scene.
9. A computer-readable storage medium, having stored thereon a computer program, characterized in that the computer program, when being executed, is adapted to carry out the steps of the lidar based road detection method according to any of claims 1 to 7.
10. A computer device comprising a processor, a memory, and a computer program stored on the memory, wherein the processor, when executing the computer program, implements the steps of the lidar-based road detection method according to any of claims 1-7.
Application CN202211056479.1A, priority date 2022-08-30, filing date 2022-08-30: Laser radar-based road detection method, device, medium and computer equipment. Status: Pending. Published as CN115223129A.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211056479.1A | 2022-08-30 | 2022-08-30 | Laser radar-based road detection method, device, medium and computer equipment (published as CN115223129A)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211056479.1A | 2022-08-30 | 2022-08-30 | Laser radar-based road detection method, device, medium and computer equipment (published as CN115223129A)

Publications (1)

Publication Number | Publication Date
CN115223129A | 2022-10-21

Family

ID=83617140

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211056479.1A (published as CN115223129A, pending) | Laser radar-based road detection method, device, medium and computer equipment | 2022-08-30 | 2022-08-30

Country Status (1)

Country | Link
CN | CN115223129A


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination