CN111353969B - Method and device for determining road drivable area and computer equipment - Google Patents

Method and device for determining road drivable area and computer equipment

Info

Publication number
CN111353969B
Authority
CN
China
Prior art keywords
grid
point cloud
cloud data
determining
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811562743.2A
Other languages
Chinese (zh)
Other versions
CN111353969A (en)
Inventor
曾钰廷
徐琥
文驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Intelligent Driving Research Institute Co Ltd
Original Assignee
Changsha Intelligent Driving Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha Intelligent Driving Research Institute Co Ltd
Priority to CN201811562743.2A
Publication of CN111353969A
Application granted
Publication of CN111353969B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/10 Segmentation; Edge detection
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G06T2207/10044 Radar image
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle
    • G06T2207/30256 Lane; Road marking

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method for determining a road drivable area, which comprises the following steps: acquiring point cloud data to be processed; performing polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map; converting the data arrangement form based on the polar coordinate grid map to obtain a target grid map, wherein the target grid map comprises second grids respectively corresponding to the first grids in the polar coordinate grid map, each second grid comprises the spatial points in the corresponding first grid, and the second grids are arranged in the arrangement form of image pixels; determining a target statistical parameter of each second grid based on the spatial points in that grid, to obtain a statistical feature map corresponding to the target statistical parameter; performing semantic segmentation based on the statistical feature map through a predetermined convolutional neural network to obtain the confidence that each second grid belongs to the ground category and to the non-ground category; and determining the road drivable area in the point cloud data to be processed based on these confidences. The scheme provided by the application can improve the accuracy of determining the road drivable area.

Description

Method and device for determining road drivable area and computer equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for determining a road drivable area, a computer readable storage medium, and a computer device.
Background
Detecting the road drivable area is an important research topic in the field of environment perception. Generally, an environment sensing device may be used to perceive the surrounding road environment and obtain corresponding point cloud data, from which the road drivable area is then determined.
In the conventional approach, the point cloud data is divided into a plurality of segments based on height differences, and straight-line fitting is then performed to determine the road drivable area in the point cloud data. However, the accuracy of the road drivable area determined in this manner is limited, and the approach is applicable only to simple road environments with clearly defined road edges.
Disclosure of Invention
Based on this, it is necessary to provide a method, an apparatus, a computer readable storage medium and a computer device for determining a road drivable area, so as to address the technical problems that the conventional approach has limited accuracy and applies only to a restricted range of scenarios.
A method of determining a road drivable area, comprising:
acquiring point cloud data to be processed;
Performing polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate raster image;
performing data arrangement form conversion based on the polar coordinate raster pattern to obtain a target raster pattern; the target grid graph comprises second grids corresponding to the first grids in the polar coordinate grid graph respectively, each second grid comprises space points in the corresponding first grid, and each second grid is arranged in an image pixel arrangement mode;
determining target statistical parameters of the second grids based on the space points in the second grids respectively, and obtaining a statistical feature map corresponding to the target statistical parameters;
semantic segmentation is carried out on the basis of the statistical feature map through a preset convolutional neural network, so that confidence degrees of the second grids respectively belonging to the ground category and the non-ground category are obtained;
and determining a road drivable area in the point cloud data to be processed based on the confidence that each second grid respectively belongs to the ground category and the non-ground category.
A road drivable area determining apparatus comprising:
the point cloud acquisition module is used for acquiring point cloud data to be processed;
the polar coordinate grid diagram acquisition module is used for carrying out polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid diagram;
The arrangement form conversion module is used for carrying out data arrangement form conversion based on the polar coordinate grid graph to obtain a target grid graph; the target grid graph comprises second grids corresponding to the first grids in the polar coordinate grid graph respectively, each second grid comprises space points in the corresponding first grid, and each second grid is arranged in an image pixel arrangement mode;
the statistical feature map acquisition module is used for determining target statistical parameters of the second grids based on the space points in the second grids respectively and obtaining a statistical feature map corresponding to the target statistical parameters;
the confidence degree determining module is used for carrying out semantic segmentation on the basis of the statistical feature map through a preset convolutional neural network to obtain confidence degrees of the second grids respectively belonging to the ground category and the non-ground category;
and the drivable area determining module is used for determining the road drivable area in the point cloud data to be processed based on the confidence that each second grid respectively belongs to the ground category and the non-ground category.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of determining a road drivable area as described above.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method of determining a road drivable area as described above.
With the method, the apparatus, the computer readable storage medium and the computer device for determining the road drivable area, polar coordinate rasterization is performed on the point cloud data to be processed to obtain a polar coordinate grid map; data arrangement form conversion is performed based on the polar coordinate grid map to obtain a target grid map; the target statistical parameters of the second grids are determined based on the spatial points in each second grid of the target grid map, and a statistical feature map corresponding to the target statistical parameters is obtained; semantic segmentation is performed based on the statistical feature map through a predetermined convolutional neural network to obtain the confidence that each second grid belongs to the ground category and to the non-ground category; and the road drivable area in the point cloud data to be processed is determined based on these confidences. Because the grids of the polar coordinate grid map are distributed in arcs, with grids farther from the pole covering larger areas, the polar coordinate grid map faithfully describes the characteristics that the ground points of the point cloud data are distributed in arcs and that the data become gradually sparser from near to far, which improves the accuracy of determining the road drivable area. In addition, because the predetermined convolutional neural network is trained on a large amount of sample point cloud data and performs semantic segmentation based on the statistical feature map, the method is applicable to point cloud data describing a wide variety of road environments, further improving the accuracy.
Drawings
FIG. 1 is an application environment diagram of a method for determining a road drivable area in one embodiment;
FIG. 2 is an application environment diagram of a method for determining a road drivable area in one embodiment;
FIG. 3 is a flow chart illustrating a method for determining a road drivable area in one embodiment;
FIG. 4 is a schematic diagram of point cloud data obtained by laser radar scanning in one embodiment;
FIG. 5 is a schematic diagram of a polar grid diagram in one embodiment;
FIG. 6 is a schematic diagram of a target raster pattern in one embodiment;
FIG. 7 is a schematic diagram of the structure of a predetermined convolutional neural network in one embodiment;
FIG. 8 is a schematic diagram of the structure of a predetermined convolutional neural network in one embodiment;
FIG. 9 is a block diagram showing the construction of a road drivable area determining apparatus in one embodiment;
FIG. 10 is a block diagram of a computer device in one embodiment;
FIG. 11 is a block diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that the terms "first," "second," and the like are used herein to make a naming distinction between similar objects, but the objects themselves are not limited by these terms.
The method for determining the road drivable area provided by the embodiments of the present application can be applied to an application environment as shown in fig. 1. The application environment may involve a control terminal 110 and an environment-aware device 120, the environment-aware device 120 being connected to the control terminal 110 by wired or wireless means.
Specifically, the environment sensing device 120 may sense the surrounding environment to obtain point cloud data, and send the point cloud data to the control terminal 110. The control terminal 110 obtains point cloud data to be processed based on the point cloud data sent by the environment sensing device 120; performs polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map; converts the data arrangement form based on the polar coordinate grid map to obtain a target grid map, wherein the target grid map comprises second grids respectively corresponding to the first grids in the polar coordinate grid map, each second grid comprises the spatial points in the corresponding first grid, and the second grids are arranged in the arrangement form of image pixels; then determines target statistical parameters of the second grids based on the spatial points in each second grid, and obtains a statistical feature map corresponding to the target statistical parameters; further performs semantic segmentation based on the statistical feature map through a predetermined convolutional neural network to determine the confidence that each second grid belongs to the ground category and to the non-ground category; and then determines the road drivable area in the point cloud data to be processed based on these confidences.
In other embodiments, as shown in fig. 2, the application environment may also involve a server 130, where the server 130 is connected to the control terminal 110 through a network. Accordingly, the environmental sensing device 120 may sense the surrounding environment to obtain the point cloud data, and transmit the point cloud data to the control terminal 110. The control terminal 110 transmits the point cloud data transmitted from the environment sensing device 120 to the server 130. The server 130 obtains point cloud data to be processed based on the point cloud data sent by the control terminal 110, performs a step of performing polar coordinate rasterization processing on the point cloud data to be processed to determine a road drivable region in the point cloud data to be processed, and sends a determination result of the road drivable region to the control terminal 110.
In addition, the control terminal 110 and the environment sensing device 120 may be provided on, but are not limited to, an unmanned vehicle, an unmanned aerial vehicle, or a robot. The server 130 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In one embodiment, as shown in FIG. 3, a method of determining a road drivable area is provided. The method is described as being applied to a computer device (such as the control terminal 110 or the server 130 in fig. 1 described above) as an example. The method may include the following steps S302 to S312.
S302, obtaining point cloud data to be processed.
The point cloud data to be processed is point cloud data for which the road drivable area needs to be determined. It can be used to describe the road environment within the perception range of the environment sensing device. Specifically, the point cloud data to be processed may include attribute information of one or more spatial points, and the attribute information of a spatial point may include the three-dimensional coordinates of the spatial point (abscissa X, ordinate Y, and vertical coordinate Z), the laser reflection intensity (Intensity) of the spatial point, and the color information (RGB) of the spatial point.
In one embodiment, the environment sensing device may be a laser radar; in this case, the laser radar scans the road environment within its sensing range to obtain point cloud data, and outputs the point cloud data to the computer device. The computer device takes the point cloud data output by the laser radar as the point cloud data to be processed. Fig. 4 shows a top view of point cloud data output by a laser radar. Further, the laser radar may be a multi-line laser radar, such as a 16-line, 40-line, or 60-line laser radar.
In another embodiment, the environment sensing device may be a binocular camera device; in this case, the binocular camera device captures the road environment within its sensing range to obtain point cloud data, and outputs the point cloud data to the computer device. The computer device performs dimension reduction processing on the point cloud data output by the binocular camera device, and the point cloud data obtained after dimension reduction serves as the point cloud data to be processed. The binocular camera device may be implemented as two independent cameras, or as a packaged binocular camera with a left camera and a right camera.
And S304, performing polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate raster image.
Polar coordinate rasterization projects each spatial point of the point cloud data to be processed into the matching initial grid of the initial polar coordinate grid map, so as to obtain the polar coordinate grid map. Specifically, the initial polar coordinate grid map is a pre-configured template corresponding to the polar coordinate grid map and includes one or more initial grids; each spatial point of the point cloud data to be processed is projected into the initial grid that matches it, thereby producing the polar coordinate grid map. It will be appreciated that the polar coordinate grid map includes first grids corresponding to the initial grids of the initial polar coordinate grid map: each initial grid, once projection is complete, becomes the corresponding first grid.
In the polar coordinate grid map, the first grids are distributed in arcs. Fig. 5 shows a polar coordinate grid map with a maximum angle of 180° and a maximum distance of 60 meters; it includes M×N first grids, denoted (x_1, y_1), (x_2, y_1), …, (x_M, y_1), (x_1, y_2), (x_2, y_2), …, (x_M, y_2), …, (x_M, y_N). It will be appreciated that fig. 5 is merely illustrative, and that the maximum angle and maximum distance of the polar coordinate grid map may be preset according to practical needs; for example, the maximum angle may be 360° and the maximum distance 80 meters.
S306, converting the data arrangement form based on the polar coordinate raster pattern to obtain a target raster pattern.
The target raster pattern is obtained by arranging the space points in each first raster in the polar raster pattern in the form of arrangement of image pixels. Specifically, the target raster pattern may include respective second grids corresponding to respective first grids in the polar raster pattern, each second grid including spatial points within the respective corresponding first grids, each second grid being arranged in an arrangement of image pixels. Wherein the arrangement of the image pixels may be in the form of a matrix arrangement.
Fig. 6 shows the target grid map obtained by converting the data arrangement form of the polar coordinate grid map shown in fig. 5. The spatial points contained in the second grid denoted (x_1, y_1) in fig. 6 are identical to those contained in the first grid denoted (x_1, y_1) in fig. 5; likewise, the spatial points contained in every other second grid are the same as those contained in its corresponding first grid, which is not described in detail herein.
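For illustration, the following minimal Python sketch shows one way such an arrangement conversion could be realized; the grid sizes (taken from the fig. 5 example), the dictionary layout of the first grids, and the use of NumPy are assumptions made here for concreteness, not details given by the application:

```python
import numpy as np

# Hypothetical sizes from the fig. 5 example: M = 180 deg / 10 deg angle
# bins and N = 60 m / 5 m distance bins.
M, N = 18, 12

# first_grids[(a, r)] holds the spatial points of the first grid at
# angle bin a and distance bin r of the polar coordinate grid map.
first_grids = {(a, r): [] for a in range(M) for r in range(N)}

# Target grid map: the second grids are arranged like image pixels,
# one row per distance bin and one column per angle bin; each cell
# keeps exactly the spatial points of its corresponding first grid.
target_grid = np.empty((N, M), dtype=object)
for (a, r), points in first_grids.items():
    target_grid[r, a] = points  # pixel (row=r, col=a) <- first grid (a, r)
```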
S308, determining target statistical parameters of the second grids based on the space points in the second grids respectively, and obtaining a statistical feature map corresponding to the target statistical parameters.
The target statistical parameter is a statistical attribute measured over the data in a grid. The target statistical parameter may include at least one of: the total number of spatial points within the grid, the average height of spatial points within the grid, the maximum height of spatial points within the grid, the minimum height of spatial points within the grid, the average reflection intensity of spatial points within the grid, and the standard deviation of spatial points within the grid.
Specifically, the total number of spatial points within a grid is the total number of spatial points belonging to the grid. The average height of the spatial points in the grid is the average of the vertical coordinates of the spatial points belonging to the grid. The maximum height of a space point in a grid is the largest vertical coordinate among the vertical coordinates of the space points belonging to the grid. The minimum height of a spatial point within a grid is the smallest vertical coordinate among the vertical coordinates of the spatial points belonging to the grid. The average reflection intensity of the spatial points in the grid is an average value of the laser reflection intensities of the spatial points belonging to the grid.
The target statistical parameters and the statistical feature maps may be in one-to-one correspondence, with the number of kinds of target statistical parameters equal to the number of statistical feature maps. For example, when the target statistical parameters include 3 kinds, namely the total number of spatial points in the grid, the average height of spatial points in the grid, and the average reflection intensity of spatial points in the grid, then for any point cloud data to be processed, 3 statistical feature maps corresponding to these 3 kinds of target statistical parameters can be obtained. Likewise, when the target statistical parameters include all 6 kinds listed above, 6 statistical feature maps corresponding to the 6 kinds of target statistical parameters can be obtained for any point cloud data to be processed.
In this embodiment, the statistical feature map corresponding to the target statistical parameter may be obtained based on the target statistical parameter of each second grid in the target grid map. Taking the target statistical parameter of maximum height of space point in the grid as an example, based on the target statistical parameters of each second grid in the target grid graph, the process of obtaining the statistical feature graph corresponding to the target statistical parameters can be as follows: for each second grid in the target grid graph, taking the largest vertical coordinate in the vertical coordinates of each space point in the second grid as the maximum height of the space point in the second grid, so as to determine the maximum height of the space point in each second grid; and determining a statistical feature map corresponding to the target statistical parameter of maximum height of the space points in the grid according to the maximum height of the space points in each second grid in the target grid map. In addition, the manner of determining the statistical feature map corresponding to the other kinds of target statistical parameters is similar, and is not described herein.
It should be noted that the statistical feature map is determined based on the target statistical parameters of each second grid in the target grid map, and the target grid map is obtained from the polar coordinate grid map corresponding to the point cloud data to be processed; the statistical feature map therefore corresponds to the point cloud data to be processed and can serve as a preliminary feature characterizing part of its characteristics.
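As a concrete illustration of step S308, the sketch below computes several of the target statistical parameters for each second grid and stacks them into statistical feature maps; the (k, 4) point layout (x, y, z, intensity) and the particular subset of parameters are assumptions of this example, not requirements of the application:

```python
import numpy as np

def statistical_feature_maps(target_grid):
    """Compute one statistical feature map per target statistical
    parameter from a target grid map whose cells hold sequences of
    spatial points shaped (k, 4): columns x, y, z, intensity.
    (Illustrative layout; the application does not fix a data format.)"""
    n_rows, n_cols = target_grid.shape
    count = np.zeros((n_rows, n_cols), dtype=np.float32)
    mean_h = np.zeros_like(count)   # average height of points in grid
    max_h = np.zeros_like(count)    # maximum height of points in grid
    min_h = np.zeros_like(count)    # minimum height of points in grid
    mean_i = np.zeros_like(count)   # average laser reflection intensity
    for r in range(n_rows):
        for c in range(n_cols):
            pts = np.asarray(target_grid[r, c], dtype=np.float32)
            if pts.size == 0:
                continue  # an empty grid keeps the default value 0
            z = pts[:, 2]
            count[r, c] = len(pts)
            mean_h[r, c] = z.mean()
            max_h[r, c] = z.max()
            min_h[r, c] = z.min()
            mean_i[r, c] = pts[:, 3].mean()
    # One channel per target statistical parameter, as in step S308.
    return np.stack([count, mean_h, max_h, min_h, mean_i], axis=0)
```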
S310, semantic segmentation is carried out based on the statistical feature diagram through a preset convolutional neural network, and confidence that each second grid respectively belongs to the ground category and the non-ground category is obtained.
The predetermined convolutional neural network is a machine learning model with an image semantic segmentation function. The predetermined convolutional neural network is obtained by performing model training on the convolutional neural network to be trained based on sample point cloud data.
The image semantic segmentation is to determine semantic category information corresponding to each pixel in the image. In this embodiment, the statistical feature map may be input into the predetermined convolutional neural network, and semantic segmentation is performed based on the statistical feature map through the predetermined convolutional neural network, so as to obtain semantic category information corresponding to each pixel in the statistical feature map. It should be noted that each pixel in the statistical feature map corresponds to a second grid in the target grid map (for example, the target grid map shown in fig. 6 includes M×N second grids, and the spatial scale of the statistical feature map obtained from it is accordingly M×N, that is, the statistical feature map includes M×N pixels, which correspond to the M×N second grids in the target grid map, respectively), so the semantic category information corresponding to each pixel in the statistical feature map is the semantic category information corresponding to its second grid in the target grid map.
The semantic category information corresponding to a second grid may include the confidence that the second grid belongs to the ground category and the confidence that it belongs to the non-ground category. The confidence that a second grid belongs to the ground category describes the likelihood of it belonging to the ground category: the higher the confidence, the more likely the second grid belongs to the ground category, and conversely, the lower the confidence, the less likely it does. Moreover, when a second grid belongs to the ground category, the spatial points within it are ground points. The confidence that a second grid belongs to the non-ground category is interpreted similarly and will not be described in detail here.
S312, determining a road drivable area in the point cloud data to be processed based on the confidence that each second grid respectively belongs to the ground category and the non-ground category.
The road drivable area is an area available for running in the road environment.
In this embodiment, specifically, the second grid belonging to the ground category may be determined from the second grids in the target grid graph based on the confidence that each second grid in the target grid graph respectively belongs to the ground category and the non-ground category. Further, a road drivable region in the point cloud data to be processed is determined based on the second grid belonging to the ground class.
According to the above method for determining the road drivable area, polar coordinate rasterization is performed on the point cloud data to be processed to obtain a polar coordinate grid map; data arrangement form conversion is performed based on the polar coordinate grid map to obtain a target grid map; the target statistical parameters of the second grids are then determined based on the spatial points in each second grid of the target grid map, and a statistical feature map corresponding to the target statistical parameters is obtained; semantic segmentation is performed based on the statistical feature map through a predetermined convolutional neural network to obtain the confidence that each second grid belongs to the ground category and to the non-ground category; and the road drivable area in the point cloud data to be processed is determined based on these confidences. Because the grids of the polar coordinate grid map are distributed in arcs, with grids farther from the pole covering larger areas, the polar coordinate grid map faithfully describes the characteristics that the ground points of the point cloud data are distributed in arcs and that the data become gradually sparser from near to far, improving the accuracy of determining the road drivable area. In addition, because the predetermined convolutional neural network is trained on a large amount of sample point cloud data and performs semantic segmentation based on the statistical feature map, the method is applicable to point cloud data describing a wide variety of road environments, further improving the accuracy.
In one embodiment, the step of performing polar rasterization on the point cloud data to be processed to obtain a polar grid map, that is, step S304, may include the following steps: determining distance information and angle information of each space point of the point cloud data to be processed under a polar coordinate system based on the abscissa and the ordinate of each space point of the point cloud data to be processed; and projecting each space point of the point cloud data to be processed into an initial grid in which the distance coverage area in the initial polar coordinate grid map is matched with the distance information of the space point in the polar coordinate system and the angle coverage area is matched with the angle information of the space point in the polar coordinate system, so as to obtain the polar coordinate grid map.
The initial polar coordinate grid map may be described by 4 parameters, namely the maximum angle (max_angle), the maximum distance (max_range), the angular resolution (angle_size) and the distance resolution (range_size); these 4 parameters may be set based on actual requirements. The angular resolution (angle_size) is the angle covered by each initial grid in the initial polar coordinate grid map; the distance resolution (range_size) is the distance covered by each initial grid. It will be appreciated that after setting these 4 parameters, the total number of initial grids in the initial polar coordinate grid map can be determined (i.e., (max_angle / angle_size) × (max_range / range_size)), as well as the angular coverage and distance coverage of each initial grid.
By way of example, assume that the initial polar grid plot has a maximum angle of 180 °, an angular resolution of 10 °, a maximum distance of 60 meters, and a distance resolution of 5 meters. The initial grid a11 located at the start position in the initial polar grid map has an angular coverage of 0 ° to 10 ° and a distance coverage of 0 meters to 5 meters; an initial grid a21 adjacent to the initial grid a11 in the lateral direction, having an angular coverage of 10 ° to 20 °, and a distance coverage of 0 meters to 5 meters; an initial grid a12 adjacent to the initial grid a11 in the longitudinal direction has an angular coverage of 0 ° to 10 ° and a distance coverage of 5 meters to 10 meters.
In this embodiment, for each spatial point of the point cloud data to be processed, angle information and distance information of the spatial point with respect to an origin of the environment sensing device (e.g., an origin of a laser radar) are determined based on an abscissa and an ordinate of the spatial point, angle information of the spatial point in a polar coordinate system is determined based on angle information and angle resolution with respect to the origin of the environment sensing device, and distance information of the spatial point in the polar coordinate system is determined based on distance information and distance resolution with respect to the origin of the environment sensing device.
Specifically, for any spatial point i of the point cloud data to be processed, the angle information point[i].angle and distance information point[i].range of the spatial point i relative to the origin of the environment sensing device, as well as its angle information A_i and distance information R_i in the polar coordinate system, may be determined based on the following formulas:

point[i].angle = arctan(point[i].y / point[i].x)
point[i].range = √(point[i].x² + point[i].y²)
A_i = int(point[i].angle / angle_size)
R_i = int(point[i].range / range_size)

wherein point[i].x represents the abscissa of the spatial point i; point[i].y represents the ordinate of the spatial point i; int() represents a rounding function; the numerical range of A_i is [0, max_angle / angle_size); and the numerical range of R_i is [0, max_range / range_size).
For each spatial point of the point cloud data to be processed, after the angle information and distance information of the spatial point in the polar coordinate system are determined, the initial grid matching the spatial point can be determined from all initial grids in the initial polar coordinate grid map, and the spatial point is projected into that matching initial grid, thereby obtaining the polar coordinate grid map. An initial grid matches a spatial point if the angle information of the spatial point in the polar coordinate system falls within the angular coverage of the initial grid and the distance information of the spatial point in the polar coordinate system falls within the distance coverage of the initial grid.
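A minimal Python sketch of this projection, following the formulas above, might look as follows; the use of atan2 for the angle, the discarding of out-of-range points, and the default parameter values (taken from the worked example above) are assumptions of this illustration. The returned dictionary matches the first-grid layout used in the earlier arrangement-conversion sketch:

```python
import math

def polar_rasterize(points, max_angle=180.0, max_range=60.0,
                    angle_size=10.0, range_size=5.0):
    """Project spatial points (x, y, z, intensity) into the initial
    polar coordinate grid map; returns the resulting polar coordinate
    grid map as a dict keyed by (angle bin A_i, distance bin R_i)."""
    n_angle = int(max_angle / angle_size)   # number of angle bins
    n_range = int(max_range / range_size)   # number of distance bins
    grid = {(a, r): [] for a in range(n_angle) for r in range(n_range)}
    for p in points:
        x, y = p[0], p[1]
        angle = math.degrees(math.atan2(y, x))  # angle w.r.t. sensor origin
        rng = math.hypot(x, y)                  # distance w.r.t. sensor origin
        if not (0.0 <= angle < max_angle and rng < max_range):
            continue  # point outside the grid map is discarded (assumption)
        a_i = int(angle / angle_size)           # A_i = int(angle / angle_size)
        r_i = int(rng / range_size)             # R_i = int(range / range_size)
        grid[(a_i, r_i)].append(p)
    return grid
```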
In one embodiment, the step of determining the confidence that each second grid belongs to the ground class and the non-ground class respectively, that is, the step S310, by performing semantic segmentation based on the statistical feature map through a predetermined convolutional neural network, may include the following steps: sequentially carrying out feature extraction on the statistical feature map through a first feature extraction network, a second feature extraction network, a third feature extraction network and a fourth feature extraction network in a preset convolutional neural network; performing feature aggregation based on the output of the third feature extraction network and the output of the fourth feature extraction network through a first splicing network in a preset convolutional neural network; performing feature aggregation based on the output of the first spliced network and the output of the second feature extraction network through a second spliced network in the predetermined convolutional neural network; performing feature aggregation based on the output of the second spliced network and the output of the first feature extraction network through a third spliced network in the predetermined convolutional neural network; and obtaining a first semantic segmentation map and a second semantic segmentation map based on the output of the third splicing network through an output network in a preset convolutional neural network.
In the present embodiment, as shown in fig. 7, the predetermined convolutional neural network may include a first feature extraction network, a second feature extraction network, a third feature extraction network, a fourth feature extraction network, a first concatenation network, a second concatenation network, a third concatenation network, and an output network, which are sequentially connected. In addition, the first spliced network is also connected with a third feature extraction network, the second spliced network is also connected with the second feature extraction network, and the third spliced network is also connected with the first feature extraction network.
And the Feature extraction network can be used for carrying out Feature extraction processing based on the input of the Feature extraction network so as to obtain and output corresponding image features (Feature maps). The feature extraction process may include a convolution process.
Specifically, the first feature extraction network may be configured to perform feature extraction processing based on the statistical feature map, to obtain a first image feature. The second feature extraction network may be configured to perform feature extraction processing based on an output of the first feature extraction network (i.e., the first image feature) to obtain a second image feature. The third feature extraction network may be configured to perform feature extraction processing based on an output of the second feature extraction network (i.e., the second image feature) to obtain a third image feature. The fourth feature extraction network may be configured to perform feature extraction processing based on an output of the third feature extraction network (i.e., the third image feature) to obtain a fourth image feature.
The first image feature, the second image feature, the third image feature and the fourth image feature are all image features corresponding to the statistical feature map and can be used for representing the characteristics of the statistical feature map. Furthermore, the levels of image features gradually increase from the first image feature to the fourth image feature, the image features of the lower layer contain more detailed information, and the image features of the higher layer contain more semantic information.
The splicing network may be used to concatenate, along the channel dimension, the outputs of different layers, that is, to perform feature aggregation processing on the image features output by different layers.
Specifically, the first stitching network may be configured to perform feature aggregation based on an output of the third feature extraction network (i.e., the third image feature) and an output of the fourth feature extraction network (i.e., the fourth image feature), to obtain the fifth image feature. The second stitching network may be configured to perform feature aggregation based on an output of the first stitching network (i.e., the fifth image feature) and an output of the second feature extraction network (i.e., the second image feature) to obtain a sixth image feature. And the third stitching network can be used for performing feature aggregation based on the output of the second stitching network (namely, the sixth image feature) and the output of the first feature extraction network (namely, the first image feature) to obtain a seventh image feature.
The output network may be used to obtain a first semantic segmentation map and a second semantic segmentation map based on the output of the third stitching network (i.e., the seventh image feature). The first semantic segmentation map comprises the confidence that each second grid belongs to the ground category, and the second semantic segmentation map comprises the confidence that each second grid belongs to the non-ground category.
Specifically, the first semantic segmentation map includes pixels corresponding to the second grids in the target grid map, where the pixel value of each pixel is used to represent the confidence that the pixel belongs to the ground class, that is, the confidence that the second grid corresponding to the pixel belongs to the ground class. The second semantic segmentation graph is similar and will not be described in detail here.
In this embodiment, the image features of different depths of the same scale are spliced through the splicing network, and the confidence that each second grid belongs to the ground category and the confidence that each second grid belongs to the non-ground category are determined by combining the detail information in the image features of the low level and the semantic information in the image features of the high level, so that the accuracy of the determined confidence can be effectively improved, and the accuracy of determining the drivable region of the road is improved.
In one embodiment, as shown in fig. 8, the first feature extraction network, the second feature extraction network, and the third feature extraction network may each include 3 convolution layers connected in sequence; the fourth feature extraction network may include 1 convolution layer and 1 deconvolution layer connected in sequence; the first splicing network and the second splicing network can comprise 1 splicing layer (Concat) and 2 deconvolution layers which are connected in sequence; the third splice network may include 1 splice layer and 1 deconvolution layer connected in sequence; the output network may include a convolutional layer.
Accordingly, the statistical feature map can be input into the first feature extraction network, convolution processing is performed on the 1 st convolution layer in the first feature extraction network, then convolution processing is performed on the 1 st convolution layer output through the 2 nd convolution layer, and then convolution processing is performed on the 2 nd convolution layer output through the 3 rd convolution layer.
Further, the output of the 3 rd convolution layer in the first feature extraction network (i.e., the first image feature) is input into the second feature extraction network, and an operation similar to that in the first feature extraction network is performed in the second feature extraction network. And inputting the output of the 3 rd convolution layer (namely the second image feature) in the second feature extraction network into a third feature extraction network, and performing similar operation in the third feature extraction network as in the first feature extraction network.
And then, inputting the output of the 3 rd convolution layer (namely third image feature) in the third feature extraction network into a fourth feature extraction network, carrying out convolution processing on the third image feature through the convolution layer in the fourth feature extraction network, and carrying out scale recovery processing on the output of the convolution layer through the deconvolution layer so as to recover the output of the convolution layer to the same spatial scale as the third image feature.
Further, the output of the 3 rd convolution layer (i.e. the third image feature) in the third feature extraction network and the output of the deconvolution layer (i.e. the fourth image feature) in the fourth feature extraction network are input into the first splicing network, the third image feature and the fourth image feature are subjected to feature aggregation processing through the splicing layer in the first splicing network, and then the output of the splicing layer is subjected to scale recovery processing through the 1 st deconvolution layer and the 2 nd deconvolution layer in sequence, so that the output of the splicing layer is recovered to the same spatial scale as the third image feature and the fourth image feature.
And inputting the output of the 2 nd deconvolution layer (namely the fifth image feature) in the first splicing network and the output of the 3 rd convolution layer (namely the second image feature) in the second feature extraction network into a second splicing network, carrying out feature aggregation processing on the fifth image feature and the second image feature through the splicing layer in the second splicing network, and then carrying out scale recovery processing on the output of the splicing layer through the 1 st deconvolution layer and the 2 nd deconvolution layer in sequence so as to recover the output of the splicing layer to the same spatial scale as the fifth image feature and the second image feature.
Further, the output of the 2 nd deconvolution layer (namely, the sixth image feature) in the second splicing network and the output of the 3 rd convolution layer (namely, the first image feature) in the first feature extraction network are input into a third splicing network, the sixth image feature and the first image feature are subjected to feature aggregation processing through the splicing layer in the third splicing network, and then the output of the splicing layer is subjected to scale recovery processing through the deconvolution layer, so that the output of the splicing layer is recovered to the same spatial scale as the sixth image feature and the first image feature.
And then, inputting the output of the deconvolution layer (namely, the seventh image feature) in the third splicing network into an output network, and carrying out convolution processing on the seventh image feature through the output network to obtain a first semantic segmentation map and a second semantic segmentation map.
In one specific example, the convolution kernel size (Kernel size) of a convolution layer may be 3, the sliding stride (Stride) may be 2, and the padding (Pad) may be 1.
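For concreteness, one possible PyTorch rendering of the architecture of fig. 8 is sketched below. The layer counts follow the description above, but the channel widths, the per-layer strides (chosen here so that spatial scales line up at each splicing layer), the ReLU activations, the input channel count (set to the 5 statistical feature maps of the earlier sketch), and the assumption that the input height and width are divisible by 16 are all illustrative choices, not values fixed by the application:

```python
import torch
import torch.nn as nn

def conv(cin, cout, stride):
    # Convolution layer: kernel 3, pad 1; stride per the scale bookkeeping below.
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride, 1), nn.ReLU(inplace=True))

def deconv(cin, cout, stride):
    # Deconvolution layer used for scale recovery: keeps H_out = stride * H_in.
    out_pad = 1 if stride == 2 else 0
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, 3, stride, 1, output_padding=out_pad),
        nn.ReLU(inplace=True))

class RoadSegNet(nn.Module):
    def __init__(self, in_ch=5):
        super().__init__()
        # First to third feature extraction networks: 3 conv layers each.
        self.feat1 = nn.Sequential(conv(in_ch, 32, 2), conv(32, 32, 1), conv(32, 32, 1))
        self.feat2 = nn.Sequential(conv(32, 64, 2), conv(64, 64, 1), conv(64, 64, 1))
        self.feat3 = nn.Sequential(conv(64, 128, 2), conv(128, 128, 1), conv(128, 128, 1))
        # Fourth feature extraction network: 1 conv layer + 1 deconv layer.
        self.feat4 = nn.Sequential(conv(128, 256, 2), deconv(256, 128, 2))
        # First/second splicing networks: concat + 2 deconv layers.
        self.up1 = nn.Sequential(deconv(256, 64, 2), deconv(64, 64, 1))
        self.up2 = nn.Sequential(deconv(128, 32, 2), deconv(32, 32, 1))
        # Third splicing network: concat + 1 deconv layer.
        self.up3 = deconv(64, 16, 2)
        # Output network: a conv layer producing 2 channels; a softmax over
        # these channels would yield the ground / non-ground confidences.
        self.out = nn.Conv2d(16, 2, 3, 1, 1)

    def forward(self, x):
        f1 = self.feat1(x)
        f2 = self.feat2(f1)
        f3 = self.feat3(f2)
        f4 = self.feat4(f3)                       # restored to f3's spatial scale
        u1 = self.up1(torch.cat([f3, f4], dim=1))  # first splicing network
        u2 = self.up2(torch.cat([u1, f2], dim=1))  # second splicing network
        u3 = self.up3(torch.cat([u2, f1], dim=1))  # third splicing network
        return self.out(u3)  # per-pixel scores for the two segmentation maps

# Example: RoadSegNet(5)(torch.randn(1, 5, 48, 96)).shape == (1, 2, 48, 96)
```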
In this embodiment, the predetermined convolutional neural network is fully convolutional. No fully connected layer is used, which effectively reduces the number of network parameters, and no pooling layer is used, which preserves the information in the network's input image to the greatest extent. The operation speed of the neural network can therefore be ensured while maintaining its classification accuracy.
In one embodiment, the manner in which the predetermined convolutional neural network is determined may include the steps of: acquiring sample point cloud data carrying class labels; performing polar coordinate rasterization on the sample point cloud data to obtain a sample polar coordinate raster image; converting the data arrangement form based on the sample polar coordinate raster image to obtain a sample target raster image; the sample target grid graph comprises sample second grids corresponding to the sample first grids in the sample polar coordinate grid graph respectively, each sample second grid comprises space points in the corresponding sample first grids, and each sample second grid is arranged in an arrangement mode of image pixels; determining target statistical parameters of the second grids of each sample based on space points in the second grids of each sample respectively, and obtaining a sample statistical feature map corresponding to the target statistical parameters; carrying out semantic segmentation based on a sample statistical feature map through a convolutional neural network to be trained, and determining the confidence that each sample second grid respectively belongs to a ground category and a non-ground category; determining a loss parameter based on confidence and category labels of the second grids of each sample belonging to the ground category and the non-ground category respectively; model training is carried out on the convolutional neural network to be trained based on the loss parameters, and a preset convolutional neural network is determined.
The sample point cloud data is the point cloud data with known categories to which each space point truly belongs. The sample point cloud data can carry a category label, and the category label is used for representing the category of each real space point of the sample point cloud data, wherein the category of the real space point is a ground category or a non-ground category. The category labels may be determined by manual analysis, for example, but not limited to, analysis of sample point cloud data by an expert in the related art.
The convolutional neural network to be trained is a machine learning model which needs model training, and the network framework of the machine learning model can be consistent with the framework of a preset convolutional neural network. Model training is an operation of adjusting model parameters of each layer in a machine learning model. In addition, the step of model training (i.e., the above-described steps from acquisition of sample point cloud data to determination of a predetermined convolutional neural network) may be performed on the server 130 shown in fig. 2 or may be performed on the control terminal 110 shown in fig. 2.
Model parameter adjustment is an iterative process. The model parameters of the convolutional neural network to be trained may be adjusted batch by batch: the sample point cloud data of each batch are input into the convolutional neural network to be trained in turn, and the model parameters are adjusted once per batch; the process then repeats over the batches, iterating until the iteration stop condition is met. The model parameters at the time training stops are the model parameters of the predetermined convolutional neural network. The number of frames of sample point cloud data in each batch (i.e., the Batch size parameter) may be determined based on the GPU (Graphics Processing Unit) performance of the device performing the model training step.
For each frame of sample point cloud data in each batch: polar coordinate rasterization is performed on the sample point cloud data to obtain a sample polar coordinate grid map; data arrangement form conversion is performed based on the sample polar coordinate grid map to obtain a sample target grid map (the sample target grid map comprises sample second grids respectively corresponding to the sample first grids in the sample polar coordinate grid map, each sample second grid comprises the spatial points in the corresponding sample first grid, and the sample second grids are arranged in the arrangement form of image pixels); the target statistical parameters of each sample second grid are determined based on the spatial points in that grid, and a sample statistical feature map corresponding to the target statistical parameters is obtained; semantic segmentation is performed based on the sample statistical feature map through the convolutional neural network to be trained to obtain the confidence that each sample second grid belongs to the ground category and to the non-ground category; and the loss parameter corresponding to the frame of sample point cloud data is determined based on these confidences and the category label. The model parameters of the convolutional neural network to be trained are then adjusted based on the loss parameters corresponding to the frames of sample point cloud data in the batch. The model parameters are adjusted repeatedly, batch after batch, until the iteration stop condition is met and training stops.
The iteration stop condition may be set based on an actual requirement, for example, a preset iteration number may be reached, or the calculated loss parameter may satisfy a predetermined condition, for example, the loss parameter is smaller than a predetermined loss threshold, or the calculated loss parameter is not reduced any more, or the like.
It should be noted that, before the sample point cloud data is input into the convolutional neural network to be trained for the first time, the convolutional neural network to be trained needs to be subjected to parameter initialization processing, so that each layer of the convolutional neural network to be trained has initial parameters.
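A hypothetical training loop consistent with the batch-wise procedure above is sketched below, reusing the RoadSegNet sketch from earlier; the application does not specify the loss function or optimizer, so the cross-entropy loss and Adam optimizer here are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train(model, batches, epochs=10, lr=1e-3):
    """Hypothetical training loop; `batches` yields (feature_maps, labels)
    where feature_maps is a (B, C, H, W) tensor of sample statistical
    feature maps and labels is a (B, H, W) long tensor with 0 = ground
    and 1 = non-ground per sample second grid (assumed encoding)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()       # loss parameter (assumed form)
    for epoch in range(epochs):             # iterate until the stop condition
        for feature_maps, labels in batches:
            scores = model(feature_maps)    # (B, 2, H, W) class scores
            loss = criterion(scores, labels)  # compare with category labels
            optimizer.zero_grad()
            loss.backward()                 # adjust the model parameters
            optimizer.step()
    return model
```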
In one embodiment, the method for determining the class label carried by the sample point cloud data may include the following steps: performing rectangular coordinate rasterization on the sample point cloud data to obtain a rectangular coordinate raster pattern; determining the maximum height of the space point in each third grid based on the vertical coordinates of the space point in each third grid in the rectangular coordinate grid diagram; taking the ground class as the class to which the space point in the third grid with the maximum height of the space point smaller than the height threshold belongs; taking the non-ground class as a class to which the space point in the third grid with the maximum height of the space point being equal to or greater than the height threshold belongs; and determining class labels carried by the sample point cloud data based on the classes of the space points in each third grid.
Rectangular coordinate rasterization projects each spatial point of the point cloud data into a grid under a rectangular coordinate system to obtain a rectangular coordinate grid map. The rectangular coordinate grid map includes one or more third grids, and each spatial point of the point cloud data has a corresponding third grid in the rectangular coordinate grid map.
In this embodiment, rectangular coordinate rasterization is performed on sample point cloud data to obtain a rectangular coordinate raster image. And determining the maximum vertical coordinate of the vertical coordinates of the space points in each third grid in the rectangular grid chart as the maximum height of the space points in the third grid. Further, it is determined that a space point in the third grid in which the maximum height of the space point is smaller than the height threshold belongs to the ground class, and it is determined that a space point in the third grid in which the maximum height of the space point is equal to or larger than the height threshold belongs to the non-ground class. And then, determining class labels carried by the sample point cloud data based on the classes to which the space points in the third grids belong. The height threshold may be preset based on actual requirements.
It should be noted that, because the point cloud data is three-dimensional, determining the category to which each spatial point belongs entirely by hand is difficult, which leads to high labor cost and is prone to error. In this embodiment, the computer device performs rectangular coordinate rasterization on the sample point cloud data to obtain a rectangular coordinate grid map, and determines the category of the spatial points in each third grid based on the relationship between the maximum height of the spatial points in that grid and the height threshold; this effectively reduces labor cost and improves the accuracy of the determined category labels.
In other embodiments, after the computer device determines the class of the spatial points in each third grid from the relationship between the grid's maximum height and the height threshold, a human may further check and correct the result, further improving the accuracy of the determined class labels.
In one embodiment, the step of determining the road drivable area in the point cloud data to be processed based on the confidence that each second grid belongs to the ground class and the non-ground class (step S312) may include: determining, from the ground class and the non-ground class, the class to which each second grid belongs, based on those confidences; determining the class of each corresponding first grid from the class of its second grid; and determining the position area corresponding to the spatial points in the first grids belonging to the ground class as the road drivable area in the point cloud data to be processed.
In this embodiment, for each second grid in the target grid map, either the ground class or the non-ground class is taken as the class to which the second grid belongs, based on the confidence that the grid belongs to each of the two classes.

Further, as described above, each second grid in the target grid map corresponds to a first grid in the polar coordinate grid map; accordingly, the class to which each second grid belongs is taken as the class of its corresponding first grid.

Further, since the spatial points in the first grids belonging to the ground class are ground points, the position area corresponding to those spatial points in the point cloud data to be processed can be determined as the road drivable area.
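A minimal sketch of this decision step, assuming two H×W confidence maps produced by the segmentation network and a mapping from each grid cell back to the indices of the points it contains; all names here are illustrative:

```python
import numpy as np

def drivable_point_indices(conf_ground, conf_nonground, grid_points):
    """conf_ground, conf_nonground: (H, W) arrays of per-second-grid confidences.
    grid_points: dict mapping (row, col) -> list of point indices in that grid."""
    # A second grid belongs to the ground class when its ground confidence
    # dominates; the class then propagates to the corresponding first grid.
    is_ground = conf_ground > conf_nonground
    drivable = [i for cell, idx in grid_points.items() if is_ground[cell] for i in idx]
    return np.asarray(drivable)  # indices of points forming the road drivable area
```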
In one embodiment, a method of determining a road drivable area is provided. The method may include the following steps (1) to (13).
(1) Acquire point cloud data to be processed.
(2) Determine distance information and angle information of each spatial point of the point cloud data to be processed in a polar coordinate system, based on the abscissa and the ordinate of each spatial point.
(3) Project each spatial point of the point cloud data to be processed into the initial grid of an initial polar coordinate grid map whose distance coverage matches the point's distance information and whose angle coverage matches the point's angle information, obtaining the polar coordinate grid map.
(4) Perform data arrangement form conversion based on the polar coordinate grid map to obtain a target grid map; the target grid map includes second grids corresponding to the first grids in the polar coordinate grid map, each second grid contains the spatial points of its corresponding first grid, and the second grids are arranged in the manner of image pixels.
(5) Determine target statistical parameters of each second grid based on the spatial points it contains, obtaining a statistical feature map corresponding to the target statistical parameters.
(6) Sequentially perform feature extraction on the statistical feature map through the first, second, third, and fourth feature extraction networks in the predetermined convolutional neural network.
(7) Perform feature aggregation based on the outputs of the third and fourth feature extraction networks through the first splicing network in the predetermined convolutional neural network.
(8) Perform feature aggregation based on the outputs of the first splicing network and the second feature extraction network through the second splicing network in the predetermined convolutional neural network.
(9) Perform feature aggregation based on the outputs of the second splicing network and the first feature extraction network through the third splicing network in the predetermined convolutional neural network.
(10) Obtain a first semantic segmentation map and a second semantic segmentation map based on the output of the third splicing network through the output network in the predetermined convolutional neural network; the first semantic segmentation map contains the confidence that each second grid belongs to the ground class, and the second semantic segmentation map contains the confidence that each second grid belongs to the non-ground class.
(11) Determine, from the ground class and the non-ground class, the class to which each second grid belongs, based on the confidence that each second grid belongs to each class.
(12) Determine the class of each corresponding first grid based on the class to which each second grid belongs.
(13) Determine the position area corresponding to the spatial points in the first grids belonging to the ground class in the point cloud data to be processed as the road drivable area.
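Steps (2) and (3) amount to binning each point by its polar range and bearing. A minimal sketch under assumed grid dimensions (the patent does not fix concrete bin counts or a maximum range):

```python
import numpy as np

def polar_rasterize(points: np.ndarray, r_max: float = 50.0, n_r: int = 64, n_theta: int = 256):
    """Bin points into an (n_r, n_theta) polar grid from their x and y coordinates."""
    x, y = points[:, 0], points[:, 1]
    r = np.hypot(x, y)                            # distance information
    theta = np.mod(np.arctan2(y, x), 2 * np.pi)   # angle information in [0, 2*pi)
    # Match each point to the first grid whose distance/angle coverage contains it.
    ri = np.clip((r / r_max * n_r).astype(np.int64), 0, n_r - 1)
    ti = np.clip((theta / (2 * np.pi) * n_theta).astype(np.int64), 0, n_theta - 1)
    grid = {}
    for idx, cell in enumerate(zip(ri.tolist(), ti.tolist())):
        grid.setdefault(cell, []).append(idx)     # first grids keyed by (ring, sector)
    return grid
```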
It should be noted that the specific limitations of each technical feature in this embodiment may be the same as the limitations of the corresponding technical features described above, and are not repeated here.
It should be understood that although the steps in the flowcharts of the foregoing embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of these steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily executed sequentially and may be executed in turn or alternately with other steps or with sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, a road drivable area determining apparatus 900 is provided. The apparatus 900 may include the following modules 902 through 912.
The to-be-processed point cloud acquiring module 902 is configured to acquire to-be-processed point cloud data.
The polar coordinate grid map acquisition module 904 is configured to perform polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map.
An arrangement form conversion module 906, configured to perform data arrangement form conversion based on the polar coordinate grid map to obtain a target grid map; the target grid map includes second grids corresponding to the first grids in the polar coordinate grid map, each second grid contains the spatial points of its corresponding first grid, and the second grids are arranged in the manner of image pixels.
The statistical feature map obtaining module 908 is configured to determine a target statistical parameter of each second grid based on the spatial points in each second grid, and obtain a statistical feature map corresponding to the target statistical parameter.
The confidence determining module 910 is configured to perform semantic segmentation based on the statistical feature map through a predetermined convolutional neural network, so as to obtain confidence that each second grid belongs to a ground class and a non-ground class respectively.
The drivable region determining module 912 is configured to determine a road drivable region in the point cloud data to be processed based on the confidence that each second grid belongs to the ground class and the non-ground class, respectively.
The road drivable area determining apparatus 900 performs polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map, converts its data arrangement form to obtain a target grid map, determines the target statistical parameters of each second grid from the spatial points it contains to obtain a statistical feature map, performs semantic segmentation on the statistical feature map through the predetermined convolutional neural network to obtain the confidence that each second grid belongs to the ground class and the non-ground class, and determines the road drivable area in the point cloud data from those confidences. Because the grids of the polar coordinate grid map are distributed in arcs, with grids farther from the pole covering larger areas, the map faithfully describes how the ground points of point cloud data are distributed in arcs and grow sparser with distance, which improves the accuracy of the determined road drivable area. In addition, the predetermined convolutional neural network, trained on massive sample point cloud data, performs semantic segmentation on the statistical feature map and is therefore suited to point cloud data describing a wide variety of road environments, further improving accuracy.
In one embodiment, the polar coordinate grid map acquisition module 904 may include the following units: a polar coordinate information determining unit configured to determine distance information and angle information of each spatial point of the point cloud data to be processed in a polar coordinate system, based on the abscissa and the ordinate of each spatial point; and a projection unit configured to project each spatial point of the point cloud data to be processed into the initial grid of an initial polar coordinate grid map whose distance coverage matches the point's distance information and whose angle coverage matches the point's angle information in the polar coordinate system, to obtain the polar coordinate grid map.
In one embodiment, the target statistical parameter comprises at least one of a total number of spatial points within the grid, an average height of spatial points within the grid, a standard deviation of spatial points within the grid, a maximum height of spatial points within the grid, a minimum height of spatial points within the grid, an average reflected intensity of spatial points within the grid.
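Stacked into channels, these statistics form the statistical feature map. A minimal sketch building on the `polar_rasterize` sketch above, assuming `points` has columns (x, y, z, reflected intensity); the channel layout is illustrative, not specified by the patent:

```python
import numpy as np

def statistical_feature_map(points: np.ndarray, grid: dict, n_r: int = 64, n_theta: int = 256) -> np.ndarray:
    """Return a (6, n_r, n_theta) map: count, mean z, std z, max z, min z, mean intensity."""
    fmap = np.zeros((6, n_r, n_theta), dtype=np.float32)
    for (ring, sector), idx in grid.items():
        z = points[idx, 2]
        intensity = points[idx, 3]
        fmap[:, ring, sector] = (len(idx), z.mean(), z.std(), z.max(), z.min(), intensity.mean())
    return fmap
```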
In one embodiment, the confidence determination module 910 may include the following units: a feature extraction unit configured to sequentially carry out feature extraction on the statistical feature map through a first feature extraction network, a second feature extraction network, a third feature extraction network and a fourth feature extraction network in the predetermined convolutional neural network; a first aggregation unit configured to perform feature aggregation based on the output of the third feature extraction network and the output of the fourth feature extraction network through a first splicing network in the predetermined convolutional neural network; a second aggregation unit configured to perform feature aggregation based on the output of the first splicing network and the output of the second feature extraction network through a second splicing network in the predetermined convolutional neural network; a third aggregation unit configured to perform feature aggregation based on the output of the second splicing network and the output of the first feature extraction network through a third splicing network in the predetermined convolutional neural network; and a semantic segmentation map determining unit configured to obtain a first semantic segmentation map and a second semantic segmentation map based on the output of the third splicing network through an output network in the predetermined convolutional neural network, the first semantic segmentation map comprising the confidence that each second grid belongs to the ground class and the second semantic segmentation map comprising the confidence that each second grid belongs to the non-ground class.
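Four feature extraction networks feeding three splicing (concatenation) networks form an encoder-decoder with skip connections, broadly in the spirit of U-Net. A minimal PyTorch sketch under assumed layer counts and channel widths, which the patent does not specify:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def block(c_in: int, c_out: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class GroundSegNet(nn.Module):
    def __init__(self, in_ch: int = 6):
        super().__init__()
        self.f1, self.f2 = block(in_ch, 32), block(32, 64)   # feature extraction networks 1-2
        self.f3, self.f4 = block(64, 128), block(128, 256)   # feature extraction networks 3-4
        self.pool = nn.MaxPool2d(2)
        self.s1 = block(256 + 128, 128)   # first splicing network (aggregates f4 and f3)
        self.s2 = block(128 + 64, 64)     # second splicing network (aggregates s1 and f2)
        self.s3 = block(64 + 32, 32)      # third splicing network (aggregates s2 and f1)
        self.out = nn.Conv2d(32, 2, 1)    # output network: ground / non-ground maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.f1(x)
        e2 = self.f2(self.pool(e1))
        e3 = self.f3(self.pool(e2))
        e4 = self.f4(self.pool(e3))

        def up(t, ref):  # upsample to the resolution of the skip tensor
            return F.interpolate(t, size=ref.shape[2:], mode="bilinear", align_corners=False)

        d1 = self.s1(torch.cat([up(e4, e3), e3], dim=1))
        d2 = self.s2(torch.cat([up(d1, e2), e2], dim=1))
        d3 = self.s3(torch.cat([up(d2, e1), e1], dim=1))
        return torch.softmax(self.out(d3), dim=1)  # per-second-grid confidences
```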
In one embodiment, the apparatus 900 for determining a road drivable area may further include the following modules: the sample point cloud acquisition module is used for acquiring sample point cloud data carrying class labels, the class labels being used for representing the classes to which the spatial points of the sample point cloud data respectively belong; the sample polar coordinate grid map acquisition module is used for performing polar coordinate rasterization on the sample point cloud data to obtain a sample polar coordinate grid map; the sample target grid map acquisition module is used for performing data arrangement form conversion based on the sample polar coordinate grid map to obtain a sample target grid map, where the sample target grid map comprises sample second grids corresponding to the sample first grids in the sample polar coordinate grid map respectively, each sample second grid comprises the spatial points in the corresponding sample first grid, and the sample second grids are arranged in the manner of image pixels; the sample statistical feature map acquisition module is used for determining target statistical parameters of each sample second grid based on the spatial points in each sample second grid and obtaining a sample statistical feature map corresponding to the target statistical parameters; the sample confidence determining module is used for performing semantic segmentation based on the sample statistical feature map through a convolutional neural network to be trained and determining the confidence that each sample second grid belongs to the ground class and the non-ground class respectively; the loss parameter determining module is used for determining a loss parameter based on the class labels and the confidence that each sample second grid belongs to the ground class and the non-ground class respectively; and the model training module is used for carrying out model training on the convolutional neural network to be trained based on the loss parameter and determining the predetermined convolutional neural network.
In one embodiment, the apparatus 900 for determining a road drivable area may further include the following modules: the rectangular coordinate rasterization module is used for performing rectangular coordinate rasterization on the sample point cloud data to obtain a rectangular coordinate grid map; the maximum height determining module is used for determining the maximum height of the spatial points in each third grid based on the vertical coordinates of the spatial points in each third grid of the rectangular coordinate grid map; the ground class grid determining module is used for taking the ground class as the class of the spatial points in each third grid whose maximum height is smaller than the height threshold; the non-ground class grid determining module is used for taking the non-ground class as the class of the spatial points in each third grid whose maximum height is equal to or greater than the height threshold; and the class label determining module is used for determining the class labels carried by the sample point cloud data based on the classes to which the spatial points in the third grids belong.
In one embodiment, the travelable region determination module 912 may include the following units: a second grid class determining unit configured to determine, from the ground class and the non-ground class, the class to which each second grid belongs, based on the confidence that each second grid belongs to each class; a first grid class determining unit configured to determine the class of each corresponding first grid based on the class to which each second grid belongs; and a drivable area determining unit configured to determine a position area corresponding to the spatial points in the first grids belonging to the ground class in the point cloud data to be processed as the road drivable area.
It should be noted that, for the specific limitations of the road drivable area determining apparatus 900, reference may be made to the limitations of the road drivable area determining method above, which are not repeated here. Each module in the road drivable area determining apparatus 900 may be implemented wholly or partly by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps in the method for determining a road drivable area provided in any of the embodiments of the present application.
Specifically, the computer device may be the control terminal 110 in fig. 1 or fig. 2. As shown in fig. 10, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the method for determining a road drivable area provided by any embodiment of the present application. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the method for determining a road drivable area provided by any embodiment of the present application. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device may be a touch layer covering the display screen, a key, a trackball, or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
Alternatively, the computer device may be the server 130 of fig. 2. As shown in fig. 11, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the processor is configured to provide computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the method for determining a road drivable area provided by any embodiment of the present application.
It will be appreciated by those skilled in the art that the structures shown in fig. 10 and fig. 11 are merely block diagrams of partial structures related to the solution of the present application and do not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, the road drivable area determining apparatus 900 provided in the present application may be implemented in the form of a computer program, which may be executed on a computer device as shown in fig. 10 or 11. The memory of the computer device may store the program modules constituting the road drivable area determining apparatus 900, such as the to-be-processed point cloud acquiring module 902, the polar coordinate grid map acquisition module 904, the arrangement form conversion module 906, the statistical feature map obtaining module 908, the confidence determining module 910, and the drivable region determining module 912 shown in fig. 9. The computer program constituted by these program modules causes the processor to execute the steps of the road drivable area determining methods of the embodiments of the present application described in this specification.
For example, the computer device shown in fig. 10 or 11 may perform step S302 through the to-be-processed point cloud acquiring module 902 in the road drivable area determining apparatus 900 shown in fig. 9, step S304 through the polar coordinate grid map acquisition module 904, step S306 through the arrangement form conversion module 906, and so on.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
Accordingly, in one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above-described road drivable region determination method. The step of the method of determining a road drivable region here may be a step in the method of determining a road drivable region in the respective embodiments described above.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of these technical features should be considered within the scope of this specification as long as the combination contains no contradiction.
The foregoing embodiments represent only a few implementations of the present application and are described in detail, but they should not therefore be construed as limiting the scope of the application. It should be noted that those of ordinary skill in the art can make several variations and modifications without departing from the concept of the application, all of which fall within the protection scope of the application. Accordingly, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method of determining a road drivable area, comprising:
acquiring point cloud data to be processed;
performing polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map, wherein the grids of the polar coordinate grid map are distributed in arcs and the grids farther from the pole cover larger areas;
performing data arrangement form conversion based on the polar coordinate grid map to obtain a target grid map; the target grid map comprises second grids corresponding to the first grids in the polar coordinate grid map respectively, each second grid comprises the spatial points in the corresponding first grid, and the second grids are arranged in the manner of image pixels;
determining target statistical parameters of the second grids based on the spatial points in the second grids respectively, and obtaining a statistical feature map corresponding to the target statistical parameters;
performing semantic segmentation based on the statistical feature map through a predetermined convolutional neural network to obtain the confidence that each second grid belongs to a ground class and a non-ground class respectively;
and determining a road drivable area in the point cloud data to be processed based on the confidence that each second grid belongs to the ground class and the non-ground class respectively.
2. The method of claim 1, wherein performing polar coordinate rasterization on the point cloud data to be processed to obtain the polar coordinate grid map comprises:
determining distance information and angle information of each spatial point of the point cloud data to be processed in a polar coordinate system, based on the abscissa and the ordinate of each spatial point;
and projecting each spatial point of the point cloud data to be processed into the initial grid of an initial polar coordinate grid map whose distance coverage matches the point's distance information in the polar coordinate system and whose angle coverage matches the point's angle information in the polar coordinate system, to obtain the polar coordinate grid map.
3. The method of claim 1, wherein the target statistical parameter comprises at least one of a total number of intra-grid spatial points, an average height of intra-grid spatial points, a standard deviation of intra-grid spatial points, a maximum height of intra-grid spatial points, a minimum height of intra-grid spatial points, and an average reflected intensity of intra-grid spatial points.
4. The method of claim 1, wherein performing semantic segmentation based on the statistical feature map through the predetermined convolutional neural network to obtain the confidence that each second grid belongs to the ground class and the non-ground class respectively comprises:
sequentially carrying out feature extraction on the statistical feature map through a first feature extraction network, a second feature extraction network, a third feature extraction network and a fourth feature extraction network in the predetermined convolutional neural network;
performing feature aggregation based on the output of the third feature extraction network and the output of the fourth feature extraction network through a first splicing network in the predetermined convolutional neural network;
performing feature aggregation based on the output of the first splicing network and the output of the second feature extraction network through a second splicing network in the predetermined convolutional neural network;
performing feature aggregation based on the output of the second splicing network and the output of the first feature extraction network through a third splicing network in the predetermined convolutional neural network;
and obtaining a first semantic segmentation map and a second semantic segmentation map based on the output of the third splicing network through an output network in the predetermined convolutional neural network;
wherein the first semantic segmentation map comprises the confidence that each second grid belongs to the ground class, and the second semantic segmentation map comprises the confidence that each second grid belongs to the non-ground class.
5. The method of claim 1, wherein the manner of determining the predetermined convolutional neural network comprises:
acquiring sample point cloud data carrying class labels; the class labels are used for representing the classes to which the spatial points of the sample point cloud data respectively belong;
performing polar coordinate rasterization on the sample point cloud data to obtain a sample polar coordinate grid map;
performing data arrangement form conversion based on the sample polar coordinate grid map to obtain a sample target grid map; the sample target grid map comprises sample second grids corresponding to the sample first grids in the sample polar coordinate grid map respectively, each sample second grid comprises the spatial points in the corresponding sample first grid, and the sample second grids are arranged in the manner of image pixels;
determining target statistical parameters of each sample second grid based on the spatial points in each sample second grid respectively, and obtaining a sample statistical feature map corresponding to the target statistical parameters;
carrying out semantic segmentation based on the sample statistical feature map through a convolutional neural network to be trained, and determining the confidence that each sample second grid belongs to the ground class and the non-ground class respectively;
determining a loss parameter based on the class labels and the confidence that each sample second grid belongs to the ground class and the non-ground class respectively;
and carrying out model training on the convolutional neural network to be trained based on the loss parameter, and determining the predetermined convolutional neural network.
6. The method of claim 5, wherein the manner of determining the class labels carried by the sample point cloud data comprises:
performing rectangular coordinate rasterization on the sample point cloud data to obtain a rectangular coordinate grid map;
determining the maximum height of the spatial points in each third grid based on the vertical coordinates of the spatial points in each third grid of the rectangular coordinate grid map;
taking the ground class as the class of the spatial points in each third grid whose maximum height is smaller than a height threshold;
taking the non-ground class as the class of the spatial points in each third grid whose maximum height is equal to or greater than the height threshold;
and determining the class labels carried by the sample point cloud data based on the classes to which the spatial points in the third grids belong.
7. The method of claim 1, wherein determining the road drivable area in the point cloud data to be processed based on the confidence that each second grid belongs to the ground class and the non-ground class respectively comprises:
determining, from the ground class and the non-ground class, the class to which each second grid belongs, based on the confidence that each second grid belongs to the ground class and the non-ground class respectively;
determining the class of the first grid corresponding to each second grid based on the class to which each second grid belongs;
and determining a position area corresponding to the spatial points in the first grids belonging to the ground class in the point cloud data to be processed as the road drivable area.
8. A road drivable area determining apparatus comprising:
the point cloud acquisition module is used for acquiring point cloud data to be processed;
the polar coordinate grid map acquisition module is used for performing polar coordinate rasterization on the point cloud data to be processed to obtain a polar coordinate grid map, wherein the grids of the polar coordinate grid map are distributed in arcs and the grids farther from the pole cover larger areas;
the arrangement form conversion module is used for performing data arrangement form conversion based on the polar coordinate grid map to obtain a target grid map; the target grid map comprises second grids corresponding to the first grids in the polar coordinate grid map respectively, each second grid comprises the spatial points in the corresponding first grid, and the second grids are arranged in the manner of image pixels;
the statistical feature map acquisition module is used for determining target statistical parameters of the second grids based on the spatial points in the second grids respectively and obtaining a statistical feature map corresponding to the target statistical parameters;
the confidence determination module is used for performing semantic segmentation based on the statistical feature map through a predetermined convolutional neural network to obtain the confidence that each second grid belongs to a ground class and a non-ground class respectively;
and the drivable area determination module is used for determining the road drivable area in the point cloud data to be processed based on the confidence that each second grid belongs to the ground class and the non-ground class respectively.
9. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 7.
10. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 7.
CN201811562743.2A 2018-12-20 2018-12-20 Method and device for determining road drivable area and computer equipment Active CN111353969B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811562743.2A CN111353969B (en) 2018-12-20 2018-12-20 Method and device for determining road drivable area and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811562743.2A CN111353969B (en) 2018-12-20 2018-12-20 Method and device for determining road drivable area and computer equipment

Publications (2)

Publication Number Publication Date
CN111353969A CN111353969A (en) 2020-06-30
CN111353969B true CN111353969B (en) 2023-09-26

Family

ID=71195506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811562743.2A Active CN111353969B (en) 2018-12-20 2018-12-20 Method and device for determining road drivable area and computer equipment

Country Status (1)

Country Link
CN (1) CN111353969B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348781A (en) * 2020-10-26 2021-02-09 广东博智林机器人有限公司 Method, device and equipment for detecting height of reference plane and storage medium
CN112362072B (en) * 2020-11-17 2023-11-14 西安恒图智源信息科技有限责任公司 High-precision point cloud map creation system and method in complex urban environment
CN112614226A (en) * 2020-12-07 2021-04-06 深兰人工智能(深圳)有限公司 Point cloud multi-view feature fusion method and device
CN113030997B (en) * 2021-05-27 2021-08-20 北京踏歌智行科技有限公司 Method for detecting travelable area of open-pit mine area based on laser radar
CN113064135B (en) * 2021-06-01 2022-02-18 北京海天瑞声科技股份有限公司 Method and device for detecting obstacle in 3D radar point cloud continuous frame data
US11624831B2 (en) 2021-06-09 2023-04-11 Suteng Innovation Technology Co., Ltd. Obstacle detection method and apparatus and storage medium
CN113253299B (en) * 2021-06-09 2022-02-01 深圳市速腾聚创科技有限公司 Obstacle detection method, obstacle detection device and storage medium
CN113253294A (en) * 2021-06-16 2021-08-13 北京海天瑞声科技股份有限公司 Method, apparatus and medium relating to ground point detection in 3D radar point cloud data
CN113436336B (en) * 2021-06-22 2024-01-12 京东鲲鹏(江苏)科技有限公司 Ground point cloud segmentation method and device and automatic driving vehicle
CN113420687A (en) * 2021-06-29 2021-09-21 三一专用汽车有限责任公司 Method and device for acquiring travelable area and vehicle
WO2023000221A1 (en) * 2021-07-21 2023-01-26 深圳市大疆创新科技有限公司 Free space generation method, movable platform and storage medium
CN113806464A (en) * 2021-09-18 2021-12-17 北京京东乾石科技有限公司 Road tooth determining method, device, equipment and storage medium
CN113945947B (en) * 2021-10-08 2024-08-06 南京理工大学 Method for detecting passable area of multi-line laser radar point cloud data
CN115877405A (en) * 2023-01-31 2023-03-31 小米汽车科技有限公司 Method and device for detecting travelable area and vehicle


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025642B (en) * 2016-01-27 2018-06-22 百度在线网络技术(北京)有限公司 Vehicle's contour detection method and device based on point cloud data
US10101746B2 (en) * 2016-08-23 2018-10-16 Delphi Technologies, Inc. Automated vehicle road model definition system

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147812A (en) * 2011-03-31 2011-08-10 中国科学院自动化研究所 Three-dimensional point cloud model-based landmark building image classifying method
EP3321887A1 (en) * 2015-08-04 2018-05-16 Baidu Online Network Technology (Beijing) Co., Ltd. Urban road recognition method, apparatus, storage medium and device based on laser point cloud
CN106204705A (en) * 2016-07-05 2016-12-07 长安大学 A kind of 3D point cloud segmentation method based on multi-line laser radar
CN106530380A (en) * 2016-09-20 2017-03-22 长安大学 Ground point cloud segmentation method based on three-dimensional laser radar
CN106951847A (en) * 2017-03-13 2017-07-14 百度在线网络技术(北京)有限公司 Obstacle detection method, device, equipment and storage medium
CN106997466A (en) * 2017-04-12 2017-08-01 百度在线网络技术(北京)有限公司 Method and apparatus for detecting road
CN107239794A (en) * 2017-05-18 2017-10-10 深圳市速腾聚创科技有限公司 Point cloud data segmentation method and terminal
CN108230329A (en) * 2017-12-18 2018-06-29 孙颖 Semantic segmentation method based on multiple dimensioned convolutional neural networks
CN108389251A (en) * 2018-03-21 2018-08-10 南京大学 The full convolutional network threedimensional model dividing method of projection based on fusion various visual angles feature

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sun Pengpeng et al., "Real-time extraction algorithm for ground point clouds using extended vertices," Computer Engineering and Applications, 2016, Vol. 52, No. 24, pp. 6-10. *

Also Published As

Publication number Publication date
CN111353969A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN111353969B (en) Method and device for determining road drivable area and computer equipment
CN111797650B (en) Obstacle identification method, obstacle identification device, computer equipment and storage medium
CN111353512B (en) Obstacle classification method, obstacle classification device, storage medium and computer equipment
CN110599583B (en) Unmanned aerial vehicle flight trajectory generation method and device, computer equipment and storage medium
CN110458112B (en) Vehicle detection method and device, computer equipment and readable storage medium
KR102143108B1 (en) Lane recognition modeling method, device, storage medium and device, and recognition method, device, storage medium and device
EP3506161A1 (en) Method and apparatus for recovering point cloud data
CN111815707A (en) Point cloud determining method, point cloud screening device and computer equipment
CN110827202A (en) Target detection method, target detection device, computer equipment and storage medium
CN113870343A (en) Relative pose calibration method and device, computer equipment and storage medium
US20220319146A1 (en) Object detection method, object detection device, terminal device, and medium
CN115797454B (en) Multi-camera fusion sensing method and device under bird's eye view angle
US20220301277A1 (en) Target detection method, terminal device, and medium
CN112991537B (en) City scene reconstruction method and device, computer equipment and storage medium
US20220301176A1 (en) Object detection method, object detection device, terminal device, and medium
CN115410167A (en) Target detection and semantic segmentation method, device, equipment and storage medium
CN111445513A (en) Plant canopy volume obtaining method and device based on depth image, computer equipment and storage medium
CN116740681B (en) Target detection method, device, vehicle and storage medium
CN116912417A (en) Texture mapping method, device, equipment and storage medium based on three-dimensional reconstruction of human face
CN116469101A (en) Data labeling method, device, electronic equipment and storage medium
CN113902666B (en) Vehicle-mounted multiband stereoscopic vision sensing method, device, equipment and medium
US12125221B2 (en) Method and system for detecting a three-dimensional object in a two-dimensional image
CN113723380B (en) Face recognition method, device, equipment and storage medium based on radar technology
US11734855B2 (en) Rotation equivariant orientation estimation for omnidirectional localization
CN115542301A (en) Method, device and equipment for calibrating external parameters of laser radar and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant