WO2022017133A1 - Method and apparatus for processing point cloud data - Google Patents


Info

Publication number
WO2022017133A1
WO2022017133A1 · PCT/CN2021/102856 · CN2021102856W
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
grid
target
cloud data
point cloud
Prior art date
Application number
PCT/CN2021/102856
Other languages
French (fr)
Chinese (zh)
Inventor
王哲 (Wang Zhe)
石建萍 (Shi Jianping)
Original Assignee
商汤集团有限公司 (SenseTime Group Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 商汤集团有限公司 (SenseTime Group Limited)
Priority to JP2022514581A (publication JP2022547873A)
Priority to KR1020227007394A (publication KR20220044777A)
Publication of WO2022017133A1

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/02Systems using the reflection of electromagnetic waves other than radio waves
    • G01S17/06Systems determining position data of a target
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/02Systems using the reflection of electromagnetic waves other than radio waves
    • G01S17/06Systems determining position data of a target
    • G01S17/08Systems determining position data of a target for measuring distance only
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261Obstacle

Definitions

  • the present disclosure relates to the technical field of information processing, and in particular, to a method and device for processing point cloud data.
  • With its precise ranging capability, LiDAR is widely used in fields such as autonomous driving, UAV exploration, and mapping.
  • In the application scenario of autonomous driving, the point cloud data collected by lidar is generally processed to realize the positioning of the vehicle and the identification of obstacles, which consumes a lot of computing resources.
  • This calculation method has low efficiency and low utilization of computing resources.
  • the embodiments of the present disclosure provide at least one point cloud data processing method and device.
  • An embodiment of the present disclosure provides a point cloud data processing method, including: acquiring point cloud data to be processed, obtained by scanning by a radar device in a target scene; screening out target point cloud data from the point cloud data to be processed according to the effective perception range information corresponding to the target scene; and detecting the target point cloud data to obtain a detection result.
  • In this way, the point cloud data to be processed collected by the radar device in the target scene can be screened based on the effective perception range information corresponding to the target scene, and the screened target point cloud data is the point cloud data that is valid for the target scene. Therefore, performing the detection calculation based on the screened target point cloud data can reduce the amount of calculation and improve both the calculation efficiency and the utilization rate of computing resources in the target scene.
  • In a possible implementation, the effective perception range information corresponding to the target scene is determined as follows: obtaining computing resource information of a processing device; and determining, based on the computing resource information, the effective sensing range information corresponding to the target scene.
  • In a possible implementation, screening out the target point cloud data from the to-be-processed point cloud data includes: determining an effective coordinate range based on the effective sensing range information; and screening out the target point cloud data from the to-be-processed point cloud data based on the effective coordinate range.
  • In a possible implementation, determining the effective coordinate range based on the effective sensing range information includes: determining the effective coordinate range corresponding to the target scene based on the position information of the reference position point within the effective sensing range and the coordinate information of the reference position point in the target scene.
  • In a possible implementation, screening out target point cloud data from the point cloud data to be processed based on the effective coordinate range includes: taking the radar scanning points whose coordinate information is located within the effective coordinate range as the radar scanning points in the target point cloud data.
  • In a possible implementation, the coordinate information of the reference position point in the target scene is determined as follows: obtaining the position information of the intelligent driving device on which the radar device is arranged; determining, based on the position information of the intelligent driving device, the road type of the road where the intelligent driving device is located; and obtaining the coordinate information of the reference position point matching the road type as the coordinate information of the reference position point in the target scene.
  • The point cloud data that the intelligent driving device needs to process may differ when it is located on roads of different road types. Therefore, by obtaining the coordinate information of the reference position point matching the road type, the intelligent driving device can determine the effective coordinate range for the road type where it is currently located, and thus screen out the point cloud data for that road type, improving the accuracy of the detection results under different road types.
  • In a possible implementation, the detection result includes the position of the object to be identified in the target scene; detecting the target point cloud data to obtain the detection result includes: performing rasterization processing on the target point cloud data to obtain a grid matrix, where the value of each element in the grid matrix indicates whether a target point exists at the corresponding grid cell; generating a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene; and determining the position of the object to be identified in the target scene based on the generated sparse matrix.
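  • The rasterization step above can be sketched as follows; this is a minimal illustrative sketch (the grid extent, resolution, and 2D simplification are assumptions, not values from this disclosure), in which each target point is mapped to a grid cell and the cell's element is set to 1 when at least one point falls into it.

```python
def rasterize(points, x_min, y_min, grid_size, n_rows, n_cols):
    """Build an occupancy grid matrix from 2D point coordinates."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) / grid_size)
        row = int((y - y_min) / grid_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = 1  # a target point exists at this grid cell
    return grid

points = [(0.5, 0.5), (2.4, 1.1), (9.9, 9.9)]  # the last point falls outside the grid
grid = rasterize(points, x_min=0.0, y_min=0.0, grid_size=1.0, n_rows=4, n_cols=4)
```

An element of value 1 marks an occupied cell; the implicit correspondence between an element's indices and a coordinate range is what later allows target elements of the sparse matrix to be mapped back to positions.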
  • In a possible implementation, generating the sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene includes: performing at least one expansion (dilation) processing operation or erosion processing operation on the target elements in the grid matrix according to the grid matrix and the size information of the object to be identified, to generate the sparse matrix corresponding to the object to be identified; here, the value of a target element indicates that a target point exists at the corresponding grid cell.
  • In a possible implementation, the expansion processing operation or the erosion processing operation includes shift processing and logical operation processing, and the difference between the coordinate range of the sparse matrix and the size of the object to be identified is within a preset threshold range.
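  • One expansion (dilation) step built from shift processing and a logical OR, as mentioned above, can be sketched as follows; this is an assumed illustration (4-neighbour shifts on a binary grid), not the exact procedure of this disclosure.

```python
def dilate_once(grid):
    """Dilate a binary grid: OR the grid with copies of itself shifted one cell
    up, down, left and right, so each occupied cell grows into its neighbours."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        for r in range(n_rows):
            for c in range(n_cols):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols and grid[rr][cc]:
                    out[r][c] = 1  # logical OR with the shifted grid
    return out

g = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
d = dilate_once(g)
```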
  • In a possible implementation, performing at least one expansion processing operation on the elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified includes: performing a first inversion operation on the elements in the grid matrix before the current expansion processing operation, to obtain the grid matrix after the first inversion operation; performing at least one convolution operation on the grid matrix after the first inversion operation, to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene; and performing a second inversion operation on the elements in the grid matrix with the preset sparsity after the at least one convolution operation, to obtain the sparse matrix.
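  • The invert-convolve-invert route to dilation described above can be sketched as follows. This is a hedged illustration: a 3x3 all-ones neighbourhood stands in for the preset convolution kernel, and eroding the inverted (background) grid before inverting back is equivalent to dilating the foreground.

```python
def invert(grid):
    """First/second inversion operation: flip 0s and 1s."""
    return [[1 - v for v in row] for row in grid]

def erode_once(grid):
    """Keep a cell only if its full (border-clamped) 3x3 neighbourhood is 1."""
    n, m = len(grid), len(grid[0])
    return [[int(all(grid[rr][cc]
                     for rr in range(max(0, r - 1), min(n, r + 2))
                     for cc in range(max(0, c - 1), min(m, c + 2))))
             for c in range(m)] for r in range(n)]

def dilate_via_inversion(grid):
    # invert -> erode background -> invert back == dilate foreground
    return invert(erode_once(invert(grid)))

g = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
d = dilate_via_inversion(g)
```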
  • In a possible implementation, performing the first inversion operation on the elements in the grid matrix before the current expansion processing operation, to obtain the grid matrix after the first inversion operation, includes: based on a second preset convolution kernel, performing a convolution operation on the elements other than the target elements in the grid matrix before the current expansion processing operation, to obtain first inversion elements; based on the second preset convolution kernel, performing a convolution operation on the target elements in the grid matrix before the current expansion processing operation, to obtain second inversion elements; and obtaining the grid matrix after the first inversion operation based on the first inversion elements and the second inversion elements.
  • In a possible implementation, performing at least one convolution operation on the grid matrix after the first inversion operation based on a first preset convolution kernel, to obtain a grid matrix with a preset sparsity, includes: for the first convolution operation, convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation; and repeating the step of convolving the grid matrix after the previous convolution operation with the first preset convolution kernel to obtain the grid matrix after the current convolution operation, until the grid matrix with the preset sparsity is obtained.
  • In a possible implementation, the first preset convolution kernel has a weight matrix and an offset corresponding to the weight matrix; for the first convolution operation, convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation includes: selecting grid sub-matrices from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset stride; for each selected grid sub-matrix, multiplying the grid sub-matrix by the weight matrix to obtain a first operation result, and adding the offset to the first operation result to obtain a second operation result; and determining the grid matrix after the first convolution operation based on the second operation results corresponding to the grid sub-matrices.
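  • The sliding-window arithmetic described above can be sketched as follows; the 2x2 weight matrix, zero offset, and stride of 1 are illustrative assumptions, not values from this disclosure.

```python
def conv2d(grid, weights, offset, stride=1):
    """Slide the kernel over the grid: multiply each sub-matrix element-wise by
    the weight matrix and sum (first operation result), then add the offset
    (second operation result)."""
    k = len(weights)
    n, m = len(grid), len(grid[0])
    out = []
    for r in range(0, n - k + 1, stride):
        out_row = []
        for c in range(0, m - k + 1, stride):
            acc = sum(grid[r + i][c + j] * weights[i][j]
                      for i in range(k) for j in range(k))
            out_row.append(acc + offset)
        out.append(out_row)
    return out

g = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
w = [[1, 1],
     [1, 1]]
res = conv2d(g, w, offset=0)
```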
  • In a possible implementation, performing at least one erosion processing operation on the elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified includes: performing at least one convolution operation on the grid matrix to be processed based on a third preset convolution kernel, to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene; and determining the grid matrix with the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.
  • In a possible implementation, performing rasterization processing on the target point cloud data to obtain the grid matrix includes: performing rasterization processing on the target point cloud data to obtain the grid matrix and the correspondence between each element in the grid matrix and the coordinate range information of each target point; determining the position of the object to be identified in the target scene based on the generated sparse matrix includes: determining, based on this correspondence, the coordinate information of the target points corresponding to the target elements in the generated sparse matrix; and determining the position of the object to be identified in the target scene by combining the coordinate information of the target points.
  • In a possible implementation, determining the position of the object to be identified in the target scene based on the generated sparse matrix includes: performing at least one convolution process on the target elements in the generated sparse matrix based on a trained convolutional neural network to obtain a convolution result; and determining the position of the object to be identified in the target scene based on the convolution result.
  • In a possible implementation, the method further includes: controlling, based on the detection result, the intelligent driving device on which the radar device is arranged.
  • an embodiment of the present disclosure further provides a point cloud data processing device, including: an acquisition module for acquiring point cloud data to be processed obtained by scanning a radar device in a target scene; a screening module for according to the target The effective perception range information corresponding to the scene is used to filter out the target point cloud data from the to-be-processed point cloud data; the detection module is used to detect the target point cloud data to obtain a detection result.
  • Embodiments of the present disclosure further provide a computer device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps in the above first aspect or in any possible implementation manner of the first aspect are performed.
  • An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps in the first aspect or in any possible implementation manner of the first aspect are performed.
  • FIG. 1 shows a flowchart of a point cloud data processing method provided by an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of coordinates of each position point of a cuboid provided by an embodiment of the present disclosure
  • FIG. 3 shows a flowchart of a method for determining coordinate information of the reference position point provided by an embodiment of the present disclosure
  • FIG. 4 shows a flowchart of a method for determining a detection result provided by an embodiment of the present disclosure
  • FIG. 5A shows a schematic diagram of a grid matrix before encoding provided by an embodiment of the present disclosure
  • FIG. 5B shows a schematic diagram of a sparse matrix provided by an embodiment of the present disclosure
  • FIG. 5C shows a schematic diagram of an encoded grid matrix provided by an embodiment of the present disclosure
  • FIG. 6A shows a schematic diagram of a left-shifted grid matrix provided by an embodiment of the present disclosure
  • FIG. 6B shows a schematic diagram of a logical OR operation provided by an embodiment of the present disclosure
  • FIG. 7A shows a schematic diagram of a grid matrix after a first inversion operation provided by an embodiment of the present disclosure
  • FIG. 7B shows a schematic diagram of a grid matrix after a convolution operation provided by an embodiment of the present disclosure
  • FIG. 8 shows a schematic diagram of the architecture of a point cloud data processing apparatus provided by an embodiment of the present disclosure
  • FIG. 9 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
  • The present disclosure provides a point cloud data processing method and device, which screen the point cloud data to be processed, collected by the radar device in the target scene, based on the effective perception range information corresponding to the target scene; the screened target point cloud data is the point cloud data that is valid for the target scene. Therefore, performing the detection calculation based on the screened target point cloud data can reduce the amount of calculation and improve both the calculation efficiency and the utilization rate of computing resources in the target scene.
  • The execution subject of the point cloud data processing method provided by the embodiments of the present disclosure is generally a computer device with a certain computing capability. The computer device includes, for example, a terminal device, a server, or other processing device; the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a personal digital assistant (PDA), a computing device, a vehicle-mounted device, etc.
  • the point cloud data processing method may be implemented by a processor calling computer-readable instructions stored in a memory.
  • an embodiment of the present disclosure provides a point cloud data processing method, the method includes steps 101 to 103, wherein:
  • Step 101 Obtain point cloud data to be processed obtained by scanning the radar device in the target scene.
  • Step 102 Screen out target point cloud data from the to-be-processed point cloud data according to the effective perception range information corresponding to the target scene.
  • Step 103 Detect the target point cloud data to obtain a detection result.
  • the radar device can be deployed on an intelligent driving device, and during the driving process of the intelligent driving device, the radar device can scan to obtain point cloud data to be processed.
  • the effective sensing range information may include coordinate thresholds on each coordinate dimension in a reference coordinate system, where the reference coordinate system is a three-dimensional coordinate system.
  • the effective perception range information may be description information that constitutes a cuboid.
  • The description information may be the coordinate thresholds of the cuboid in each coordinate dimension of the reference coordinate system: the maximum value x_max and the minimum value x_min in the x-axis direction, the maximum value y_max and the minimum value y_min in the y-axis direction, and the maximum value z_max and the minimum value z_min in the z-axis direction.
  • FIG. 2 shows the coordinates of each position point of the cuboid constructed from these thresholds; the coordinate origin can be the lower-left vertex of the cuboid, whose coordinate value is (x_min, y_min, z_min).
  • The effective sensing range information may also be the description information of a sphere, a cube, etc.; for example, only the radius of the sphere or the edge length of the cube is given. The specific effective sensing range information can be set according to the actual application scenario, and the present disclosure does not limit this.
  • the constraints on the effective sensing range can be preset.
  • the values of x_max, y_max, and z_max can be set to be less than or equal to 200 meters.
  • In some application scenarios, the calculation based on the point cloud data is an operation on the spatial voxels corresponding to the point cloud data, such as a layer-by-layer learning network (VoxelNet) based on the three-dimensional spatial information of the point cloud. Therefore, in this application scenario, in addition to limiting the coordinate thresholds of the reference radar scanning points in each coordinate dimension of the reference coordinate system, the number of spatial voxels of the reference radar scanning points in each coordinate dimension may also be limited so as not to exceed a spatial voxel threshold.
  • the number of spatial voxels in each coordinate dimension can be calculated by the following formula:
  • N_x = (x_max - x_min) / x_gridsize
  • N_y = (y_max - y_min) / y_gridsize
  • N_z = (z_max - z_min) / z_gridsize
  • x_gridsize, y_gridsize, and z_gridsize respectively represent the preset resolutions of the corresponding dimensions; N_x, N_y, and N_z represent the number of spatial voxels in the x-axis, y-axis, and z-axis directions, respectively.
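  • The voxel-count formulas above amount to the following small helper; the ranges and resolutions used here are illustrative values, not values from this disclosure.

```python
def voxel_counts(x_rng, y_rng, z_rng, x_gridsize, y_gridsize, z_gridsize):
    """Number of spatial voxels per axis: (max - min) / resolution."""
    n_x = (x_rng[1] - x_rng[0]) / x_gridsize
    n_y = (y_rng[1] - y_rng[0]) / y_gridsize
    n_z = (z_rng[1] - z_rng[0]) / z_gridsize
    return n_x, n_y, n_z

n_x, n_y, n_z = voxel_counts((0.0, 100.0), (-50.0, 50.0), (-3.0, 5.0),
                             0.5, 0.5, 0.2)
```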
  • the calculation based on the point cloud data may also be an algorithm based on the point cloud data within the area of the top view, such as the point cloud-based fast target detection framework (PointPillars).
  • To limit the top-view voxel area, for example, the value of N_x*N_y can be limited.
  • In some embodiments, effective sensing range information obtained in advance through experiments may be used as a preset fixed value; this preset sensing range information also obeys the above constraints.
  • In other embodiments, the computing resource information of the processing device may first be obtained, and the effective perception range information corresponding to the target scene may then be determined based on the computing resource information.
  • the computing resource information includes at least one of the following information: the memory of the central processing unit (CPU), the video memory of the graphics processing unit (GPU), and the computing resources of the field programmable logic gate array (FPGA).
  • In a specific implementation, the correspondence between computing resource information at each level and effective sensing range information can be preset; then, when the method provided by the present disclosure is applied to different electronic devices, the effective sensing range information matching the computing resource information of the electronic device can be looked up based on this correspondence, or, when it is detected that the computing resource information of the electronic device changes, the effective sensing range information can be dynamically adjusted.
  • the correspondence between the computing resource information of each level and the effective sensing range information may be obtained through an experimental test in advance.
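  • Such a preset correspondence might be sketched as a lookup table like the following; the tiers, memory thresholds, and range values are assumptions for illustration only, not values from this disclosure.

```python
# Hypothetical preset correspondence: computing-resource level -> effective range.
RANGE_BY_TIER = {
    "low":    {"x_max": 50.0,  "y_max": 50.0,  "z_max": 10.0},
    "medium": {"x_max": 100.0, "y_max": 100.0, "z_max": 10.0},
    "high":   {"x_max": 200.0, "y_max": 200.0, "z_max": 10.0},
}

def effective_range(gpu_memory_gb):
    """Map available GPU memory (an assumed proxy for computing resources)
    to an effective sensing range."""
    if gpu_memory_gb < 4:
        tier = "low"
    elif gpu_memory_gb < 8:
        tier = "medium"
    else:
        tier = "high"
    return RANGE_BY_TIER[tier]

rng = effective_range(6)  # re-evaluated whenever the detected resources change
```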
  • When screening target point cloud data from the point cloud data to be processed according to the effective sensing range information corresponding to the target scene, the effective coordinate range may first be determined based on the effective sensing range information, and the target point cloud data may then be screened out from the point cloud data to be processed based on the effective coordinate range.
  • In some implementations, both the effective sensing range information and the effective coordinate range are fixed; in others, the effective coordinate range changes as the effective sensing range information changes.
  • For example, the effective sensing range information may be the description information of the cuboid, including its length, width, and height, with the radar device located at the intersection of the body diagonals of the cuboid. If the position of the radar device does not change, the cuboid is fixed, and the coordinate range within the cuboid is the effective coordinate range, so the effective coordinate range is also fixed.
  • In other implementations, the effective coordinate range corresponding to the target scene may be determined based on the position information of the reference position point within the effective sensing range and the coordinate information of the reference position point in the target scene.
  • The effective sensing range information may be the description information of the cuboid, and the reference position point may be the intersection of the body diagonals of the cuboid. As the reference position point changes, the effective sensing range information also changes in different target scenes, so the corresponding effective coordinate range changes as well.
  • The coordinate information of the reference position point in the target scene may be its coordinate information in the radar coordinate system corresponding to the target scene; the radar coordinate system may be the coordinate system of the radar used for collecting point cloud data in the target scene.
  • The reference position point may be the intersection of the body diagonals of the cuboid; if the effective sensing range information is the description information of a sphere, the reference position point may be the center of the sphere; alternatively, the reference position point can be any reference radar scanning point within the effective sensing range.
  • When determining the effective coordinate range corresponding to the target scene based on the position information of the reference position point within the effective perception range and the coordinate information of the reference position point in the target scene, the coordinate thresholds in each coordinate dimension of the effective sensing range information under the reference coordinate system may be converted, based on the coordinate information of the reference position point in the radar coordinate system, into coordinate thresholds in each coordinate dimension under the radar coordinate system.
  • the reference position point may have corresponding first coordinate information in the reference coordinate system, and may have corresponding second coordinate information in the radar coordinate system.
  • Based on the first coordinate information and the second coordinate information of the reference position point, the coordinate thresholds of the reference radar scanning points in each coordinate dimension under the reference coordinate system in the effective sensing range information can be converted into coordinate thresholds in each coordinate dimension under the radar coordinate system.
  • Specifically, the relative positional relationship between the reference position point and the threshold coordinate points corresponding to the coordinate thresholds of each coordinate dimension of the reference radar scanning points under the reference coordinate system may first be determined; then, based on this relative positional relationship, the coordinate thresholds in each coordinate dimension under the radar coordinate system can be determined from the coordinate thresholds in each coordinate dimension under the reference coordinate system.
  • When the coordinate information of the reference position point changes, the coordinate thresholds in each coordinate dimension under the radar coordinate system determined from it will also change accordingly; that is, the effective coordinate range corresponding to the target scene will change. It is therefore possible to control the effective coordinate range in different target scenes by controlling the coordinate information of the reference position point.
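  • Under the simplifying assumption that the reference coordinate system and the radar coordinate system differ only by a translation given by the reference position point's coordinates in each, the threshold conversion above can be sketched as:

```python
def convert_thresholds(thresholds_ref, ref_point_ref, ref_point_radar):
    """Shift each (min, max) threshold pair by the offset between the reference
    position point's coordinates in the two systems (pure-translation assumption)."""
    offset = [r - f for r, f in zip(ref_point_radar, ref_point_ref)]
    return {axis: (lo + offset[i], hi + offset[i])
            for i, (axis, (lo, hi)) in enumerate(thresholds_ref.items())}

thresholds = {"x": (-50.0, 50.0), "y": (-50.0, 50.0), "z": (-3.0, 5.0)}
converted = convert_thresholds(thresholds, (0.0, 0.0, 0.0), (10.0, 5.0, 0.0))
```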
  • In a specific implementation, the radar scanning points whose coordinate information is located within the effective coordinate range may be taken as the radar scanning points in the target point cloud data.
  • the three-dimensional coordinate information of the radar scan point can be stored, and then based on the three-dimensional coordinate information of the radar scan point, it can be determined whether the radar scan point is within the effective coordinate range.
  • assuming the three-dimensional coordinate information of a radar scanning point is (x, y, z), it can be determined whether this coordinate information satisfies the following conditions: x lies within the coordinate thresholds of the x dimension, y within those of the y dimension, and z within those of the z dimension.
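The per-dimension threshold check described above can be sketched as follows; the threshold values and function names are illustrative assumptions, not values from this disclosure.

```python
# Hypothetical coordinate thresholds for each dimension of the effective
# coordinate range (placeholder values, not from the disclosure).
X_MIN, X_MAX = -40.0, 40.0
Y_MIN, Y_MAX = 0.0, 80.0
Z_MIN, Z_MAX = -3.0, 5.0

def in_effective_range(x, y, z):
    """Return True if a radar scanning point (x, y, z) lies within the
    coordinate thresholds of every coordinate dimension."""
    return (X_MIN <= x <= X_MAX and
            Y_MIN <= y <= Y_MAX and
            Z_MIN <= z <= Z_MAX)

def filter_target_points(points):
    """Keep only the scanning points inside the effective coordinate range,
    i.e. the radar scanning points of the target point cloud data."""
    return [p for p in points if in_effective_range(*p)]
```

Points failing any one of the three dimension checks are dropped from the target point cloud data.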
  • the application of the above point cloud data processing method will be introduced in combination with specific application scenarios.
  • the above-mentioned point cloud data processing method can be applied to an automatic driving scene.
  • the intelligent driving device is provided with a radar device.
  • the coordinate information of the reference position point can be determined by the method shown in FIG. 3, which includes the following steps 301 to 303.
  • Step 301 Acquire location information of the intelligent driving device on which the radar device is set.
  • the location information of the intelligent driving device can be acquired, for example, based on a Global Positioning System (GPS); the present disclosure does not limit the other ways in which the location information of the intelligent driving device can be acquired.
  • Step 302 Determine the road type of the road where the smart driving device is located based on the location information of the smart driving device.
  • the road type of each road within the drivable range of the intelligent driving device may be preset, and the road types may include, for example, an intersection, a T-junction, a highway, a parking lot, etc.; based on the location information of the intelligent driving device, the road on which the device is located may be determined, and then the road type of that road may be determined according to the preset road type of each road within the drivable range of the intelligent driving device.
  • Step 303 Acquire coordinate information of a reference position point matching the road type.
  • the location of the point cloud data that needs focused processing may differ for different road types. For example, when the intelligent driving device is on a highway, the point cloud data it needs to process may be the point cloud data in front of it; when the device is at an intersection, the point cloud data it needs to process may be the point cloud data all around it. The reference position point matching the road type therefore enables screening of the point cloud data under that road type.
  • since the point cloud data that the intelligent driving device needs to process may differ when it is located on roads of different road types, by obtaining the coordinate information of the reference position point matching the road type, the intelligent driving device can determine the effective coordinate range for the road type where it is currently located, so as to screen out the point cloud data under the corresponding road type, thereby improving the accuracy of screening point cloud data.
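Steps 301 to 303 amount to a lookup from road type to a matching reference position point. A minimal sketch is shown below; the road-type names and coordinate values are hypothetical placeholders, not values from this disclosure.

```python
# Hypothetical mapping from road type to the coordinate information of the
# matching reference position point (assumed values for illustration only).
ROAD_TYPE_REFERENCE_POINTS = {
    "highway":      (0.0, 40.0, 0.0),  # focus on the area ahead of the device
    "intersection": (0.0, 0.0, 0.0),   # focus on the area all around the device
    "parking_lot":  (0.0, 5.0, 0.0),
}

def reference_point_for(road_type):
    """Step 303: acquire the reference position point matching the road type.
    Falls back to the device origin for unknown road types."""
    return ROAD_TYPE_REFERENCE_POINTS.get(road_type, (0.0, 0.0, 0.0))
```

The reference point returned here would then drive the effective-coordinate-range computation described earlier.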
  • after the target point cloud data is screened out from the point cloud data to be processed, the target point cloud data can also be detected, and after the detection result is obtained, the intelligent driving device provided with the radar apparatus can be controlled based on the detection result.
  • the detection of the object to be recognized (for example, an obstacle) during the driving process of the intelligent driving device can be realized based on the filtered target point cloud data.
  • Controlling the driving of the intelligent driving device may be controlling the acceleration, deceleration, steering, braking, and the like of the intelligent driving device.
  • the detection result includes the position of the object to be identified in the target scene.
  • the process of detecting the target point cloud data will be described in detail below with reference to specific embodiments, as shown in FIG. 4 .
  • An embodiment of the present disclosure provides a method for determining a detection result, which includes the following steps:
  • Step 401 Perform grid processing on the target point cloud data to obtain a grid matrix; the value of each element in the grid matrix is used to represent whether there is a target point at the corresponding grid.
  • the point corresponding to the target point cloud data is called a target point.
  • Step 402 Generate a sparse matrix corresponding to the to-be-identified object according to the grid matrix and the size information of the to-be-identified object in the target scene.
  • Step 403 Determine the position of the object to be identified in the target scene based on the generated sparse matrix.
  • rasterization may be performed first, and then the raster matrix obtained by the rasterization may be sparsely processed to generate a sparse matrix.
  • the rasterization process here can be a process of mapping the spatially distributed target point cloud data, including each target point, into a set grid and performing grid encoding based on the target points corresponding to each grid (yielding a zero-one matrix); the sparse processing process may be a process of performing, based on the size information of the object to be identified in the target scene, an expansion processing operation on the above zero-one matrix (whose result increases the number of elements set to 1) or an erosion processing operation (whose result reduces the number of elements set to 1).
  • the above-mentioned rasterization process and thinning process will be further described.
  • the target points distributed in the Cartesian continuous real number coordinate system may be converted into the rasterized discrete coordinate system.
  • suppose the embodiment of the present disclosure has target points such as point A (0.32m, 0.48m), point B (0.6m, 0.4801m) and point C (2.1m, 3.2m), and rasterization is performed with 1m as the grid width;
  • the range from (0m, 0m) to (1m, 1m) corresponds to the first grid,
  • the range from (0m, 1m) to (1m, 2m) corresponds to the second grid, and so on.
  • the rasterized points A'(0,0) and B'(0,0) then fall in the grid of the first row and first column, and C'(2,3) falls in the grid of the second row and the third column, thus realizing the conversion from the Cartesian continuous real-number coordinate system to the discrete coordinate system.
  • the coordinate information of the target points may be determined relative to a reference point (for example, the location of the radar device that collects the point cloud data), which will not be repeated here.
  • two-dimensional rasterization can be performed, and three-dimensional rasterization can also be performed.
  • the three-dimensional rasterization adds height information on the basis of the two-dimensional rasterization.
  • the limited space can be divided into N*M grids, which are generally divided at equal intervals, and the interval size can be configured.
  • in this way, a zero-one matrix (i.e., the above grid matrix) can be obtained.
  • each grid can be represented by coordinates consisting of a unique row number and column number; if a grid contains one of the above target points, the grid is encoded as 1, otherwise as 0, so that the encoded zero-one matrix can be obtained.
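The rasterization and zero-one encoding described above can be sketched as follows, using the example points A, B, C. The mapping of x to the column index and y to the row index is an assumption here; the disclosure's own row/column convention may differ.

```python
import numpy as np

def rasterize(points, grid_width=1.0, n_rows=4, n_cols=4):
    """Map target points from the continuous Cartesian coordinate system
    into grid cells and encode a zero-one grid matrix
    (1 = at least one target point falls in the cell)."""
    grid = np.zeros((n_rows, n_cols), dtype=np.uint8)
    for x, y in points:
        col = int(x // grid_width)  # discrete column index
        row = int(y // grid_width)  # discrete row index
        grid[row, col] = 1
    return grid

# Points A, B and C from the example above; A and B share one cell.
points = [(0.32, 0.48), (0.6, 0.4801), (2.1, 3.2)]
grid = rasterize(points)
```

Note that points A and B collapse into the same grid cell, so the grid matrix contains only two target elements for three target points.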
  • a sparse processing operation may be performed on the elements in the grid matrix according to the size information of the object to be identified in the target scene to generate a corresponding sparse matrix.
  • the size information about the object to be recognized may be acquired in advance.
  • the size information of the object to be recognized may be determined in combination with image data collected synchronously with the target point cloud data, or may be roughly estimated based on the specific application scenario.
  • for example, the object in front of the vehicle can be a vehicle, and its approximate size information can be determined to be 4m×4m.
  • the embodiment of the present disclosure may also determine the size information of the object to be recognized based on other manners, which is not specifically limited in the embodiment of the present disclosure.
  • the related sparse processing operation may be performing at least one expansion processing operation on the target elements in the grid matrix (that is, the elements representing the existence of a target point at the corresponding grid). The expansion processing operation here may be performed when the coordinate range of the grid matrix is smaller than the size of the object to be recognized in the target scene; that is, through one or more expansion processing operations, the range of elements representing the existence of target points at the corresponding grids can be expanded step by step so that the expanded element range matches the object to be identified.
  • the sparse processing operation in the embodiment of the present disclosure may also be performing at least one erosion processing operation on the target elements in the grid matrix. The erosion processing operation may be performed when the coordinate range of the grid matrix is larger than the size of the object to be identified in the target scene; that is, through one or more erosion processing operations, the range of elements representing the existence of target points at the corresponding grids is gradually reduced, so that the reduced element range can match the object to be identified, thereby enabling the determination of the position.
  • which of the following operations is performed (one expansion processing operation, multiple expansion processing operations, one erosion processing operation, or multiple erosion processing operations) depends on whether the difference between the coordinate range of the sparse matrix obtained through at least one shift processing and logic operation and the size of the object to be recognized in the target scene falls within the preset threshold range; that is, the expansion or erosion processing operation adopted in the present disclosure is constrained by the size information of the object to be recognized, so that the information represented by the determined sparse matrix is more consistent with the relevant information of the object to be identified.
  • the purpose of the sparse processing, whether based on the expansion processing operation or the erosion processing operation, is to enable the generated sparse matrix to represent more accurate relevant information of the object to be identified.
  • the above expansion processing operation may be implemented based on a shift operation and a logical OR operation, or may be implemented based on an inversion followed by convolution and a further inversion after the convolution.
  • the two approaches use different methods, but the resulting sparse matrices can be consistent.
  • the above erosion processing operation may be implemented based on a shift operation and a logical AND operation, or may be implemented directly based on a convolution operation.
  • although the two approaches use different methods, the resulting sparse matrices can be the same.
  • FIG. 5A is a schematic diagram of a grid matrix obtained after grid processing (corresponding to the matrix before encoding). By performing one eight-neighborhood expansion operation on each target element (corresponding to a grid with a filling effect) in the grid matrix, the corresponding sparse matrix of FIG. 5B can be obtained. It can be seen that, for each target element with a target point at the corresponding grid in FIG. 5A, the embodiment of the present disclosure performs an eight-neighborhood expansion operation, so that each target element becomes an element set after expansion, and the grid width corresponding to the element set may match the size of the object to be identified.
  • the above eight-neighborhood expansion operation may be a process of determining the elements whose abscissa or ordinate differs from that of the above target element by an absolute value of no more than 1. Except for elements at the edge of the grid, an element generally has eight such elements in its neighborhood (corresponding to the above element set); the input of the expansion processing can be the coordinate information of the six target elements, and the output can be the coordinate information of the element sets in the eight-neighborhoods of those target elements, as shown in FIG. 5B.
  • in addition, a four-neighborhood expansion operation can also be performed; this and other expansion operations are not specifically limited herein.
  • the embodiment of the present disclosure can also perform multiple expansion operations. For example, based on the expansion result shown in FIG. 5B, the expansion operation is performed again to obtain a sparse matrix with a larger range of element sets, which will not be repeated here.
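One eight-neighborhood expansion pass as described above can be sketched directly from its definition: every target element (value 1) turns the elements whose row and column indices differ from it by at most 1 into 1 as well. This is a minimal illustration, not the disclosure's implementation.

```python
import numpy as np

def dilate8(grid):
    """One eight-neighborhood expansion operation on a zero-one grid matrix:
    each target element spreads to the eight surrounding grid cells."""
    out = grid.copy()
    rows, cols = grid.shape
    for r, c in zip(*np.nonzero(grid)):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                # elements whose row/column index differs by at most 1
                if 0 <= rr < rows and 0 <= cc < cols:
                    out[rr, cc] = 1
    return out
```

Applying `dilate8` repeatedly yields the multiple-expansion behaviour mentioned above, each pass enlarging the element sets by one ring.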
  • the position of the object to be identified in the target scene can be determined.
  • the embodiments of the present disclosure can be specifically implemented through the following two aspects.
  • the position range of the object to be identified can be determined based on the correspondence between each element in the grid matrix and the coordinate range information of each target point. Specifically, the following steps can be used to achieve:
  • Step 1 Based on the correspondence between each element in the grid matrix and the coordinate range information of each target point, determine the coordinate information of the target point corresponding to each target element in the generated sparse matrix;
  • Step 2 Combine the coordinate information of the target points corresponding to each target element in the sparse matrix to determine the position of the object to be identified in the target scene.
  • each target element in the grid matrix may correspond to multiple target points.
  • the correspondence between the relevant elements and the coordinate range information of the multiple target points may be predetermined.
  • in this way, the coordinate information of the target points corresponding to each target element in the sparse matrix can be determined based on the predetermined correspondence between the above elements and the coordinate range information of each target point; that is, a de-rasterization processing operation is performed.
  • since the sparse matrix is obtained by sparse processing of the elements in the grid matrix that represent target points at the corresponding grids,
  • the values of the target elements in the sparse matrix here can also represent that a target point exists at the corresponding grid.
  • point A'(0,0) and point B'(0,0) indicated by the sparse matrix are in the first row and first column of the grid; point C'(2,3) is in the second row and third column
  • taking the above grids as an example, in the process of de-rasterization, mapping the first grid (0,0) back to the Cartesian coordinate system using its center gives (0.5m, 0.5m), and mapping the grid (2,3) back using its center gives (2.5m, 3.5m); that is, (0.5m, 0.5m) and (2.5m, 3.5m) are determined as the mapped coordinate information, so that the position of the object to be identified in the target scene can be determined by combining the mapped coordinate information.
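The de-rasterization step above maps a grid cell back to the Cartesian coordinate system through the cell centre, given the grid width. A minimal sketch:

```python
def cell_center(col, row, grid_width=1.0):
    """Map a discrete grid cell (col, row) back to continuous Cartesian
    coordinates via the centre of the cell."""
    return ((col + 0.5) * grid_width, (row + 0.5) * grid_width)
```

With a 1 m grid width this reproduces the example above: grid (0,0) maps to (0.5m, 0.5m) and grid (2,3) maps to (2.5m, 3.5m).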
  • the embodiments of the present disclosure can not only determine the location range of the object to be recognized based on the approximate relationship between the sparse matrix and the target detection result, but also determine the location range of the object to be recognized based on the trained convolutional neural network.
  • the embodiments of the present disclosure may first perform at least one convolution process on the generated sparse matrix based on the trained convolutional neural network, and then determine the position range of the object to be recognized based on the convolution result obtained by the convolution process.
  • the embodiments of the present disclosure can be implemented by combining shift processing and logical operations, and can also be implemented based on an inversion followed by convolution and a further inversion after the convolution.
  • one or more expansion processing operations may be performed based on at least one shift processing and logical OR operation; the specific number of expansion processing operations can be determined in combination with the size information of the object to be identified in the target scene.
  • the target element representing the existence of the target point at the corresponding grid can be shifted in multiple preset directions to obtain a plurality of corresponding shifted grid matrices.
  • the grid matrix and the plurality of shifted grid matrices corresponding to the first expansion processing operation are logically ORed, so that the sparse matrix after the first expansion processing operation can be obtained.
  • it can then be judged whether the coordinate range of the obtained sparse matrix is smaller than the size of the object to be identified, and whether the corresponding difference is large enough (for example, greater than a preset threshold). If so, shift processing in multiple preset directions and a logical OR operation can be performed again, in the above manner, on the target elements in the sparse matrix after the first expansion processing operation,
  • so as to obtain the sparse matrix after the second expansion processing operation, and so on, until it is determined that the difference between the coordinate range of the newly obtained sparse matrix and the size of the object to be identified in the target scene falls within the preset threshold range,
  • at which point the sparse matrix is determined.
  • the sparse matrix is essentially a zero-one matrix.
  • as the number of expansion processing operations increases, the number of target elements in the obtained sparse matrix representing the existence of target points at the corresponding grids also increases, and since the grids mapped by the zero-one matrix have width information,
  • the coordinate range corresponding to the target elements in the sparse matrix can be used to verify whether the size of the object to be recognized in the target scene has been reached, thereby improving the accuracy of subsequent target detection applications.
  • Step 1 Select a shifted grid matrix from the plurality of shifted grid matrices;
  • Step 2 Perform a logical OR operation on the grid matrix before the current expansion processing operation and the selected shifted grid matrix to obtain an operation result;
  • Step 3 Repeat the steps of selecting a shifted grid matrix not yet involved in the operation and performing a logical OR operation between the selected grid matrix and the latest operation result, until all the shifted grid matrices have been selected, to obtain the sparse matrix after the current expansion processing operation.
  • a shifted grid matrix can be selected from the shifted grid matrices.
  • a logical OR operation can then be performed between the grid matrix before the current expansion processing operation and the selected shifted grid matrix; by repeating this with the remaining shifted grid matrices,
  • the sparse matrix after the current expansion processing operation can be obtained.
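Steps 1 to 3 above can be sketched as follows for the four-neighborhood case; the shift helper pads vacated positions with zeros, which is an implementation assumption rather than a detail stated in the disclosure.

```python
import numpy as np

def shift(grid, dr, dc):
    """Shift a zero-one matrix by (dr, dc) rows/columns, padding with zeros."""
    out = np.zeros_like(grid)
    rows, cols = grid.shape
    src_r = slice(max(0, -dr), min(rows, rows - dr))
    src_c = slice(max(0, -dc), min(cols, cols - dc))
    dst_r = slice(max(0, dr), min(rows, rows + dr))
    dst_c = slice(max(0, dc), min(cols, cols + dc))
    out[dst_r, dst_c] = grid[src_r, src_c]
    return out

def dilate_shift_or(grid):
    """One expansion pass: OR the grid matrix with its four shifted copies
    (left, right, up, down), one shifted grid matrix at a time."""
    result = grid
    for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
        # OR the latest operation result with the next shifted grid matrix.
        result = result | shift(grid, dr, dc)
    return result
```

An eight-neighborhood pass would simply extend the direction list with the four diagonal shifts.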
  • the expansion processing operation in this embodiment of the present disclosure may be a four-neighbor expansion operation centered on the target element, an eight-neighbor expansion operation centered on the target element, or other neighborhood processing operation methods.
  • a corresponding neighborhood processing operation mode may be selected based on the size information of the object to be recognized, which is not specifically limited here.
  • the corresponding preset directions of the shift processing are not the same.
  • for the four-neighborhood expansion operation, the grid matrix can be subjected to shift processing in four preset directions,
  • namely left shift, right shift, up shift and down shift.
  • for the eight-neighborhood expansion operation, the grid matrix can be subjected to shift processing in eight preset directions, namely left shift, right shift, up shift, down shift, up and down shifts combined with a left shift, and up and down shifts combined with a right shift.
  • alternatively, after the shifted grid matrices for several shift directions are determined, a logical OR operation may first be performed; shift operations in further shift directions are then performed on the result, followed by the next logical OR operation, and so on, until the dilated sparse matrix is obtained.
  • the grid matrix before encoding shown in FIG. 5A can be converted into the grid matrix after encoding as shown in FIG. 5C , and then the first expansion processing operation is performed in conjunction with FIG. 6A to FIG. 6B .
  • the grid matrix shown in FIG. 5C is taken as a zero-one matrix, the positions of all "1"s in the matrix can represent the grid where the target element is located, and all the "0"s in the matrix can represent the background.
  • the matrix shift may be used to determine the neighborhood of all elements in the zero-one matrix whose element value is 1.
  • the left shift means that the column coordinates corresponding to all elements with the element value of 1 in the zero-one matrix are reduced by one, as shown in FIG. 6A;
  • the right shift means that the column coordinates corresponding to all elements with the element value of 1 in the zero-one matrix are increased by one;
  • moving up means subtracting one from the row coordinates corresponding to all elements whose value is 1 in the zero-one matrix, and moving down means adding one to the row coordinates corresponding to all elements whose value is 1 in the zero-one matrix.
  • embodiments of the present disclosure may combine the results of all neighborhoods using a matrix logical OR operation.
  • the matrix logical OR operation means that, given two zero-one matrix inputs of the same size, a logical OR operation is performed in turn on the zeros and ones at the same positions of the two matrices, and the obtained results form a new zero-one matrix as the output.
  • a specific example of a logical OR operation is shown in FIG. 6B.
  • in specific implementation, the left-shifted grid matrix, the right-shifted grid matrix, the up-shifted grid matrix and the down-shifted grid matrix can be made to participate in the logical OR operations in sequence. For example, a logical OR operation can first be performed between the grid matrix and the left-shifted grid matrix; the obtained result can then be ORed with the right-shifted grid matrix, then with the up-shifted grid matrix, and finally with the down-shifted grid matrix, so as to obtain the sparse matrix after the first expansion processing operation.
  • the above selection order of the shifted grid matrices is only a specific example; in practical applications, they can also be selected in other ways.
  • for example, the up-shifted and down-shifted grid matrices can be paired for a logical OR operation, and the left-shifted and right-shifted grid matrices paired for another logical OR operation.
  • in this way, the two logical OR operations can be performed in parallel, which can save computing time.
  • the expansion processing operation can also be implemented by combining convolution with two inversion operations, which can specifically include the following steps:
  • Step 1 Perform a first inversion operation on the elements in the grid matrix before the current expansion processing operation to obtain the grid matrix after the first inversion operation;
  • Step 2 Perform at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain a grid matrix with a preset sparsity after at least one convolution operation; the preset sparsity is determined by the size information of the object to be recognized in the target scene;
  • Step 3 Perform a second inversion operation on the elements in the grid matrix with the preset sparsity after at least one convolution operation to obtain the sparse matrix.
  • in this way, the expansion processing operation can be realized through inversion, convolution, and a further inversion after the convolution, and the obtained sparse matrix can likewise represent the relevant information of the object to be recognized to a certain extent.
  • the above convolution operation can be automatically combined with the convolutional neural network used in subsequent applications such as target detection, so the detection efficiency can be improved to a certain extent.
  • the inversion operation may be implemented based on a convolution operation, or may be implemented based on other inversion operation modes.
  • for the former, a convolution operation can be used for the implementation.
  • based on the second preset convolution kernel, a convolution operation can be performed on the elements other than the target elements in the grid matrix before the current expansion processing operation to obtain first inversion elements;
  • based on the second preset convolution kernel, a convolution operation can also be performed on the target elements in the grid matrix before the current expansion processing operation to obtain second inversion elements.
  • based on the first inversion elements and the second inversion elements, the grid matrix after the first inversion operation can be determined.
  • At least one convolution operation may be performed on the grid matrix after the first inversion operation by using the first preset convolution check, so as to obtain a grid matrix with a preset sparsity.
  • the expansion processing operation can be used as a means of increasing the number of target elements in the grid matrix
  • the above convolution operation can be regarded as a process of reducing the number of target elements in the grid matrix (corresponding to the erosion processing operation)
  • the convolution operation in the embodiment of the present disclosure is performed on the grid matrix after the first inversion operation; combining an inversion operation with the erosion processing operation and then performing the inversion operation again is an operation equivalent to the above expansion processing operation.
  • in specific implementation, the grid matrix after the first inversion operation is subjected to a convolution operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation.
  • if necessary, the grid matrix after the first convolution operation can be convolved again with the first preset convolution kernel to obtain the grid matrix after the second convolution operation, and so on, until a grid matrix with the preset sparsity is determined.
  • the above sparsity may be determined by the proportion distribution of target elements and non-target elements in the grid matrix.
  • the convolution operation in the embodiment of the present disclosure may be one time or multiple times.
  • the specific operation process of the first convolution operation can be described, including the following steps:
  • Step 1 For the first convolution operation, select each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and the preset step size;
  • Step 2 For each selected grid sub-matrix, perform a product operation on the grid sub-matrix and the weight matrix to obtain a first operation result, and perform an addition operation on the first operation result and the offset to obtain a second operation result. operation result;
  • Step 3 Determine the grid matrix after the first convolution operation based on the second operation result corresponding to each grid sub-matrix.
  • the grid matrix after the first inversion operation can be traversed in a traversal manner, so that for each grid sub-matrix traversed, the grid sub-matrix and the weight matrix can be multiplied to obtain the first operation result, and add the first operation result and the offset to obtain the second operation result.
  • the second operation results corresponding to the grid sub-matrices are combined as the corresponding matrix elements, so that the grid matrix after the first convolution operation can be obtained.
  • similarly, for each subsequent convolution operation, the grid matrix after the previous convolution operation can be traversed in the same manner: for each traversed grid sub-matrix, the grid sub-matrix is multiplied by the weight matrix to obtain a first operation result, and the first operation result is added to the offset to obtain a second operation result.
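Steps 1 to 3 above can be sketched as a plain sliding-window convolution; the step size, kernel and offset values below are illustrative assumptions.

```python
import numpy as np

def conv_once(grid, weights, bias, step=1):
    """One convolution pass over a grid matrix: slide a window of the kernel
    size with the given step, multiply each grid sub-matrix element-wise by
    the weight matrix and sum (first operation result), then add the offset
    (second operation result)."""
    kr, kc = weights.shape
    rows = (grid.shape[0] - kr) // step + 1
    cols = (grid.shape[1] - kc) // step + 1
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sub = grid[r * step:r * step + kr, c * step:c * step + kc]
            out[r, c] = (sub * weights).sum() + bias
    return out
```

The second operation results assembled into `out` form the grid matrix after the convolution operation.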
  • the encoded grid matrix shown in FIG. 5C is still taken as an example here, and the expansion processing operation is illustrated in conjunction with FIGS. 7A to 7B .
  • a 1*1 convolution kernel (that is, a second preset convolution kernel) can be used to implement the first inversion operation.
  • the weight of the second preset convolution kernel is -1 and the offset is 1.
  • a 3*3 convolution kernel (i.e., the first preset convolution kernel) combined with
  • a linear rectification function (Rectified Linear Unit, ReLU) can be used to implement the erosion processing operation.
  • each weight included in the weight matrix of the above first preset convolution kernel is 1, and the offset is -8.
  • in this way, the formula output = ReLU(input * weight + offset), where the input is the grid matrix after the first inversion operation, realizes the above erosion processing operation.
  • each nested layer of the convolutional network with the first preset convolution kernel superimposes one erosion operation, so that a grid matrix with a fixed sparsity can be obtained; performing the inversion operation again is then equivalent to an expansion processing operation, thereby realizing the generation of the sparse matrix.
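The inversion / convolution / inversion pipeline above can be sketched as follows. The inversion uses the 1*1 kernel (weight -1, offset 1, i.e. out = -x + 1); the erosion uses the 3*3 all-ones kernel with ReLU, with the offset taken as -8 (a positive offset would leave ReLU always positive). Padding the inverted matrix with ones at the border is an implementation assumption needed to make the equivalence hold at the edges.

```python
import numpy as np

def invert(grid):
    """First/second inversion operation: 1x1 kernel, weight -1, offset 1."""
    return -1 * grid + 1

def erode_conv(grid, pad_value=0):
    """Erosion via 3x3 all-ones convolution, offset -8, followed by ReLU:
    a cell stays 1 only when all nine cells of its window are 1."""
    padded = np.pad(grid, 1, constant_values=pad_value)
    out = np.zeros_like(grid)
    for r in range(grid.shape[0]):
        for c in range(grid.shape[1]):
            # ReLU(sum(window * 1) + (-8))
            out[r, c] = max(0, int(padded[r:r + 3, c:c + 3].sum()) - 8)
    return out

def dilate_via_erosion(grid):
    """Expansion as: invert, erode, invert again (pad the inverted
    background with ones so the border behaves like background)."""
    return invert(erode_conv(invert(grid), pad_value=1))
```

On a single target element this reproduces one eight-neighborhood expansion: the element becomes a 3x3 element set.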
  • the embodiments of the present disclosure may be implemented in combination with shift processing and logical operations, and may also be implemented based on convolution operations.
  • one or more erosion processing operations can be performed based on at least one shift processing and logical AND operation; the specific number of erosion processing operations can be determined in combination with the size information of the object to be identified in the target scene.
  • similarly, shift processing can be performed on the grid matrix first; the difference from the above expansion processing is that the logical operation here is a logical AND operation on the shifted grid matrices.
  • the erosion processing operation in the embodiment of the present disclosure may be a four-neighborhood erosion centered on the target element, an eight-neighborhood erosion centered on the target element, or another neighborhood processing operation.
  • in specific implementation, the corresponding neighborhood processing operation mode can be selected based on the size information of the object to be recognized, which is not specifically limited here.
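One erosion pass via shift processing and a logical AND, the counterpart of the shift-and-OR expansion, can be sketched as follows for the four-neighborhood case (directions and zero padding are assumptions for illustration).

```python
import numpy as np

def shift_pad(grid, dr, dc, fill=0):
    """Shift a zero-one matrix by (dr, dc), filling vacated cells."""
    out = np.full_like(grid, fill)
    rows, cols = grid.shape
    if abs(dr) < rows and abs(dc) < cols:
        out[max(0, dr):rows + min(0, dr), max(0, dc):cols + min(0, dc)] = \
            grid[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    return out

def erode_shift_and(grid):
    """One four-neighborhood erosion pass: a cell stays 1 only if the cell
    itself and its left, right, upper and lower neighbours are all 1."""
    result = grid
    for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
        # AND the latest result with the next shifted grid matrix.
        result = result & shift_pad(grid, dr, dc)
    return result
```

With zero fill, target elements at the border of the grid matrix are eroded away, mirroring the behaviour of the convolution-based erosion.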
  • the erosion processing operation can be implemented in combination with the convolution processing, which can be specifically implemented by the following steps:
  • Step 1 Perform at least one convolution operation on the grid matrix based on the third preset convolution kernel to obtain a grid matrix with a preset sparsity after at least one convolution operation; the preset sparsity is determined by the size information of the object to be identified in the target scene;
  • Step 2 Determine the grid matrix with the preset sparsity after at least one convolution operation as the sparse matrix corresponding to the object to be recognized.
  • the above convolution operation can be regarded as a process of reducing the number of target elements in the grid matrix, that is, an erosion process.
  • specifically, the grid matrix is convolved with the third preset convolution kernel to obtain the grid matrix after the first convolution operation, and the sparsity of the resulting matrix is evaluated.
  • if the preset sparsity has not been reached, the grid matrix after the first convolution operation can be convolved with the third preset convolution kernel again to obtain the grid matrix after the second convolution operation, and so on.
  • in this way, a grid matrix with the preset sparsity can be determined, that is, the sparse matrix corresponding to the object to be recognized is obtained.
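The two steps above can be sketched as follows, assuming (purely for illustration) that the third preset convolution kernel is a 3x3 all-ones kernel whose output is binarised — a cell stays a target element only when its full window sums to 9 — and that the preset sparsity is expressed as a maximum count of target elements. None of these parameter choices are fixed by the disclosure.

```python
import numpy as np

def erode_until_sparse(grid: np.ndarray, max_ones: int, max_iters: int = 10) -> np.ndarray:
    """Repeatedly convolve the grid matrix with a 3x3 all-ones kernel and
    binarise the result (which removes boundary target elements), until the
    number of target elements does not exceed the preset sparsity."""
    for _ in range(max_iters):
        if grid.sum() <= max_ones:
            break
        padded = np.pad(grid, 1)  # zero-pad so border windows see background
        windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
        # Convolution + threshold: keep a cell only if its 3x3 window is all 1s.
        grid = (windows.sum(axis=(-1, -2)) == 9).astype(grid.dtype)
    return grid
```

Each pass here plays the role of one convolution operation in Step 1, and the loop exit condition plays the role of the preset-sparsity check.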
  • the convolution operation in this embodiment of the present disclosure may be performed once or multiple times.
  • for the specific process of the convolution operation, please refer to the description of implementing expansion processing based on convolution and inversion in the first aspect above, which will not be repeated here.
  • convolutional neural networks with different data processing bit widths can be used to generate sparse matrices.
  • for example, 4 bits can be used to represent the input, output, and computational parameters of the network, such as the element values (0 or 1) of the grid matrix, weights, and offsets; alternatively, 8 bits can be used to match the network's processing bit width and improve operational efficiency.
  • the point cloud data to be processed, collected by the radar device in the target scene, can be screened based on the effective perception range information corresponding to the target scene; the screened target point cloud data is the valid point cloud data in the target scene. Therefore, performing detection in the target scene on the screened target point cloud data can reduce the amount of computation, improve computational efficiency, and increase the utilization of computing resources in the target scene.
  • the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiments of the present disclosure also provide a point cloud data processing device corresponding to the point cloud data processing method; since the principle by which the device solves the problem is similar to that of the method, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
  • the device includes: an acquisition module 801, a screening module 802, and a detection module 803; wherein,
  • an acquisition module 801, configured to acquire point cloud data to be processed obtained by scanning the radar device in the target scene;
  • a screening module 802 configured to screen out target point cloud data from the to-be-processed point cloud data according to the effective perception range information corresponding to the target scene;
  • the detection module 803 is configured to detect the target point cloud data to obtain a detection result.
  • the screening module 802 is further configured to determine the effective perception range information corresponding to the target scene in the following manner:
  • acquiring computing resource information of the processing device, and determining, based on the computing resource information, the effective sensing range information matched with the computing resource information.
  • the screening module 802, when screening out target point cloud data from the to-be-processed point cloud data according to the effective perception range information corresponding to the target scene, is configured to:
  • determine an effective coordinate range based on the effective sensing range information, and filter out the target point cloud data from the point cloud data to be processed based on the effective coordinate range.
  • the screening module 802, when determining the effective coordinate range based on the effective sensing range information, is configured to:
  • determine the effective coordinate range corresponding to the target scene based on the position information of the reference position point within the effective sensing range and the coordinate information of the reference position point in the target scene.
  • the screening module 802 when screening out target point cloud data from the to-be-processed point cloud data based on the valid coordinate range, is used to:
  • the radar scanning points whose corresponding coordinate information is located within the effective coordinate range are used as the radar scanning points in the target point cloud data.
  • the screening module 802 is further configured to determine the coordinate information of the reference position point in the target scene in the following manner:
  • acquiring the position information of the intelligent driving device on which the radar device is installed, determining the road type of the road on which the intelligent driving device is located based on that position information, and obtaining the coordinate information of the reference position point matching the road type as the coordinate information of the reference position point in the target scene.
  • the detection result includes the position of the object to be identified in the target scene
  • the detection module 803, when detecting the target point cloud data and obtaining a detection result, is configured to:
  • perform rasterization processing on the target point cloud data to obtain a grid matrix, in which the value of each element indicates whether a target point exists at the corresponding grid cell; generate a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene; and determine the position of the object to be identified in the target scene based on the generated sparse matrix.
  • the detection module 803, when generating a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene, is configured to:
  • perform at least one expansion processing operation or erosion processing operation on the target elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified;
  • where the value of a target element indicates that a target point exists at the corresponding grid cell.
  • the expansion processing operation or erosion processing operation performed by the detection module 803 includes shift processing and logical operation processing, and the difference between the coordinate range of the sparse matrix and the size of the object to be identified is within a preset threshold range.
  • the detection module 803, when performing at least one expansion processing operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified, is configured to:
  • perform a first inversion operation on the elements in the grid matrix before the current expansion processing operation to obtain the grid matrix after the first inversion operation; perform at least one convolution operation on that matrix based on the first preset convolution kernel to obtain a grid matrix with a preset sparsity; and perform a second inversion operation on the elements in the grid matrix with the preset sparsity to obtain the sparse matrix.
  • the detection module 803, when performing the first inversion operation on the elements in the grid matrix before the current expansion processing operation to obtain the grid matrix after the first inversion operation, is configured to:
  • perform a convolution operation, based on the second preset convolution kernel, on the elements other than the target elements in the grid matrix before the current expansion processing operation to obtain first inversion elements; perform a convolution operation, based on the second preset convolution kernel, on the target elements in that grid matrix to obtain second inversion elements; and obtain the grid matrix after the first inversion operation based on the first inversion elements and the second inversion elements.
  • the detection module 803, when performing at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain a grid matrix with a preset sparsity, is configured to:
  • for the first convolution operation, perform a convolution operation on the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation; and repeat the step of convolving the grid matrix after the previous convolution operation with the first preset convolution kernel until the grid matrix with the preset sparsity is obtained.
  • the first preset convolution kernel used by the detection module 803 has a weight matrix and an offset corresponding to the weight matrix; for the first convolution operation, when convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation, the detection module 803 is configured to:
  • select each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset step length;
  • for each selected grid sub-matrix, perform a product operation on the grid sub-matrix and the weight matrix to obtain a first operation result, and add the offset to the first operation result to obtain a second operation result;
  • determine the grid matrix after the first convolution operation based on the second operation results corresponding to the grid sub-matrices.
  • the detection module 803, when performing at least one erosion processing operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified, is configured to:
  • perform at least one convolution operation on the grid matrix based on the third preset convolution kernel to obtain a grid matrix with a preset sparsity, the preset sparsity being determined by the size information of the object to be identified in the target scene; and determine the grid matrix with the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.
  • the detection module 803, when performing rasterization processing on the target point cloud data to obtain a grid matrix, is configured to: obtain the grid matrix together with the correspondence between each element in the grid matrix and the coordinate range information of each target point.
  • the detection module 803, when determining the position range of the object to be identified in the target scene based on the generated sparse matrix, is configured to:
  • determine, based on the correspondence between each element in the grid matrix and the coordinate range information of each target point, the coordinate information of the target point corresponding to each target element in the generated sparse matrix; and combine the coordinate information of the target points corresponding to the target elements in the sparse matrix to determine the position of the object to be identified in the target scene.
  • the detection module 803, when determining the position of the object to be identified in the target scene based on the generated sparse matrix, is configured to:
  • perform at least one convolution process on each target element in the generated sparse matrix based on a trained convolutional neural network to obtain a convolution result, and determine the position of the object to be identified in the target scene based on the convolution result.
  • the device further includes a control module 804, configured to: after the target point cloud data is detected and a detection result is obtained, control, based on the detection result, the intelligent driving device on which the radar device is installed.
  • the point cloud data to be processed, collected by the radar device in the target scene, can be screened based on the effective perception range information corresponding to the target scene; the screened target point cloud data is the point cloud data that is valid for the target scene. Therefore, performing detection in the target scene based on the screened point cloud data can reduce the amount of computation, improve computational efficiency, and increase the utilization of computing resources in the target scene.
  • an embodiment of the present disclosure further provides a computer device, including a processor 901, a memory 902, and a bus 903.
  • the memory 902 includes a memory 9021 and an external memory 9022 for storing execution instructions; the memory 9021 here is also called an internal memory, and is used for temporarily storing the operation data in the processor 901 and the data exchanged with the external memory 9022 such as a hard disk.
  • the processor 901 exchanges data with the external memory 9022 through the memory 9021, and when the computer device 900 is running, the processor 901 and the memory 902 communicate through the bus 903, so that the processor 901 executes the following instructions:
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the point cloud data processing method described in the foregoing method embodiments is performed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • the computer program product of the point cloud data processing method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the point cloud data processing method described in the above method embodiments; for details, reference may be made to the foregoing method embodiments, which are not repeated here.
  • Embodiments of the present disclosure also provide a computer program, which implements any one of the methods in the foregoing embodiments when the computer program is executed by a processor.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

Provided in the present disclosure are a method and apparatus for processing point cloud data, the method comprising: acquiring point cloud data to be processed, which is obtained by a radar apparatus scanning in a target scenario; according to effective sensing range information corresponding to the target scenario, filtering out target point cloud data from the point cloud data to be processed; and detecting the target point cloud data, and obtaining a detection result.

Description

Method and device for processing point cloud data

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application claims priority to Chinese patent application No. 202010713989.6, filed on July 22, 2020 and entitled "A point cloud data processing method and device", which is incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the technical field of information processing, and in particular, to a method and device for processing point cloud data.
BACKGROUND

With the development of science and technology, lidar, with its precise ranging capability, has been widely used in fields such as autonomous driving, UAV surveying, and map making. Taking autonomous driving as an example, in autonomous driving application scenarios, the point cloud data collected by the lidar is generally processed to realize vehicle positioning and obstacle recognition. Processing point cloud data generally consumes substantial computing resources; however, the computing resources of the electronic devices that process the point cloud data are limited, and not all point cloud data is useful for vehicle positioning and obstacle recognition. As a result, this approach has low computational efficiency and low utilization of computing resources.
SUMMARY OF THE INVENTION

The embodiments of the present disclosure provide at least a point cloud data processing method and device.

In a first aspect, an embodiment of the present disclosure provides a point cloud data processing method, including: acquiring point cloud data to be processed, obtained by a radar device scanning in a target scene; screening out target point cloud data from the point cloud data to be processed according to effective perception range information corresponding to the target scene; and detecting the target point cloud data to obtain a detection result.

Based on the above method, the point cloud data to be processed, collected by the radar device in the target scene, can be screened based on the effective perception range information corresponding to the target scene; the screened target point cloud data is the point cloud data that is valid for the target scene. Therefore, performing detection in the target scene based on the screened point cloud data can reduce the amount of computation, improve computational efficiency, and increase the utilization of computing resources in the target scene.
In a possible implementation, the effective perception range information corresponding to the target scene is determined as follows: acquiring computing resource information of a processing device; and determining, based on the computing resource information, the effective sensing range information matched with the computing resource information.

In this way, different effective perception range information can be determined for different electronic devices that process the point cloud data to be processed in the same target scene, so that the method can be adapted to different electronic devices.
In a possible implementation, screening out target point cloud data from the point cloud data to be processed according to the effective perception range information corresponding to the target scene includes: determining an effective coordinate range based on the effective sensing range information; and screening out the target point cloud data from the point cloud data to be processed based on the effective coordinate range.

In a possible implementation, determining the effective coordinate range based on the effective sensing range information includes: determining the effective coordinate range corresponding to the target scene based on position information of a reference position point within the effective sensing range and coordinate information of the reference position point in the target scene.
In a possible implementation, screening out target point cloud data from the point cloud data to be processed based on the effective coordinate range includes: taking radar scan points whose corresponding coordinate information is located within the effective coordinate range as the radar scan points in the target point cloud data.
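As a sketch of this screening step, the following assumes the point cloud is an (N, 3) NumPy array of (x, y, z) coordinates and the effective coordinate range is a set of per-axis (min, max) bounds; both the data layout and the function name are illustrative assumptions, not the disclosure's fixed interface.

```python
import numpy as np

def filter_points(points: np.ndarray, valid_range: dict) -> np.ndarray:
    """Keep only radar scan points whose coordinates fall inside the
    effective coordinate range (hypothetical per-axis bounds)."""
    (x0, x1), (y0, y1), (z0, z1) = valid_range["x"], valid_range["y"], valid_range["z"]
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # A point survives only if every coordinate lies within its axis bounds.
    mask = (x >= x0) & (x <= x1) & (y >= y0) & (y <= y1) & (z >= z0) & (z <= z1)
    return points[mask]
```

Because the mask is vectorised, the screening cost is linear in the number of scan points, which is what makes discarding out-of-range points cheaper than running detection on them.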
In a possible implementation, the coordinate information of the reference position point in the target scene is determined as follows: acquiring position information of the intelligent driving device on which the radar device is installed; determining, based on the position information of the intelligent driving device, the road type of the road on which the intelligent driving device is located; and obtaining the coordinate information of a reference position point matching the road type as the coordinate information of the reference position point in the target scene.

Here, the point cloud data that the intelligent driving device needs to process may differ when it is located on roads of different types. Therefore, by obtaining the coordinate information of a reference position point matching the road type, an effective coordinate range adapted to the road type on which the intelligent driving device is currently located can be determined, so that the point cloud data corresponding to that road type is screened out, thereby improving the accuracy of the detection results of the intelligent driving device on different road types.
In a possible implementation, the detection result includes the position of the object to be identified in the target scene; detecting the target point cloud data to obtain the detection result includes: performing rasterization processing on the target point cloud data to obtain a grid matrix, where the value of each element in the grid matrix indicates whether a target point exists at the corresponding grid cell; generating a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene; and determining the position of the object to be identified in the target scene based on the generated sparse matrix.
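The rasterization step just described can be sketched as follows; the square cell size, the x/y-plane layout of the grid, and the function name are illustrative assumptions.

```python
import numpy as np

def rasterize(points: np.ndarray, cell: float, x_range, y_range) -> np.ndarray:
    """Divide the effective x/y range into cells of side `cell` and set a
    grid element to 1 when at least one target point falls into the
    corresponding cell (0 otherwise)."""
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    grid = np.zeros((ny, nx), dtype=np.int8)
    # Map each point's x/y coordinates to integer cell indices.
    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    grid[iy[inside], ix[inside]] = 1  # element value 1: a target point exists here
    return grid
```

Retaining the mapping from element indices back to cell coordinate ranges is what later allows the positions of target elements in the sparse matrix to be converted back into scene coordinates.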
In a possible implementation, generating the sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene includes: performing at least one expansion processing operation or erosion processing operation on the target elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified, where the value of a target element indicates that a target point exists at the corresponding grid cell.

In a possible implementation, the expansion processing operation or erosion processing operation includes shift processing and logical operation processing, and the difference between the coordinate range of the sparse matrix and the size of the object to be identified is within a preset threshold range.
In a possible implementation, performing at least one expansion processing operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified includes: performing a first inversion operation on the elements in the grid matrix before the current expansion processing operation to obtain the grid matrix after the first inversion operation; performing at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene; and performing a second inversion operation on the elements in the grid matrix with the preset sparsity to obtain the sparse matrix.
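A minimal sketch of this invert-convolve-invert expansion follows, with a 3x3 binarised all-ones kernel standing in for the first preset convolution kernel (the kernel size, the binarisation rule, and the padding convention are all illustrative assumptions). The middle convolution erodes the inverted matrix, so the second inversion yields a dilated version of the original.

```python
import numpy as np

def dilate_by_inversion(grid: np.ndarray) -> np.ndarray:
    """One expansion pass: first inversion, binarised 3x3 convolution
    (erodes the inverted matrix), second inversion."""
    inverted = 1 - grid                               # first inversion
    padded = np.pad(inverted, 1, constant_values=1)   # pad with background (=1)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    # Convolution + threshold: a cell of the inverted matrix survives only
    # if its entire 3x3 window is background.
    eroded = (windows.sum(axis=(-1, -2)) == 9).astype(grid.dtype)
    return 1 - eroded                                 # second inversion
```

Applied to a single target element, one pass grows it into a 3x3 block, which is the expected effect of one dilation step.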
In a possible implementation, performing the first inversion operation on the elements in the grid matrix before the current expansion processing operation to obtain the grid matrix after the first inversion operation includes: performing, based on the second preset convolution kernel, a convolution operation on the elements other than the target elements in the grid matrix before the current expansion processing operation to obtain first inversion elements; performing, based on the second preset convolution kernel, a convolution operation on the target elements in the grid matrix before the current expansion processing operation to obtain second inversion elements; and obtaining the grid matrix after the first inversion operation based on the first inversion elements and the second inversion elements.

In a possible implementation, performing at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain the grid matrix with the preset sparsity includes: for the first convolution operation, convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation; and repeating the step of convolving the grid matrix after the previous convolution operation with the first preset convolution kernel to obtain the grid matrix after the current convolution operation, until the grid matrix with the preset sparsity is obtained.
In a possible implementation, the first preset convolution kernel has a weight matrix and an offset corresponding to the weight matrix; for the first convolution operation, convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation includes: for the first convolution operation, selecting each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset step length; for each selected grid sub-matrix, performing a product operation on the grid sub-matrix and the weight matrix to obtain a first operation result, and adding the offset to the first operation result to obtain a second operation result; and determining the grid matrix after the first convolution operation based on the second operation results corresponding to the grid sub-matrices.
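The sub-matrix selection, product operation, and offset addition described here can be sketched as one convolution pass; the concrete weight and bias values used below are illustrative, not preset values from the disclosure.

```python
import numpy as np

def conv_step(grid: np.ndarray, weight: np.ndarray, bias: float, stride: int = 1) -> np.ndarray:
    """One convolution operation: select each grid sub-matrix by kernel
    size and step length, multiply it elementwise with the weight matrix
    (first operation result), then add the offset (second operation result)."""
    k = weight.shape[0]
    rows = (grid.shape[0] - k) // stride + 1
    cols = (grid.shape[1] - k) // stride + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            sub = grid[i * stride:i * stride + k, j * stride:j * stride + k]
            first = (sub * weight).sum()   # product of sub-matrix and weight matrix
            out[i, j] = first + bias       # add the offset
    return out
```

With an all-ones 2x2 weight matrix and a bias of -4, for example, the output is zero exactly where a window is entirely target elements, which is one way a binarisation threshold can be built into the kernel's offset.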
In a possible implementation, performing at least one erosion processing operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified includes: performing at least one convolution operation on the grid matrix to be processed based on the third preset convolution kernel to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene; and determining the grid matrix with the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.

In a possible implementation, performing rasterization processing on the target point cloud data to obtain a grid matrix includes: performing rasterization processing on the target point cloud data to obtain the grid matrix and the correspondence between each element in the grid matrix and the coordinate range information of each target point; determining the position range of the object to be identified in the target scene based on the generated sparse matrix includes: determining, based on the correspondence between each element in the grid matrix and the coordinate range information of each target point, the coordinate information of the target point corresponding to each target element in the generated sparse matrix; and combining the coordinate information of the target points corresponding to the target elements in the sparse matrix to determine the position of the object to be identified in the target scene.

In a possible implementation, determining the position of the object to be identified in the target scene based on the generated sparse matrix includes: performing at least one convolution process on each target element in the generated sparse matrix based on a trained convolutional neural network to obtain a convolution result; and determining the position of the object to be identified in the target scene based on the convolution result.

In a possible implementation, after detecting the target point cloud data and obtaining the detection result, the method further includes: controlling, based on the detection result, the intelligent driving device on which the radar device is installed.
In a second aspect, an embodiment of the present disclosure further provides a point cloud data processing device, including: an acquisition module configured to acquire point cloud data to be processed, obtained by a radar device scanning in a target scene; a screening module configured to screen out target point cloud data from the point cloud data to be processed according to the effective perception range information corresponding to the target scene; and a detection module configured to detect the target point cloud data to obtain a detection result.

In a third aspect, embodiments of the present disclosure further provide a computer device, including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect or any possible implementation of the first aspect are performed.

In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the first aspect or any possible implementation of the first aspect are performed.

For a description of the effects of the above point cloud data processing device, computer device, and computer-readable storage medium, reference may be made to the description of the above point cloud data processing method, which is not repeated here.

In order to make the above objects, features, and advantages of the present disclosure more apparent and understandable, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Description of Drawings
FIG. 1 shows a flowchart of a point cloud data processing method provided by an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of the coordinates of the position points of a cuboid provided by an embodiment of the present disclosure;
FIG. 3 shows a flowchart of a method for determining the coordinate information of a reference position point provided by an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a method for determining a detection result provided by an embodiment of the present disclosure;
FIG. 5A shows a schematic diagram of a grid matrix before encoding provided by an embodiment of the present disclosure;
FIG. 5B shows a schematic diagram of a sparse matrix provided by an embodiment of the present disclosure;
FIG. 5C shows a schematic diagram of an encoded grid matrix provided by an embodiment of the present disclosure;
FIG. 6A shows a schematic diagram of a left-shifted grid matrix provided by an embodiment of the present disclosure;
FIG. 6B shows a schematic diagram of a logical OR operation provided by an embodiment of the present disclosure;
FIG. 7A shows a schematic diagram of a grid matrix after a first inversion operation provided by an embodiment of the present disclosure;
FIG. 7B shows a schematic diagram of a grid matrix after a convolution operation provided by an embodiment of the present disclosure;
FIG. 8 shows a schematic architecture diagram of a point cloud data processing apparatus provided by an embodiment of the present disclosure;
FIG. 9 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
In the related art, processing point cloud data generally consumes considerable computing resources, yet not all of the collected point cloud data contributes to the required computation results; unnecessary point cloud data participating in the computation leads to a waste of computing resources.
In view of this, the present disclosure provides a point cloud data processing method and apparatus. Based on the effective sensing range information corresponding to a target scene, the to-be-processed point cloud data collected by a radar apparatus in the target scene can be screened; the screened-out target point cloud data is the point cloud data that is valid in the target scene. Therefore, performing detection computation in the target scene based on the screened-out target point cloud data reduces the amount of computation, improves computational efficiency, and raises the utilization of computing resources in the target scene.
The defects of the above solutions are results obtained by the inventors through practice and careful study. Therefore, both the process of discovering the above problems and the solutions to them proposed hereinafter by the present disclosure should be regarded as the inventors' contributions to the present disclosure.
It should be noted that similar numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
To facilitate understanding of this embodiment, a point cloud data processing method disclosed in an embodiment of the present disclosure is first introduced in detail. The execution subject of the method is generally a computer device with certain computing capability, for example a terminal device, a server, or another processing device. The terminal device may be a user equipment (UE), a mobile device, a user terminal, a terminal, a personal digital assistant (PDA), a computing device, a vehicle-mounted device, or the like. In some possible implementations, the point cloud data processing method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 1, an embodiment of the present disclosure provides a point cloud data processing method, which includes steps 101 to 103:
Step 101: acquire to-be-processed point cloud data obtained by a radar apparatus scanning a target scene.
Step 102: screen target point cloud data out of the to-be-processed point cloud data according to effective sensing range information corresponding to the target scene.
Step 103: detect the target point cloud data to obtain a detection result.
Steps 101 to 103 are described in detail below.
The radar apparatus may be deployed on an intelligent driving device; while the intelligent driving device is traveling, the radar apparatus may scan to obtain the to-be-processed point cloud data.
The effective sensing range information may include coordinate thresholds in each coordinate dimension of a reference coordinate system, the reference coordinate system being a three-dimensional coordinate system.
Exemplarily, the effective sensing range information may be description information defining a cuboid. For example, the description information may be the coordinate thresholds of the cuboid's length, width, and height in each coordinate dimension of the reference coordinate system, including a maximum value x_max and a minimum value x_min in the x-axis direction, a maximum value y_max and a minimum value y_min in the y-axis direction, and a maximum value z_max and a minimum value z_min in the z-axis direction.
Exemplarily, FIG. 2 shows the coordinates of the position points of the cuboid defined by the maximum value x_max and minimum value x_min in the x-axis direction, the maximum value y_max and minimum value y_min in the y-axis direction, and the maximum value z_max and minimum value z_min in the z-axis direction. The coordinate origin may be the lower-left vertex of the cuboid, whose coordinate value is (x_min, y_min, z_min).
In another possible implementation, the effective sensing range information may also be description information of a sphere, a cube, or the like; for example, only the radius of the sphere, or the length, width, and height of the cube, is given. The specific effective sensing range information may be described according to the actual application scenario, which is not limited by the present disclosure.
In specific implementation, since the scanning range of the radar apparatus is limited (for example, the farthest scanning distance is 200 meters), in order to ensure that the effective sensing range constrains the to-be-processed point cloud data, constraints on the effective sensing range may be preset; exemplarily, x_max, y_max, and z_max may all be set to values less than or equal to 200 meters.
In a possible application scenario, the computation based on the point cloud data operates on the spatial voxels corresponding to the point cloud data, for example a network that learns the three-dimensional spatial information of the point cloud layer by layer (VoxelNet). In this application scenario, in addition to limiting the coordinate thresholds of the reference radar scanning points in each coordinate dimension of the reference coordinate system, the number of spatial voxels of the reference radar scanning points in each coordinate dimension may also be limited so as not to exceed a spatial voxel threshold.
Exemplarily, the number of spatial voxels in each coordinate dimension may be calculated by the following formulas:
N_x = (x_max - x_min) / x_gridsize;
N_y = (y_max - y_min) / y_gridsize;
N_z = (z_max - z_min) / z_gridsize.
Here, x_gridsize, y_gridsize, and z_gridsize denote the preset resolution of the corresponding dimension, N_x denotes the number of spatial voxels in the x-axis direction, N_y the number in the y-axis direction, and N_z the number in the z-axis direction.
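As a quick sketch, the voxel-count formulas above translate directly into code; the range and resolution values used in the example are illustrative only, not values prescribed by this disclosure.

```python
def voxel_counts(x_min, x_max, y_min, y_max, z_min, z_max,
                 x_gridsize, y_gridsize, z_gridsize):
    """Number of spatial voxels per axis: N = (max - min) / gridsize."""
    n_x = (x_max - x_min) / x_gridsize
    n_y = (y_max - y_min) / y_gridsize
    n_z = (z_max - z_min) / z_gridsize
    return n_x, n_y, n_z

# Example: a 200 m x 200 m x 5 m range at 0.5 m / 0.5 m / 0.25 m resolution.
print(voxel_counts(-100, 100, -100, 100, 0, 5, 0.5, 0.5, 0.25))
# -> (400.0, 400.0, 20.0)
```

A voxel threshold is then simply an upper bound checked against these counts (e.g. requiring n_x * n_y not to exceed a preset top-view voxel limit).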
In another possible application scenario, the computation based on the point cloud data may be an algorithm that operates on the point cloud data within a top-view area, for example a fast object detection framework based on point clouds (PointPillars); in this case, the top-view voxel area may also be limited, for example by limiting the value of N_x * N_y.
In a possible implementation, when determining the effective sensing range information corresponding to the target scene, effective sensing range information obtained in advance through experiments may be acquired; this information may serve as a preset, fixed value within the target scene, and it likewise obeys the above constraints.
In another possible implementation, when determining the effective sensing range information corresponding to the target scene, the computing resource information of the processing device may first be acquired; then, based on the computing resource information, the effective sensing range information matching the computing resource information is determined.
The computing resource information includes at least one of the following: the memory of a central processing unit (CPU), the video memory of a graphics processing unit (GPU), and the computing resources of a field-programmable gate array (FPGA).
Specifically, when determining the effective sensing range information matching the computing resource information, a correspondence between each level of computing resource information and effective sensing range information may be preset. When the method provided by the present disclosure is applied to different electronic devices, the effective sensing range information matching the computing resource information of an electronic device can be looked up based on this correspondence; alternatively, when a change in the computing resource information of the electronic device is detected, the effective sensing range information can be adjusted dynamically.
Taking the case where the computing resource information includes the CPU memory as an example, the correspondence between each level of computing resource information and effective sensing range information may be as shown in Table 1 below:
Table 1
[Table 1 is rendered as an image in the original document (Figure PCTCN2021102856-appb-000001); its contents are not reproduced here.]
The correspondence between each level of computing resource information and effective sensing range information may be obtained in advance through experimental testing.
In this way, different effective sensing range information can be determined for different electronic devices that process the to-be-processed point cloud data in the same target scene, so that the method adapts to different electronic devices.
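The level-based lookup described above can be sketched as follows. Since Table 1 is not reproduced in this text, the level boundaries and range values below are entirely hypothetical placeholders, chosen only to illustrate the lookup mechanism.

```python
# Hypothetical mapping from CPU-memory level to effective sensing range
# (x_max, y_max, z_max in meters). The actual values of Table 1 are not
# reproduced in this document; these numbers are placeholders.
LEVELS = [
    (4,  (50.0,  50.0,  5.0)),   # devices with <= 4 GB of CPU memory
    (8,  (100.0, 100.0, 5.0)),   # devices with <= 8 GB of CPU memory
    (16, (200.0, 200.0, 5.0)),   # devices with <= 16 GB of CPU memory
]

def effective_range_for(cpu_mem_gb):
    """Return the preset effective sensing range matching the device's
    CPU memory; well-resourced devices get the largest preset range."""
    for mem_limit, sensing_range in LEVELS:
        if cpu_mem_gb <= mem_limit:
            return sensing_range
    return LEVELS[-1][1]

print(effective_range_for(6))   # -> (100.0, 100.0, 5.0)
```

Dynamic adjustment then amounts to re-running this lookup whenever the device's reported resource level changes.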
In a possible implementation, when screening the target point cloud data out of the to-be-processed point cloud data according to the effective sensing range information corresponding to the target scene, an effective coordinate range may first be determined based on the effective sensing range information, and the target point cloud data is then screened out of the to-be-processed point cloud data based on the effective coordinate range.
Here, two cases may be involved: both the effective sensing range information and the effective coordinate range are fixed; or the effective coordinate range changes as the effective sensing range information changes.
For the first case, exemplarily, the effective sensing range information may be description information of a cuboid, including its length, width, and height, with the radar apparatus at the intersection of the cuboid's body diagonals. Since the position of that intersection does not change, the cuboid is fixed; the coordinate range inside the cuboid is the effective coordinate range, so the effective coordinate range is also fixed.
For the second case, when determining the effective coordinate range based on the effective sensing range information, the effective coordinate range corresponding to the target scene may be determined based on the position information of a reference position point within the effective sensing range and the coordinate information of the reference position point in the target scene.
Exemplarily, the effective sensing range information may be description information of a cuboid, and the reference position point may be the intersection of the cuboid's body diagonals. As the reference position point changes, the effective sensing range information changes across different target scenes, and the corresponding effective coordinate range changes accordingly.
The coordinate information of the reference position point in the target scene may be its coordinate information in the radar coordinate system corresponding to the target scene; the radar coordinate system may be a three-dimensional coordinate system whose origin is the radar apparatus used to collect point cloud data in the target scene.
If the effective sensing range information is description information of a cuboid, the reference position point may be the intersection of the cuboid's body diagonals; if it is description information of a sphere, the reference position point may be the sphere's center; alternatively, the reference position point may be any reference radar scanning point within the effective sensing range.
In specific implementation, when determining the effective coordinate range corresponding to the target scene based on the position information of the reference position point within the effective sensing range and its coordinate information in the target scene, the coordinate thresholds in each coordinate dimension of the reference coordinate system contained in the effective sensing range information may be converted, based on the coordinate information of the reference position point in the radar coordinate system, into coordinate thresholds in each coordinate dimension of the radar coordinate system.
Specifically, the reference position point may have corresponding first coordinate information in the reference coordinate system and corresponding second coordinate information in the radar coordinate system. Based on the first and second coordinate information of the reference position point, the transformation between the reference coordinate system and the radar coordinate system can be determined; based on this transformation, the coordinate thresholds, in each coordinate dimension of the reference coordinate system, of the reference radar scanning points in the effective sensing range information can be converted into coordinate thresholds in each coordinate dimension of the radar coordinate system.
In another possible implementation, the relative positional relationship between the reference position point and the threshold coordinate points corresponding to the coordinate thresholds, in each coordinate dimension of the reference coordinate system, of the reference radar scanning points in the effective sensing range information may be determined first; then, based on this relative positional relationship, the coordinate thresholds in each coordinate dimension of the radar coordinate system are determined from those in each coordinate dimension of the reference coordinate system.
Here, after the coordinate information of the reference position point changes, the coordinate thresholds, in each coordinate dimension of the radar coordinate system, of the reference radar scanning points in the effective sensing range information determined based on that coordinate information change correspondingly; that is, the effective coordinate range corresponding to the target scene also changes. Therefore, the effective coordinate range in different target scenes can be controlled by controlling the coordinate information of the reference position point.
In a possible implementation, when screening the target point cloud data out of the to-be-processed point cloud data based on the effective coordinate range, the radar scanning points whose coordinate information lies within the effective coordinate range may be taken as the radar scanning points of the target point cloud data.
Specifically, when a radar scanning point is stored, its three-dimensional coordinate information may be stored as well; based on this three-dimensional coordinate information, it can be determined whether the radar scanning point lies within the effective coordinate range.
Exemplarily, if the three-dimensional coordinate information of a radar scanning point is (x, y, z), then when judging whether it is a radar scanning point of the target point cloud data, it may be determined whether the coordinates satisfy the following conditions:
x_min < x < x_max and y_min < y < y_max and z_min < z < z_max.
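This screening step can be sketched as a direct filter over the stored coordinates; here points are assumed to be stored as (x, y, z) tuples, and the bound values in the example are illustrative.

```python
def screen_points(points, x_min, x_max, y_min, y_max, z_min, z_max):
    """Keep only the scanning points whose coordinates lie strictly
    inside the effective coordinate range."""
    return [
        (x, y, z) for (x, y, z) in points
        if x_min < x < x_max and y_min < y < y_max and z_min < z < z_max
    ]

cloud = [(1.0, 2.0, 0.5), (250.0, 2.0, 0.5), (3.0, -4.0, 1.2)]
print(screen_points(cloud, -200, 200, -200, 200, 0, 5))
# -> [(1.0, 2.0, 0.5), (3.0, -4.0, 1.2)]
```

The second point is dropped because its x coordinate (250 m) falls outside the 200 m bound, matching the far-scanning-distance constraint discussed earlier.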
The application of the above point cloud data processing method is introduced below in combination with specific application scenarios. In a possible implementation, the method may be applied to an automatic driving scene.
In a possible application scenario, the intelligent driving device is provided with a radar apparatus. The coordinate information of the reference position point may be determined by the method shown in FIG. 3, which includes steps 301 to 303.
Step 301: acquire the position information of the intelligent driving device on which the radar apparatus is mounted.
The position information of the intelligent driving device may be acquired, for example, based on the Global Positioning System (GPS); the present disclosure does not limit other ways of acquiring the position information of the intelligent driving device.
Step 302: determine, based on the position information of the intelligent driving device, the road type of the road on which the intelligent driving device is located.
In specific implementation, the road type of each road segment within the drivable range of the intelligent driving device may be preset; road types may include, for example, crossroads, T-junctions, highways, and parking lots. The road on which the intelligent driving device is located can be determined based on its position information, and the road type of that road can then be determined from the preset road types of the road segments within the drivable range.
Step 303: acquire the coordinate information of the reference position point matching the road type.
The locations of the point cloud data that need emphasis may differ across road types. For example, when the intelligent driving device is on a highway, the point cloud data it needs to process may be the point cloud data in front of it; when it is at a crossroads, the point cloud data it needs to process may be the point cloud data all around it. Therefore, screening of point cloud data under different road types can be achieved by presetting the coordinate information of the reference position points matching the different road types.
Here, the point cloud data the intelligent driving device needs to process may differ when it travels on roads of different types. Therefore, by acquiring the coordinate information of the reference position point matching the road type, an effective coordinate range adapted to the road type on which the device is currently located can be determined for the intelligent driving device, so that the point cloud data under the corresponding road type is screened out, improving the accuracy of the screening.
In a possible implementation, after the target point cloud data is screened out of the to-be-processed point cloud data, the target point cloud data may further be detected; after the detection result is obtained, the intelligent driving device on which the radar apparatus is mounted is controlled based on the detection result.
Exemplarily, after the target point cloud data is screened out, detection of objects to be identified (for example, obstacles) during the travel of the intelligent driving device can be realized based on the screened-out target point cloud data; based on the detection result, the travel of the intelligent driving device on which the radar apparatus is mounted can be controlled.
Controlling the travel of the intelligent driving device may include controlling it to accelerate, decelerate, steer, brake, and so on.
Regarding step 103: in a possible implementation, the detection result includes the position of the object to be identified in the target scene. The process of detecting the target point cloud data is described in detail below with reference to specific embodiments. Referring to FIG. 4, an embodiment of the present disclosure provides a method for determining a detection result, which includes the following steps:
Step 401: rasterize the target point cloud data to obtain a grid matrix, where the value of each element in the grid matrix indicates whether a target point exists at the corresponding grid cell. Here, the points corresponding to the target point cloud data are called target points.
Step 402: generate a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene.
Step 403: determine, based on the generated sparse matrix, the position of the object to be identified in the target scene.
In this embodiment of the present disclosure, the target point cloud data may first be rasterized, and the grid matrix obtained by rasterization may then be sparsely processed to generate a sparse matrix. The rasterization here may be the process of mapping the spatially distributed target point cloud data containing the target points into a set grid and encoding the grid based on the target points each cell contains (yielding a zero-one matrix). The sparse processing may be a dilation operation (increasing the number of elements indicated as 1 in the zero-one matrix) or an erosion operation (decreasing the number of elements indicated as 1) applied to that zero-one matrix based on the size information of the object to be identified in the target scene. The rasterization and sparse processing are described further below.
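The zero-one encoding and the dilation step can be sketched as follows. This is a simplified 2-D illustration of the general technique, not the exact procedure of this disclosure (which uses shift, logical-OR, inversion, and convolution operations on the grid matrix, per FIGS. 6A-7B); the grid size and points are made up.

```python
def occupancy_grid(points, grid_width, shape):
    """Zero-one grid matrix: 1 where at least one point falls in the cell."""
    rows, cols = shape
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        grid[int(x // grid_width)][int(y // grid_width)] = 1
    return grid

def dilate(grid):
    """One step of 4-neighbour dilation: every 1-element also marks its
    in-bounds neighbours, increasing the 1-elements of the matrix."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        out[nr][nc] = 1
    return out

g = occupancy_grid([(0.32, 0.48), (2.1, 3.2)], 1.0, (4, 4))
print(sum(map(sum, g)))          # 2 cells occupied before dilation
print(sum(map(sum, dilate(g))))  # 7 cells occupied after one dilation step
```

Erosion is the dual operation: a cell keeps its 1 only if all its neighbours are also 1, decreasing the 1-elements instead.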
In the rasterization process, target points distributed in a continuous Cartesian real-valued coordinate system may be converted into a rasterized discrete coordinate system.
To facilitate understanding of the rasterization process, a specific example follows. Suppose the target points include point A (0.32 m, 0.48 m), point B (0.6 m, 0.4801 m), and point C (2.1 m, 3.2 m), and rasterization is performed with a grid width of 1 m: the range from (0 m, 0 m) to (1 m, 1 m) corresponds to the first grid cell, the range from (0 m, 1 m) to (1 m, 2 m) corresponds to the second grid cell, and so on. After rasterization, A'(0, 0) and B'(0, 0) both fall in the cell in the first row and first column, and C'(2, 3) falls in the cell in the third row and fourth column, thereby realizing the conversion from the continuous Cartesian real-valued coordinate system to the discrete coordinate system. The coordinate information of a target point may be determined relative to a reference point (e.g., the location of the radar device that collects the point cloud data), which is not repeated here.
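The conversion in the example above amounts to an integer floor division of each continuous coordinate by the grid width. A minimal sketch in Python with NumPy (the function name `rasterize` and the 1 m grid width are illustrative choices for this example, not part of the disclosure):

```python
import numpy as np

def rasterize(points, grid_width=1.0):
    """Map continuous Cartesian coordinates to discrete grid-cell indices
    by floor-dividing each coordinate by the grid width."""
    return np.floor(np.asarray(points) / grid_width).astype(int)

# Points A, B, C from the example (coordinates in metres).
points = [(0.32, 0.48), (0.6, 0.4801), (2.1, 3.2)]
cells = rasterize(points)
# A and B share cell (0, 0); C falls in cell (2, 3).
```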
In the embodiments of the present disclosure, either two-dimensional or three-dimensional rasterization may be performed; compared with two-dimensional rasterization, three-dimensional rasterization adds height information. Two-dimensional rasterization is taken as an example in the following description.
For two-dimensional rasterization, the finite space may be divided into N*M grid cells, generally at equal intervals, where the interval size is configurable. A zero-one matrix (i.e., the above grid matrix) may then be used to encode the rasterized target point cloud data. Each grid cell is represented by coordinates consisting of a unique row number and column number; if one or more target points fall within a cell, that cell is encoded as 1, and otherwise as 0, thereby yielding the encoded zero-one matrix.
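The encoding described above can be sketched as follows, assuming the first coordinate indexes rows and the second indexes columns (the helper name `encode_grid` and the 4*4 grid size are illustrative assumptions):

```python
import numpy as np

def encode_grid(points, n_rows, n_cols, grid_width=1.0):
    """Encode rasterized target points as an N*M zero-one grid matrix:
    a cell containing one or more target points is set to 1, else 0."""
    grid = np.zeros((n_rows, n_cols), dtype=np.uint8)
    for x, y in points:
        i = int(x // grid_width)
        j = int(y // grid_width)
        if 0 <= i < n_rows and 0 <= j < n_cols:  # ignore points outside the finite space
            grid[i, j] = 1
    return grid

grid = encode_grid([(0.32, 0.48), (0.6, 0.4801), (2.1, 3.2)], n_rows=4, n_cols=4)
```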
After the grid matrix is determined according to the above method, a sparse processing operation may be performed on the elements in the grid matrix according to the size information of the object to be identified in the target scene, to generate the corresponding sparse matrix.
The size information of the object to be identified may be acquired in advance. Here, it may be determined from image data collected synchronously with the target point cloud data, or it may be roughly estimated based on the specific application scenario. For example, in the field of autonomous driving, an object in front of the vehicle may be another vehicle, whose typical size may be taken as 4 m × 4 m. In addition, the embodiments of the present disclosure may determine the size information of the object to be identified in other manners, which is not specifically limited herein.
In the embodiments of the present disclosure, the sparse processing operation may be at least one dilation operation performed on the target elements in the grid matrix (i.e., the elements indicating that a target point exists at the corresponding grid cell). The dilation operation may be performed when the coordinate range of the grid matrix is smaller than the size of the object to be identified in the target scene; that is, through one or more dilation operations, the range of elements indicating the presence of target points is expanded step by step, so that the expanded element range matches the object to be identified, thereby enabling the position to be determined. Alternatively, the sparse processing operation in the embodiments of the present disclosure may be at least one erosion operation performed on the target elements in the grid matrix. The erosion operation may be performed when the coordinate range of the grid matrix is larger than the size of the object to be identified in the target scene; that is, through one or more erosion operations, the range of elements indicating the presence of target points is reduced step by step, so that the reduced element range matches the object to be identified, thereby enabling the position to be determined.
In a specific application, which of the following operations is performed (a single dilation operation, multiple dilation operations, a single erosion operation, or multiple erosion operations) depends on whether the difference between the coordinate range of the sparse matrix obtained by at least one shift operation and logical operation and the size of the object to be identified in the target scene falls within a preset threshold range. That is, the dilation or erosion operations adopted in the present disclosure are constrained by the size information of the object to be identified, so that the information represented by the resulting sparse matrix better matches the relevant information of the object to be identified.
It can be understood that, whether implemented by dilation or by erosion, the purpose of the sparse processing is to enable the generated sparse matrix to represent the relevant information of the object to be identified more accurately.
In the embodiments of the present disclosure, the dilation operation may be implemented based on shift operations and a logical OR operation, or based on negation followed by convolution, with a further negation after the convolution. The two approaches use different methods, but the resulting sparse matrices can be identical.
In addition, the erosion operation may be implemented based on shift operations and a logical AND operation, or directly based on a convolution operation. Likewise, although the two approaches use different methods, the resulting sparse matrices can also be identical.
Next, taking the dilation operation as an example, the generation of the sparse matrix is further described with reference to the specific examples shown in FIG. 5A and FIG. 5B.
FIG. 5A is a schematic diagram of the grid matrix obtained after rasterization (before encoding). By performing one eight-neighborhood dilation operation on each target element in the grid matrix (the cells drawn with a filling effect), the corresponding sparse matrix shown in FIG. 5B can be obtained. It can be seen that, for each target element in FIG. 5A whose corresponding grid cell contains a target point, the embodiment of the present disclosure performs an eight-neighborhood dilation, so that each target element becomes an element set after dilation, and the grid width corresponding to the element set may match the size of the object to be identified.
The eight-neighborhood dilation may be a process of determining the elements whose abscissa and ordinate each differ from those of the target element by an absolute value of at most 1. Except for elements at the edge of the grid, an element generally has eight elements in its neighborhood (corresponding to the above element set). The input of the dilation processing may be the coordinate information of the six target elements, and the output may be the coordinate information of the element sets in the eight-neighborhoods of those target elements, as shown in FIG. 5B.
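An eight-neighborhood dilation of the target elements, as described above, can be sketched as follows (the array-slicing implementation, with clipping at the grid edges, is one possible realization for illustration, not the disclosed one):

```python
import numpy as np

def dilate8(grid):
    """One eight-neighborhood dilation: a cell becomes 1 if some cell whose
    row and column indices each differ by at most 1 is 1 (edges clipped)."""
    n, m = grid.shape
    out = np.zeros_like(grid)
    for i, j in np.argwhere(grid == 1):
        out[max(i - 1, 0):min(i + 2, n), max(j - 1, 0):min(j + 2, m)] = 1
    return out

g = np.zeros((5, 5), dtype=np.uint8)
g[2, 2] = 1
d = dilate8(g)  # the single target element grows into a 3x3 element set
```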
It should be noted that, in practical applications, in addition to the eight-neighborhood dilation described above, a four-neighborhood dilation may also be performed; the latter and other dilation operations are not specifically limited herein. Moreover, the embodiments of the present disclosure may perform multiple dilation operations; for example, on the basis of the dilation result shown in FIG. 5B, a further dilation operation may be performed to obtain a sparse matrix with a larger range of element sets, which is not repeated here.
In the embodiments of the present disclosure, the position of the object to be identified in the target scene can be determined based on the generated sparse matrix. This may be implemented specifically through the following two aspects.
First aspect: the position range of the object to be identified may be determined based on the correspondence between each element in the grid matrix and the coordinate range information of the target points, which may specifically be implemented through the following steps:
Step 1: based on the correspondence between each element in the grid matrix and the coordinate range information of the target points, determine the coordinate information of the target point corresponding to each target element in the generated sparse matrix;
Step 2: combine the coordinate information of the target points corresponding to the target elements in the sparse matrix, to determine the position of the object to be identified in the target scene.
Here, based on the above description of the rasterization, each target element in the grid matrix may correspond to multiple target points; thus, the correspondence between an element and the coordinate range information of its target points may be determined in advance. Still taking an N*M grid matrix as an example, a target element containing target points may correspond to P target points, where the coordinates of each point are (Xi, Yi), i ranges from 0 to P-1, Xi and Yi represent the position of the target point in the grid matrix, 0 <= Xi < N, and 0 <= Yi < M.
In this way, after the sparse matrix is generated, the coordinate information of the target point corresponding to each target element in the sparse matrix may be determined based on the predetermined correspondence between the elements and the coordinate range information of the target points; that is, a de-rasterization operation is performed.
It should be noted that, since the sparse matrix is obtained by sparsely processing the elements of the grid matrix that indicate the presence of target points at the corresponding grid cells, the values of the target elements in the sparse matrix here can likewise indicate that target points exist at the corresponding grid cells.
To facilitate understanding of the de-rasterization process, a specific example follows. Suppose the sparse matrix indicates that points A'(0, 0) and B'(0, 0) are in the cell in the first row and first column, and that point C'(2, 3) is in the cell in the third row and fourth column. During de-rasterization, mapping the first cell (0, 0) back to the Cartesian coordinate system through its centre yields (0.5 m, 0.5 m), and mapping the cell (2, 3) back through its centre yields (2.5 m, 3.5 m); that is, (0.5 m, 0.5 m) and (2.5 m, 3.5 m) are determined as the mapped coordinate information. By combining the mapped coordinate information, the position of the object to be identified in the target scene can be determined.
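The centre-of-cell mapping used in this example can be sketched as follows (the helper name `derasterize` is illustrative; a 1 m grid width is assumed as in the example):

```python
def derasterize(cells, grid_width=1.0):
    """Map each grid-cell index back to the Cartesian coordinates of the
    cell centre: index k covers [k*w, (k+1)*w), so the centre is (k+0.5)*w."""
    return [((i + 0.5) * grid_width, (j + 0.5) * grid_width) for i, j in cells]

centres = derasterize([(0, 0), (2, 3)])
# cell (0, 0) -> (0.5, 0.5); cell (2, 3) -> (2.5, 3.5)
```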
The embodiments of the present disclosure can determine the position range of the object to be identified not only based on the approximate relationship between the sparse matrix and the target detection result described above, but also based on a trained convolutional neural network.
Second aspect: the embodiments of the present disclosure may first perform at least one convolution on the generated sparse matrix based on a trained convolutional neural network, and then determine the position range of the object to be identified based on the convolution result.
In related techniques that use a convolutional neural network for target detection, it is necessary to traverse all the input data, find the neighborhood points of each input point in turn for the convolution operation, and finally output the set of all neighborhood points. The method provided by the embodiments of the present disclosure only needs to quickly traverse the target elements in the sparse matrix to find the positions of the valid points (i.e., the elements whose value is 1 in the zero-one matrix) and perform the convolution there, which greatly speeds up the computation of the convolutional neural network and improves the efficiency of determining the position range of the object to be identified.
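The traversal difference described above can be sketched as follows: only the value-1 positions of the zero-one matrix are enumerated, rather than every input position (a simplified illustration of the traversal only, not the disclosed network implementation):

```python
import numpy as np

def valid_positions(sparse_matrix):
    """Traverse only the target elements (value 1) of the zero-one sparse
    matrix, instead of scanning every position of the input."""
    return [tuple(p) for p in np.argwhere(sparse_matrix == 1)]

m = np.zeros((100, 100), dtype=np.uint8)
m[10, 20] = m[55, 7] = 1
pts = valid_positions(m)  # 2 positions visited instead of 10000
```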
Considering the key role of the sparse processing operation in the point cloud data processing method provided by the embodiments of the present disclosure, it is described below from the following two aspects.
First aspect: when the sparse processing operation is a dilation operation, the embodiments of the present disclosure may implement it by combining shift operations and logical operations, or based on negation followed by convolution, with a further negation after the convolution.
First, in the embodiments of the present disclosure, one or more dilation operations may be performed based on at least one shift operation and a logical OR operation. In a specific implementation, the number of dilation operations may be determined according to the size information of the object to be identified in the target scene.
Here, for the first dilation operation, the target elements indicating the presence of target points at the corresponding grid cells may be shifted in multiple preset directions to obtain multiple corresponding shifted grid matrices. A logical OR operation is then performed on the grid matrix and the multiple shifted grid matrices corresponding to the first dilation operation, so that the sparse matrix after the first dilation operation can be obtained. Here, it can be judged whether the coordinate range of the obtained sparse matrix is smaller than the size of the object to be identified, and whether the corresponding difference is sufficiently large (e.g., greater than a preset threshold); if so, the target elements in the sparse matrix after the first dilation operation are again shifted in multiple preset directions and combined with logical OR operations according to the above method, to obtain the sparse matrix after the second dilation operation, and so on, until it is determined that the difference between the coordinate range of the most recently obtained sparse matrix and the size of the object to be identified in the target scene falls within the preset threshold range, at which point the sparse matrix is determined.
It should be noted that the sparse matrix obtained after any dilation operation is essentially still a zero-one matrix. As the number of dilation operations increases, the number of target elements in the resulting sparse matrix indicating the presence of target points also increases; and since the grid cells mapped by the zero-one matrix carry width information, the coordinate range corresponding to the target elements in the sparse matrix can be used to verify whether the size of the object to be identified in the target scene has been reached, thereby improving the accuracy of subsequent target detection applications.
The above logical OR operation may be implemented according to the following steps:
Step 1: select one shifted grid matrix from the multiple shifted grid matrices;
Step 2: perform a logical OR operation on the grid matrix before the current dilation operation and the selected shifted grid matrix, to obtain an operation result;
Step 3: repeat the step of selecting, from the multiple shifted grid matrices, a grid matrix that has not yet participated in the operation, and performing a logical OR operation on the selected grid matrix and the most recent operation result, until all the grid matrices have been selected, to obtain the sparse matrix after the current dilation operation.
Here, one shifted grid matrix may first be selected from the multiple shifted grid matrices; a logical OR operation is then performed on the grid matrix before the current dilation operation and the selected shifted grid matrix, to obtain an operation result. The step of selecting, from the shifted grid matrices, a grid matrix that has not yet participated in the operation and including it in the logical OR operation may be repeated, so that once all the shifted grid matrices have been selected, the sparse matrix after the current dilation operation is obtained.
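The shift-and-OR procedure above can be sketched for the four-neighborhood case as follows (a simplified illustration: zero-filling of vacated cells and the left/right/up/down processing order are assumptions of this sketch):

```python
import numpy as np

def shift(grid, dr, dc):
    """Shift a zero-one matrix by (dr, dc) rows/columns; vacated cells are 0."""
    n, m = grid.shape
    out = np.zeros_like(grid)
    out[max(dr, 0):n + min(dr, 0), max(dc, 0):m + min(dc, 0)] = \
        grid[max(-dr, 0):n + min(-dr, 0), max(-dc, 0):m + min(-dc, 0)]
    return out

def dilate4_shift_or(grid):
    """One four-neighborhood dilation: OR the matrix with its left-, right-,
    up- and down-shifted copies, one shifted matrix at a time."""
    result = grid.copy()
    for dr, dc in [(0, -1), (0, 1), (-1, 0), (1, 0)]:  # left, right, up, down
        result = np.logical_or(result, shift(grid, dr, dc)).astype(np.uint8)
    return result

g = np.zeros((5, 5), dtype=np.uint8)
g[2, 2] = 1
d = dilate4_shift_or(g)  # the centre cell plus its four-neighborhood
```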
The dilation operation in the embodiments of the present disclosure may be a four-neighborhood dilation centred on the target element, an eight-neighborhood dilation centred on the target element, or another neighborhood processing mode. In a specific application, the corresponding neighborhood processing mode may be selected based on the size information of the object to be identified, which is not specifically limited here.
It should be noted that, for different neighborhood processing modes, the preset directions of the corresponding shift operations differ. Taking the four-neighborhood dilation as an example, the grid matrix may be shifted in four preset directions: left, right, up, and down. Taking the eight-neighborhood dilation as an example, the grid matrix may be shifted in eight preset directions: left, right, up, down, up and down on the basis of a left shift, and up and down on the basis of a right shift. In addition, to fit the subsequent logical OR operations, after the shifted grid matrices are determined for the multiple shift directions, one logical OR operation may be performed first, the result of which is then shifted again in the multiple shift directions, followed by the next logical OR operation, and so on, until the dilated sparse matrix is obtained.
To facilitate understanding of the above dilation operation, the pre-encoding grid matrix shown in FIG. 5A may first be converted into the encoded grid matrix shown in FIG. 5C, and the first dilation operation is then illustrated with reference to FIG. 6A and FIG. 6B.
The grid matrix shown in FIG. 5C is a zero-one matrix: the positions of all the "1"s in the matrix represent the grid cells where the target elements are located, and all the "0"s in the matrix represent the background.
In the embodiments of the present disclosure, matrix shifts may first be used to determine the neighborhoods of all elements whose value is 1 in the zero-one matrix. Four preset shift directions may be defined here: left, right, up, and down. A left shift subtracts one from the column coordinate of every element whose value is 1 in the zero-one matrix, as shown in FIG. 6A; a right shift adds one to the column coordinate of every such element; an up shift subtracts one from the row coordinate of every such element; and a down shift adds one to the row coordinate of every such element.
Next, the embodiments of the present disclosure may use a matrix logical OR operation to merge the results of all the neighborhoods. The matrix logical OR operation, upon receiving two zero-one matrices of the same size as input, performs a logical OR on the values at the same positions of the two matrices in turn, and the results form a new zero-one matrix as the output. FIG. 6B shows a specific example of a logical OR operation.
In the specific process of implementing the logical OR operation, the left-shifted, right-shifted, up-shifted, and down-shifted grid matrices may be selected in turn to participate in the logical OR operation. For example, a logical OR operation may first be performed on the grid matrix and the left-shifted grid matrix; the obtained result may then be ORed with the right-shifted grid matrix; that result may then be ORed with the up-shifted grid matrix; and that result may in turn be ORed with the down-shifted grid matrix, thereby obtaining the sparse matrix after the first dilation operation.
It should be noted that the above selection order of the shifted grid matrices is only a specific example; in practical applications, other orders may also be used. Considering the symmetry of the shift operations, the up-shifted and down-shifted matrices may be paired for one logical OR operation and the left-shifted and right-shifted matrices paired for another; the two logical OR operations can be performed in parallel, which saves computation time.
Second, in the embodiments of the present disclosure, the dilation operation may be implemented by combining convolution with two negation operations, specifically through the following steps:
Step 1: perform a first negation operation on the elements in the grid matrix before the current dilation operation, to obtain the grid matrix after the first negation operation;
Step 2: perform at least one convolution operation on the grid matrix after the first negation operation based on a first preset convolution kernel, to obtain a grid matrix with a preset sparsity after the at least one convolution operation, where the preset sparsity is determined by the size information of the object to be identified in the target scene;
Step 3: perform a second negation operation on the elements of the grid matrix with the preset sparsity obtained after the at least one convolution operation, to obtain the sparse matrix.
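Steps 1 to 3 above can be sketched as follows, using a 3x3 all-ones kernel so that the negate-convolve-negate sequence reproduces an eight-neighborhood dilation; the kernel size and the padding of the border with ones (i.e., treating everything outside the grid as background of the original matrix) are assumptions of this sketch, not requirements of the disclosure:

```python
import numpy as np

def erode3x3(grid):
    """Erosion realized as convolution with a 3x3 all-ones kernel: a cell
    stays 1 only if every cell in its 3x3 window is 1. The border is padded
    with ones so the complement of the original matrix behaves as if it
    were surrounded by background."""
    n, m = grid.shape
    padded = np.pad(grid, 1, constant_values=1)
    out = np.zeros_like(grid)
    for i in range(n):
        for j in range(m):
            window_sum = padded[i:i + 3, j:j + 3].sum()  # convolution result
            out[i, j] = 1 if window_sum == 9 else 0
    return out

def dilate_by_negation(grid):
    """Step 1: negate; Step 2: erode the negated matrix by convolution;
    Step 3: negate again. Equals a 3x3 (eight-neighborhood) dilation."""
    return 1 - erode3x3(1 - grid)

g = np.zeros((5, 5), dtype=np.uint8)
g[2, 2] = 1
d = dilate_by_negation(g)
```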
The embodiments of the present disclosure can implement the dilation operation through negation followed by convolution, with a further negation after the convolution; the resulting sparse matrix can likewise represent, to a certain extent, the relevant information of the object to be identified. In addition, since the above convolution operations can be automatically combined with the convolutional neural network used in subsequent applications such as target detection, the detection efficiency can be improved to a certain extent.
In the embodiments of the present disclosure, the negation operation may be implemented based on a convolution operation, or based on other negation methods. To facilitate cooperation with the subsequent application network (e.g., the convolutional neural network used for target detection), a convolution operation may be adopted here for the specific implementation. The first negation operation is described in detail below.
Here, a convolution operation may be performed, based on a second preset convolution kernel, on the elements of the grid matrix before the current dilation operation other than the target elements, to obtain first negated elements; a convolution operation may also be performed, based on the second preset convolution kernel, on the target elements in the grid matrix before the current dilation operation, to obtain second negated elements. Based on the first negated elements and the second negated elements, the grid matrix after the first negation operation can be determined.
For the implementation of the second negation operation, reference may be made to the implementation of the first negation operation described above, which is not repeated here.
In the embodiments of the present disclosure, at least one convolution operation may be performed on the grid matrix after the first negation operation using the first preset convolution kernel, to obtain a grid matrix with the preset sparsity. If the dilation operation can be regarded as a means of increasing the number of target elements in the grid matrix, the above convolution operation can be regarded as a process of reducing the number of target elements in the grid matrix (corresponding to an erosion operation). Since the convolution operation in the embodiments of the present disclosure is performed on the grid matrix after the first negation operation, combining a negation operation with an erosion-style operation and then performing a negation operation again realizes an operation equivalent to the above dilation operation.
For the first convolution operation, the grid matrix after the first inversion operation is convolved with the first preset convolution kernel to obtain the grid matrix after the first convolution operation. If the sparsity of the grid matrix after the first convolution operation has not reached the preset sparsity, that grid matrix is convolved with the first preset convolution kernel again to obtain the grid matrix after the second convolution operation, and so on, until a grid matrix with the preset sparsity is obtained.
The sparsity above may be determined by the proportion of target elements to non-target elements in the grid matrix. The larger the proportion of target elements, the larger the size information of the object to be identified that they represent; conversely, the smaller the proportion of target elements, the smaller the size information of the represented object. In the embodiments of the present disclosure, the convolution operations may be stopped once the proportion distribution reaches the preset sparsity.
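The convolve-check-repeat loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: sparsity is taken here as the fraction of elements equal to 1, the per-step convolution uses a 3*3 all-ones kernel with bias -8 followed by ReLU (the erosion step discussed later in this section), and the function names and iteration cap are assumptions.

```python
import numpy as np

def sparsity(grid):
    """Fraction of target elements (value 1) in the grid matrix."""
    return grid.mean()

def conv_step(grid):
    """One convolution step: 3*3 all-ones weights, bias -8, ReLU,
    so a cell stays 1 only if its whole 3*3 neighbourhood is 1."""
    p = np.pad(grid, 1)
    h, w = grid.shape
    return np.array([[max(p[i:i + 3, j:j + 3].sum() - 8, 0)
                      for j in range(w)] for i in range(h)])

def conv_until_sparsity(grid, preset_sparsity, max_iters=10):
    """Repeat the convolution until the grid reaches the preset sparsity."""
    for _ in range(max_iters):
        if sparsity(grid) <= preset_sparsity:
            break
        grid = conv_step(grid)
    return grid
```

On an all-ones 5*5 grid, one step keeps only the inner 3*3 block (9 of 25 cells), at which point a preset sparsity of 0.4 is satisfied and the loop stops.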
The convolution operation in the embodiments of the present disclosure may be performed once or multiple times. The specific process of the first convolution operation is described here as an example and includes the following steps:
Step 1: For the first convolution operation, select each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset stride.
Step 2: For each selected grid sub-matrix, multiply the grid sub-matrix by the weight matrix to obtain a first operation result, and add the bias to the first operation result to obtain a second operation result.
Step 3: Determine the grid matrix after the first convolution operation based on the second operation results corresponding to the grid sub-matrices.
Here, the grid matrix after the first inversion operation may be traversed, so that for each traversed grid sub-matrix, the grid sub-matrix is multiplied by the weight matrix to obtain a first operation result, and the bias is added to the first operation result to obtain a second operation result. By assembling the second operation results corresponding to the grid sub-matrices into the corresponding matrix elements, the grid matrix after the first convolution operation is obtained.
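Steps 1 to 3 above amount to a plain sliding-window convolution. A minimal sketch (the function and variable names are illustrative; the weight matrix, bias and stride are supplied by the caller):

```python
import numpy as np

def conv2d_valid(grid, weights, bias, stride=1):
    """Step 1: select each grid sub-matrix by kernel size and stride.
    Step 2: elementwise product with the weight matrix (first result),
    then add the bias (second result).
    Step 3: assemble the results into the output grid matrix."""
    kh, kw = weights.shape
    h, w = grid.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            sub = grid[i * stride:i * stride + kh,
                       j * stride:j * stride + kw]   # step 1
            first = (sub * weights).sum()            # step 2: product
            out[i, j] = first + bias                 # step 2: add bias
    return out
```

For example, with a 3*3 all-ones weight matrix and bias -8, every 3*3 all-ones sub-matrix yields 9 - 8 = 1.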
To facilitate understanding of the dilation operation above, the encoded grid matrix shown in FIG. 5C is again taken as an example, and the dilation operation is illustrated with reference to FIG. 7A and FIG. 7B.
Here, a 1*1 convolution kernel (i.e., the second preset convolution kernel) with a weight of -1 and a bias of 1 may be used to implement the first inversion operation. Substituting the weight and bias into the convolution formula {output = input grid matrix * weight + bias}: if the input is a target element of the grid matrix, whose value is 1, then output = 1*-1+1 = 0; if the input is a non-target element, whose value is 0, then output = 0*-1+1 = 1. Applying the 1*1 convolution kernel to the input thus inverts the zero-one matrix, turning element value 0 into 1 and element value 1 into 0, as shown in FIG. 7A.
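Because the kernel is 1*1, this inversion reduces to a single elementwise affine map. A minimal sketch of the formula output = input * (-1) + 1 (numpy is used only for illustration):

```python
import numpy as np

def invert(grid):
    """1*1 convolution with weight -1 and bias 1: output = input * -1 + 1,
    which flips a zero-one grid matrix (0 -> 1, 1 -> 0)."""
    return grid * -1 + 1
```

Applying it twice returns the original matrix, matching the invert-process-invert structure used in this section.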
For the erosion operation above, in a specific application, a 3*3 convolution kernel (i.e., the first preset convolution kernel) and a rectified linear unit (ReLU) may be used. Each weight in the weight matrix of the first preset convolution kernel is 1 and the bias is -8, so the erosion operation can be implemented by the formula {output = ReLU(grid matrix after the first inversion operation * weights + bias)}.
Here, only when all elements of the input 3*3 grid sub-matrix are 1 does output = ReLU(9 - 8) = 1; otherwise output = ReLU(input grid sub-matrix * 1 - 8) = 0, since in that case (input grid sub-matrix * 1 - 8) < 0. FIG. 7B shows the grid matrix after the convolution operation.
Here, each additional nested layer of the convolutional network with the first preset convolution kernel stacks one more erosion operation, so a grid matrix with a fixed sparsity can be obtained; inverting again is then equivalent to one dilation operation, thereby realizing the generation of the sparse matrix.
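The equivalence used above (invert, erode, invert again equals one dilation) can be checked with a small sketch. One detail worth noting: for the equivalence to hold at the borders, the erosion inside the dilation must pad the inverted grid with 1s, which corresponds to padding the original grid with 0s. The pad-value parameter and function names are illustrative assumptions, not from the source.

```python
import numpy as np

def invert(grid):
    # 1*1 conv with weight -1, bias 1: flips 0 <-> 1
    return 1 - grid

def erode(grid, pad_value=0):
    # 3*3 all-ones kernel, bias -8, ReLU: output is 1 only where
    # the entire 3*3 neighbourhood is 1
    p = np.pad(grid, 1, constant_values=pad_value)
    h, w = grid.shape
    return np.array([[max(p[i:i + 3, j:j + 3].sum() - 8, 0)
                      for j in range(w)] for i in range(h)])

def dilate(grid):
    # invert -> erode -> invert is equivalent to one dilation
    return invert(erode(invert(grid), pad_value=1))
```

Dilating a single target element grows it into a 3*3 block of target elements, as expected of an eight-neighbourhood dilation.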
Second aspect: where the sparse processing operation is an erosion operation, the embodiments of the present disclosure may implement it by combining shift processing with logical operations, or based on convolution operations.
First, in the embodiments of the present disclosure, one or more erosion operations may be performed based on at least one shift process and a logical AND operation. In a specific implementation, the number of erosion operations may be determined according to the size information of the object to be identified in the target scene.
Similar to the dilation implemented with shift processing and a logical OR operation in the first aspect, an erosion operation may also begin by shifting the grid matrix. The difference from the dilation above is that the logical operation here is a logical AND over the shifted grid matrices. For the process of implementing an erosion operation based on shift processing and a logical AND operation, refer to the description above, which is not repeated here.
Likewise, the erosion operation in the embodiments of the present disclosure may be a four-neighborhood erosion centered on a target element, an eight-neighborhood erosion centered on a target element, or another neighborhood processing mode. In a specific application, the neighborhood processing mode may be selected based on the size information of the object to be identified, which is not specifically limited here.
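A four-neighbourhood erosion built from shift processing and logical AND can be sketched as follows (a minimal illustration on a zero-one integer grid; the shift helper and the set of neighbour offsets are illustrative assumptions):

```python
import numpy as np

def shift(grid, dy, dx):
    """Shift the zero-one grid by (dy, dx), filling vacated cells with 0
    (assumes |dy| <= 1 and |dx| <= 1)."""
    p = np.pad(grid, 1)
    h, w = grid.shape
    return p[1 - dy:1 - dy + h, 1 - dx:1 - dx + w]

def erode4(grid):
    """Four-neighbourhood erosion: a cell stays 1 only if it and its
    up/down/left/right neighbours are all 1 (logical AND of shifts)."""
    out = grid.copy()
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        out &= shift(grid, dy, dx)
    return out
```

Replacing the AND with OR over the same shifts would give the four-neighbourhood dilation of the first aspect.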
Second, in the embodiments of the present disclosure, the erosion operation may be implemented in combination with convolution processing, specifically through the following steps:
Step 1: Perform at least one convolution operation on the grid matrix based on the third preset convolution kernel to obtain a grid matrix with a preset sparsity; the preset sparsity is determined by the size information of the object to be identified in the target scene.
Step 2: Determine the grid matrix with the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.
The convolution operation above can be regarded as a process of reducing the number of target elements in the grid matrix, i.e., an erosion process. For the first convolution operation, the grid matrix is convolved with the third preset convolution kernel to obtain the grid matrix after the first convolution operation. If the sparsity of the grid matrix after the first convolution operation has not reached the preset sparsity, that grid matrix is convolved with the third preset convolution kernel again to obtain the grid matrix after the second convolution operation, and so on, until a grid matrix with the preset sparsity is obtained, i.e., the sparse matrix corresponding to the object to be identified.
The convolution operation in the embodiments of the present disclosure may be performed once or multiple times. For the specific process of the convolution operation, refer to the description of implementing dilation based on convolution and inversion in the first aspect above, which is not repeated here.
It should be noted that, in specific applications, convolutional neural networks with different data-processing bit widths may be used to generate the sparse matrix. For example, 4 bits may be used to represent the network's inputs, outputs and computation parameters, such as the element values of the grid matrix (0 or 1), weights and biases. Alternatively, an 8-bit representation may be used to match the processing bit width of the network and improve computational efficiency.
Based on the above method, the point cloud data to be processed, collected by the radar device in the target scene, can be filtered according to the effective perception range information corresponding to the target scene. The filtered target point cloud data is the valid point cloud data for the target scene; therefore, performing detection computation in the target scene on the filtered target point cloud data reduces the amount of computation and improves computational efficiency and the utilization of computing resources in the target scene.
Those skilled in the art can understand that, in the method of the specific implementations above, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure further provide a point cloud data processing apparatus corresponding to the point cloud data processing method. Since the principle by which the apparatus in the embodiments of the present disclosure solves the problem is similar to that of the point cloud data processing method above, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
Referring to FIG. 8, which is a schematic architecture diagram of a point cloud data processing apparatus provided by an embodiment of the present disclosure, the apparatus includes an acquisition module 801, a screening module 802 and a detection module 803, wherein:
the acquisition module 801 is configured to acquire point cloud data to be processed, obtained by a radar device scanning a target scene;
the screening module 802 is configured to screen target point cloud data out of the point cloud data to be processed according to effective perception range information corresponding to the target scene;
the detection module 803 is configured to detect the target point cloud data to obtain a detection result.
In a possible implementation, the screening module 802 is further configured to determine the effective perception range information corresponding to the target scene in the following manner:
acquiring computing resource information of a processing device;
determining, based on the computing resource information, the effective perception range information matching the computing resource information.
In a possible implementation, when screening the target point cloud data out of the point cloud data to be processed according to the effective perception range information corresponding to the target scene, the screening module 802 is configured to:
determine an effective coordinate range based on the effective perception range information;
screen the target point cloud data out of the point cloud data to be processed based on the effective coordinate range.
In a possible implementation, when determining the effective coordinate range based on the effective perception range information, the screening module 802 is configured to:
determine the effective coordinate range corresponding to the target scene based on position information of a reference position point within the effective perception range and coordinate information of the reference position point in the target scene.
In a possible implementation, when screening the target point cloud data out of the point cloud data to be processed based on the effective coordinate range, the screening module 802 is configured to:
take radar scanning points whose corresponding coordinate information falls within the effective coordinate range as the radar scanning points of the target point cloud data.
In a possible implementation, the screening module 802 is further configured to determine the coordinate information of the reference position point in the target scene in the following manner:
acquiring position information of an intelligent driving device on which the radar device is installed;
determining, based on the position information of the intelligent driving device, the road type of the road on which the intelligent driving device is located;
acquiring the coordinate information of the reference position point matching the road type.
In a possible implementation, the detection result includes the position of an object to be identified in the target scene;
when detecting the target point cloud data to obtain the detection result, the detection module 803 is configured to:
rasterize the target point cloud data to obtain a grid matrix, where the value of each element in the grid matrix indicates whether a target point exists at the corresponding grid cell;
generate a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene;
determine the position of the object to be identified in the target scene based on the generated sparse matrix.
In a possible implementation, when generating the sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene, the detection module 803 is configured to:
perform at least one dilation operation or erosion operation on the target elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene, to generate the sparse matrix corresponding to the object to be identified;
where the value of a target element indicates that a target point exists at the corresponding grid cell.
In a possible implementation, when performing the dilation operation or the erosion operation, the detection module 803 is configured to perform shift processing and logical operation processing, where the difference between the coordinate range of the sparse matrix and the size of the object to be identified is within a preset threshold range.
In a possible implementation, when performing at least one dilation operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified, the detection module 803 is configured to:
perform a first inversion operation on the elements in the grid matrix before the current dilation operation, to obtain a grid matrix after the first inversion operation;
perform at least one convolution operation on the grid matrix after the first inversion operation based on a first preset convolution kernel, to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene;
perform a second inversion operation on the elements in the grid matrix with the preset sparsity after the at least one convolution operation, to obtain the sparse matrix.
In a possible implementation, when performing the first inversion operation on the elements in the grid matrix before the current dilation operation to obtain the grid matrix after the first inversion operation, the detection module 803 is configured to:
perform, based on a second preset convolution kernel, a convolution operation on the elements other than the target elements in the grid matrix before the current dilation operation to obtain first inverted elements, and perform, based on the second preset convolution kernel, a convolution operation on the target elements in the grid matrix before the current dilation operation to obtain second inverted elements;
obtain the grid matrix after the first inversion operation based on the first inverted elements and the second inverted elements.
In a possible implementation, when performing at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain the grid matrix with the preset sparsity, the detection module 803 is configured to:
for the first convolution operation, convolve the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation;
repeat the step of convolving the grid matrix after the previous convolution operation with the first preset convolution kernel to obtain the grid matrix after the current convolution operation, until the grid matrix with the preset sparsity is obtained.
In a possible implementation, the first preset convolution kernel has a weight matrix and a bias corresponding to the weight matrix; for the first convolution operation, when convolving the grid matrix after the first inversion operation with the first preset convolution kernel to obtain the grid matrix after the first convolution operation, the detection module 803 is configured to:
for the first convolution operation, select each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset stride;
for each selected grid sub-matrix, multiply the grid sub-matrix by the weight matrix to obtain a first operation result, and add the bias to the first operation result to obtain a second operation result;
determine the grid matrix after the first convolution operation based on the second operation results corresponding to the grid sub-matrices.
In a possible implementation, when performing at least one erosion operation on the elements in the grid matrix according to the grid matrix and the size information of the object to be identified in the target scene to generate the sparse matrix corresponding to the object to be identified, the detection module 803 is configured to:
perform at least one convolution operation on the grid matrix to be processed based on a third preset convolution kernel to obtain a grid matrix with a preset sparsity, where the preset sparsity is determined by the size information of the object to be identified in the target scene;
determine the grid matrix with the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.
In a possible implementation, when rasterizing the target point cloud data to obtain the grid matrix, the detection module 803 is configured to:
rasterize the target point cloud data to obtain the grid matrix and the correspondence between each element in the grid matrix and the coordinate range information of each target point.
When determining the position range of the object to be identified in the target scene based on the generated sparse matrix, the detection module 803 is configured to:
determine, based on the correspondence between each element in the grid matrix and the coordinate range information of each target point, the coordinate information of the target point corresponding to each target element in the generated sparse matrix;
combine the coordinate information of the target points corresponding to the target elements in the sparse matrix to determine the position of the object to be identified in the target scene.
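The mapping from target elements back to scene coordinates can be sketched as follows. This is a minimal illustration assuming a uniform grid in which the element at row y, column x covers the coordinate range [origin + index*cell, origin + (index+1)*cell); the grid origin, cell size, and the bounding-box way of combining the ranges are illustrative assumptions, not details from the source.

```python
import numpy as np

def target_coordinates(sparse, origin=(0.0, 0.0), cell=1.0):
    """Recover, for each target element (value 1) of the sparse matrix, the
    coordinate range of its grid cell, then combine the ranges into the
    overall position (bounding box) of the object in the scene."""
    ys, xs = np.nonzero(sparse)
    cells = [((origin[0] + x * cell, origin[1] + y * cell),
              (origin[0] + (x + 1) * cell, origin[1] + (y + 1) * cell))
             for y, x in zip(ys, xs)]
    x0 = min(c[0][0] for c in cells)
    y0 = min(c[0][1] for c in cells)
    x1 = max(c[1][0] for c in cells)
    y1 = max(c[1][1] for c in cells)
    return cells, (x0, y0, x1, y1)
```

Each target element contributes one cell-sized coordinate range, and the combined box is the position reported for the object.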
In a possible implementation, when determining the position of the object to be identified in the target scene based on the generated sparse matrix, the detection module 803 is configured to:
perform at least one convolution process on each target element in the generated sparse matrix based on a trained convolutional neural network to obtain a convolution result;
determine the position of the object to be identified in the target scene based on the convolution result.
In a possible implementation, the apparatus further includes a control module 804 configured to: after the target point cloud data is detected and the detection result is obtained, control, based on the detection result, the intelligent driving device on which the radar device is installed.
Based on the above apparatus, the point cloud data to be processed, collected by the radar device in the target scene, can be filtered according to the effective perception range information corresponding to the target scene. The filtered target point cloud data is the target point cloud data corresponding to the target scene; therefore, performing detection computation in the target scene on the filtered point cloud data reduces the amount of computation and improves computational efficiency and the utilization of computing resources in the target scene.
For descriptions of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the foregoing method embodiments, which are not detailed here.
Based on the same technical concept, referring to FIG. 9, an embodiment of the present disclosure further provides a computer device including a processor 901, a memory 902 and a bus 903. The memory 902 includes an internal memory 9021 and an external memory 9022 for storing executable instructions; the internal memory 9021 is used to temporarily store operation data in the processor 901 and data exchanged with the external memory 9022 such as a hard disk, and the processor 901 exchanges data with the external memory 9022 through the internal memory 9021. When the computer device 900 runs, the processor 901 and the memory 902 communicate through the bus 903, causing the processor 901 to execute the following instructions:
acquiring point cloud data to be processed, obtained by a radar device scanning a target scene;
screening target point cloud data out of the point cloud data to be processed according to effective perception range information corresponding to the target scene;
detecting the target point cloud data to obtain a detection result.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when run by a processor, the computer program executes the point cloud data processing method described in the method embodiments above. The storage medium may be a volatile or non-volatile computer-readable storage medium.
A computer program product of the point cloud data processing method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the point cloud data processing method described in the method embodiments above. For details, refer to the method embodiments above, which are not repeated here.
An embodiment of the present disclosure further provides a computer program that, when executed by a processor, implements any one of the methods of the foregoing embodiments. The computer program product may be implemented in hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working process of the system and apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。The functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure can be embodied in the form of software products in essence, or the parts that make contributions to the prior art or the parts of the technical solutions. The computer software products are stored in a storage medium, including Several instructions are used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of the present disclosure. The aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes .
最后应说明的是:以上所述实施例,仅为本公开的具体实施方式,用以说明本公开的技术方案,而非对其限制,本公开的保护范围并不局限于此,尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应所述以权利要求的保护范围为准。Finally, it should be noted that the above-mentioned embodiments are only specific implementations of the present disclosure, and are used to illustrate the technical solutions of the present disclosure rather than limit them. The protection scope of the present disclosure is not limited thereto, although referring to the foregoing The embodiments describe the present disclosure in detail, and those skilled in the art should understand that: any person skilled in the art can still modify the technical solutions described in the foregoing embodiments within the technical scope disclosed by the present disclosure. Changes can be easily thought of, or equivalent replacements are made to some of the technical features; and these modifications, changes or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be covered in the present disclosure. within the scope of protection. Therefore, the protection scope of the present disclosure should be based on the protection scope of the claims.

Claims (20)

  1. A point cloud data processing method, comprising:
    obtaining point cloud data to be processed, scanned by a radar apparatus in a target scene;
    screening out target point cloud data from the point cloud data to be processed according to effective perception range information corresponding to the target scene; and
    detecting the target point cloud data to obtain a detection result.
  2. The method according to claim 1, wherein the effective perception range information corresponding to the target scene is determined in the following manner:
    acquiring computing resource information of a processing device that processes the point cloud data to be processed in the target scene; and
    determining, based on the computing resource information, the effective perception range information matching the computing resource information.
  3. The method according to claim 1 or 2, wherein the screening out target point cloud data from the point cloud data to be processed according to the effective perception range information corresponding to the target scene comprises:
    determining an effective coordinate range based on the effective perception range information; and
    screening out the target point cloud data from the point cloud data to be processed based on the effective coordinate range.
  4. The method according to claim 3, wherein the determining an effective coordinate range based on the effective perception range information comprises:
    determining the effective coordinate range corresponding to the target scene based on position information of a reference position point within the effective perception range and coordinate information of the reference position point in the target scene.
  5. The method according to claim 3, wherein the screening out the target point cloud data from the point cloud data to be processed based on the effective coordinate range comprises:
    taking, as a radar scan point of the target point cloud data, each radar scan point in the point cloud data to be processed whose coordinate information lies within the effective coordinate range.
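The screening of claims 3 to 5 amounts to an axis-aligned coordinate filter over the scan points. A minimal NumPy sketch of that step, assuming the points are stored as an (N, 3) array of x/y/z coordinates and the effective coordinate range is given as per-axis lower and upper bounds (all names and shapes here are illustrative, not taken from the application):

```python
import numpy as np

def filter_by_range(points, lo, hi):
    # Keep each radar scan point whose coordinates fall inside the
    # effective coordinate range [lo, hi] on every axis.
    mask = np.all((points >= lo) & (points <= hi), axis=1)
    return points[mask]

pts = np.array([[1.0, 2.0, 0.5],    # inside the range -> kept
                [50.0, 2.0, 0.5],   # x out of range   -> dropped
                [3.0, -1.0, 0.2]])  # y out of range   -> dropped
target = filter_by_range(pts,
                         lo=np.array([0.0, 0.0, 0.0]),
                         hi=np.array([10.0, 10.0, 2.0]))
# target now holds the single in-range point [1.0, 2.0, 0.5]
```

Because the mask is computed in one vectorized pass, the cost is linear in the number of scan points, which matches the claims' motivation of shrinking the workload before detection.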
  6. The method according to claim 4, wherein the coordinate information of the reference position point in the target scene is determined in the following manner:
    acquiring position information of an intelligent driving device on which the radar apparatus is provided;
    determining, based on the position information of the intelligent driving device, a road type of the road on which the intelligent driving device is located; and
    acquiring coordinate information of a reference position point matching the road type as the coordinate information of the reference position point in the target scene.
  7. The method according to claim 1, wherein the detection result comprises a position of an object to be identified in the target scene, and the detecting the target point cloud data to obtain a detection result comprises:
    rasterizing the target point cloud data to obtain a grid matrix, wherein the value of each element in the grid matrix indicates whether a target point exists at the corresponding grid cell;
    generating a sparse matrix corresponding to the object to be identified according to the grid matrix and size information of the object to be identified in the target scene; and
    determining the position of the object to be identified in the target scene based on the generated sparse matrix.
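Claim 7's first step, rasterization into a binary occupancy grid, can be sketched as follows; the cell size, grid shape, and the projection onto x/y only are assumptions made for illustration rather than details from the application:

```python
import numpy as np

def rasterize(points, cell=1.0, grid_shape=(10, 10)):
    # Each grid-matrix element is 1 if at least one target point falls
    # into the corresponding grid cell, and 0 otherwise.
    grid = np.zeros(grid_shape, dtype=np.uint8)
    idx = np.floor(points[:, :2] / cell).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid[idx[inside, 0], idx[inside, 1]] = 1
    return grid

grid = rasterize(np.array([[0.5, 0.5], [3.2, 7.8], [3.4, 7.1]]))
# grid[0, 0] and grid[3, 7] are set; the two nearby points share one cell
```

Note that rasterization deliberately discards per-point detail: two points in the same cell produce a single 1, which is what makes the resulting matrix compact enough for the morphological processing of the later claims.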
  8. The method according to claim 7, wherein the generating a sparse matrix corresponding to the object to be identified according to the grid matrix and the size information of the object to be identified in the target scene comprises:
    performing, according to the grid matrix and the size information of the object to be identified in the target scene, at least one dilation operation or erosion operation on target elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified,
    wherein the value of a target element indicates that a target point exists at the corresponding grid cell.
  9. The method according to claim 8, wherein the dilation operation or the erosion operation comprises:
    shift processing and logical operation processing, and
    a difference between the coordinate range of the sparse matrix and the size of the object to be identified is within a preset threshold range.
  10. The method according to claim 8, wherein the performing, according to the grid matrix and the size information of the object to be identified in the target scene, at least one dilation operation on elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified comprises:
    performing a first inversion operation on the elements in the grid matrix before the current dilation operation to obtain a grid matrix after the first inversion operation;
    performing at least one convolution operation on the grid matrix after the first inversion operation based on a first preset convolution kernel to obtain a grid matrix having a preset sparsity after the at least one convolution operation; and
    performing a second inversion operation on the elements in the grid matrix having the preset sparsity after the at least one convolution operation to obtain the sparse matrix.
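Claim 10's invert/convolve/invert sequence corresponds to the classical morphological duality: dilating a binary image is the same as eroding its complement and complementing the result. A sketch under the assumption of a binary grid and a 3x3 all-ones kernel; the helper names and the 'same'-padded windowed convolution are illustrative choices, not the application's exact kernel:

```python
import numpy as np

def erode(m, k):
    # Binary erosion via a padded sliding window: an output element is 1
    # only if every element under the kernel is 1. Padding with 1 treats
    # everything outside the grid as background of the inverted matrix.
    pad = k.shape[0] // 2
    p = np.pad(m, pad, constant_values=1)
    out = np.zeros_like(m)
    for i in range(m.shape[0]):
        for j in range(m.shape[1]):
            win = p[i:i + k.shape[0], j:j + k.shape[1]]
            out[i, j] = 1 if (win * k).sum() == k.sum() else 0
    return out

def dilate_by_double_inversion(grid, kernel):
    inv = 1 - grid                # first inversion operation
    shrunk = erode(inv, kernel)   # convolution with the preset kernel
    return 1 - shrunk             # second inversion yields the dilation

g = np.zeros((5, 5), dtype=np.uint8)
g[2, 2] = 1                                        # a single target element
d = dilate_by_double_inversion(g, np.ones((3, 3), dtype=np.uint8))
# d is a 3x3 block of ones centred on (2, 2)
```

Growing the single occupied cell into a block of roughly the object's footprint is what lets the later claims compare the sparse matrix's coordinate range against the object's size.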
  11. The method according to claim 10, wherein the performing a first inversion operation on the elements in the grid matrix before the current dilation operation to obtain a grid matrix after the first inversion operation comprises:
    performing, based on a second preset convolution kernel, a convolution operation on elements other than the target elements in the grid matrix before the current dilation operation to obtain first inverted elements;
    performing, based on the second preset convolution kernel, a convolution operation on the target elements in the grid matrix before the current dilation operation to obtain second inverted elements; and
    obtaining the grid matrix after the first inversion operation based on the first inverted elements and the second inverted elements.
  12. The method according to claim 10 or 11, wherein the performing at least one convolution operation on the grid matrix after the first inversion operation based on the first preset convolution kernel to obtain a grid matrix having a preset sparsity after the at least one convolution operation comprises:
    for the first convolution operation, performing a convolution operation on the grid matrix after the first inversion operation and the first preset convolution kernel to obtain a grid matrix after the first convolution operation; and
    repeating the step of performing a convolution operation on the grid matrix after the previous convolution operation and the first preset convolution kernel to obtain a grid matrix after the current convolution operation, until the grid matrix having the preset sparsity is obtained.
  13. The method according to claim 12, wherein the first preset convolution kernel has a weight matrix and an offset corresponding to the weight matrix, and the performing, for the first convolution operation, a convolution operation on the grid matrix after the first inversion operation and the first preset convolution kernel to obtain a grid matrix after the first convolution operation comprises:
    for the first convolution operation, selecting each grid sub-matrix from the grid matrix after the first inversion operation according to the size of the first preset convolution kernel and a preset stride;
    for each selected grid sub-matrix, performing a product operation on the grid sub-matrix and the weight matrix to obtain a first operation result;
    performing an addition operation on the first operation result and the offset to obtain a second operation result; and
    determining the grid matrix after the first convolution operation based on the second operation results corresponding to the respective grid sub-matrices.
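Claim 13 spells the convolution out elementwise: slide the kernel by a preset stride, take the elementwise product of each grid sub-matrix with the weight matrix (the first operation result), add the offset (the second operation result), and assemble the outputs into the next grid matrix. A direct sketch, with stride and shapes chosen only for illustration:

```python
import numpy as np

def conv_step(grid, weight, bias, stride=1):
    kh, kw = weight.shape
    out_h = (grid.shape[0] - kh) // stride + 1
    out_w = (grid.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Select the grid sub-matrix at this kernel position.
            sub = grid[i * stride:i * stride + kh,
                       j * stride:j * stride + kw]
            first = (sub * weight).sum()  # product with the weight matrix
            out[i, j] = first + bias      # addition with the offset
    return out

result = conv_step(np.ones((3, 3)), weight=np.ones((2, 2)), bias=1.0)
# every 2x2 window of ones sums to 4; adding the offset gives 5 everywhere
```

This is the standard single-channel cross-correlation with bias; the claim's "first" and "second" operation results map onto the two lines inside the inner loop.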
  14. The method according to claim 8, wherein the performing, according to the grid matrix and the size information of the object to be identified in the target scene, at least one erosion operation on elements in the grid matrix to generate the sparse matrix corresponding to the object to be identified comprises:
    performing at least one convolution operation on the grid matrix to be processed based on a third preset convolution kernel to obtain a grid matrix having a preset sparsity after the at least one convolution operation; and
    determining the grid matrix having the preset sparsity after the at least one convolution operation as the sparse matrix corresponding to the object to be identified.
  15. The method according to any one of claims 7 to 14, wherein
    the rasterizing the target point cloud data to obtain a grid matrix comprises:
    rasterizing the target point cloud data to obtain the grid matrix and a correspondence between each element in the grid matrix and coordinate range information of each target point; and
    the determining, based on the generated sparse matrix, the position range of the object to be identified in the target scene comprises:
    determining, based on the correspondence between each element in the grid matrix and the coordinate range information of each target point, coordinate information of the target point corresponding to each target element in the generated sparse matrix; and
    combining the coordinate information of the target points corresponding to the respective target elements in the sparse matrix to determine the position of the object to be identified in the target scene.
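Claim 15's lookup from sparse-matrix elements back to scene coordinates can be sketched by retaining the cell-to-coordinate mapping from rasterization; here the mapping is assumed to be a uniform cell size with cell-centre coordinates, which is an illustrative choice rather than something the claim specifies:

```python
import numpy as np

def element_coordinates(sparse, cell=1.0):
    # Look up the coordinate (cell centre) of every target element and
    # combine them into the object's position in the scene.
    rows, cols = np.nonzero(sparse)
    return np.stack([(rows + 0.5) * cell, (cols + 0.5) * cell], axis=1)

m = np.zeros((4, 5), dtype=np.uint8)
m[0, 0] = 1
m[2, 3] = 1
coords = element_coordinates(m, cell=2.0)
# coords -> [[1.0, 1.0], [5.0, 7.0]]
```

In practice the combined coordinates could then be summarized, for example by their bounding box or centroid, to report a single position for the object.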
  16. The method according to any one of claims 7 to 15, wherein the determining the position of the object to be identified in the target scene based on the generated sparse matrix comprises:
    performing at least one convolution process on each target element in the generated sparse matrix based on a trained convolutional neural network to obtain a convolution result; and
    determining the position of the object to be identified in the target scene based on the convolution result.
  17. The method according to any one of claims 1 to 16, wherein, after the detecting the target point cloud data to obtain a detection result, the method further comprises:
    controlling, based on the detection result, an intelligent driving device provided with the radar apparatus.
  18. A point cloud data processing apparatus, comprising:
    an acquisition module configured to obtain point cloud data to be processed, scanned by a radar apparatus in a target scene;
    a screening module configured to screen out target point cloud data from the point cloud data to be processed according to effective perception range information corresponding to the target scene; and
    a detection module configured to detect the target point cloud data to obtain a detection result.
  19. A computer device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus; and the machine-readable instructions, when executed by the processor, perform the point cloud data processing method according to any one of claims 1 to 17.
  20. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when run by a processor, performs the point cloud data processing method according to any one of claims 1 to 17.
PCT/CN2021/102856 2020-07-22 2021-06-28 Method and apparatus for processing point cloud data WO2022017133A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022514581A JP2022547873A (en) 2020-07-22 2021-06-28 Point cloud data processing method and device
KR1020227007394A KR20220044777A (en) 2020-07-22 2021-06-28 Point cloud data processing method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010713989.6A CN113971694A (en) 2020-07-22 2020-07-22 Point cloud data processing method and device
CN202010713989.6 2020-07-22

Publications (1)

Publication Number Publication Date
WO2022017133A1 true WO2022017133A1 (en) 2022-01-27

Family

ID=79585066

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102856 WO2022017133A1 (en) 2020-07-22 2021-06-28 Method and apparatus for processing point cloud data

Country Status (4)

Country Link
JP (1) JP2022547873A (en)
KR (1) KR20220044777A (en)
CN (1) CN113971694A (en)
WO (1) WO2022017133A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114664092A (en) * 2022-05-18 2022-06-24 阿里巴巴达摩院(杭州)科技有限公司 Traffic event detection system, event detection method and device

Citations (4)

Publication number Priority date Publication date Assignee Title
US20120206438A1 (en) * 2011-02-14 2012-08-16 Fatih Porikli Method for Representing Objects with Concentric Ring Signature Descriptors for Detecting 3D Objects in Range Images
CN106570454A (en) * 2016-10-10 2017-04-19 同济大学 Pedestrian traffic parameter extraction method based on mobile laser scanning
CN109649395A (en) * 2018-12-29 2019-04-19 驭势科技(北京)有限公司 A kind of automatic Pilot method of adjustment and mobile unit based on computing resource
CN109840448A (en) * 2017-11-24 2019-06-04 百度在线网络技术(北京)有限公司 Information output method and device for automatic driving vehicle


Also Published As

Publication number Publication date
JP2022547873A (en) 2022-11-16
KR20220044777A (en) 2022-04-11
CN113971694A (en) 2022-01-25

Similar Documents

Publication Publication Date Title
CN109541634B (en) Path planning method and device and mobile device
CN111665842B (en) Indoor SLAM mapping method and system based on semantic information fusion
Cha et al. Extended Hough transform for linear feature detection
JP6111745B2 (en) Vehicle detection method and apparatus
WO2022017131A1 (en) Point cloud data processing method and device, and intelligent driving control method and device
CN108399424B (en) Point cloud classification method, intelligent terminal and storage medium
CN109540023B (en) Object surface depth value measurement method based on two-value grid coding formwork structure light
CN111476242A (en) Laser point cloud semantic segmentation method and device
WO2022016942A1 (en) Target detection method and apparatus, electronic device, and storage medium
US11182942B2 (en) Map generation system and method for generating an accurate building shadow
CN114648640B (en) Target object monomer method, device, equipment and storage medium
CN114140758A (en) Target detection method and device and computer equipment
WO2022017133A1 (en) Method and apparatus for processing point cloud data
Brandt et al. Efficient binocular stereo correspondence matching with 1-D max-trees
Zou et al. Efficient urban-scale point clouds segmentation with bev projection
CN114358246A (en) Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene
CN108345007B (en) Obstacle identification method and device
CN115620263B (en) Intelligent vehicle obstacle detection method based on image fusion of camera and laser radar
CN111696147A (en) Depth estimation method based on improved YOLOv3 model
CN108805896B (en) Distance image segmentation method applied to urban environment
CN114359222B (en) Arbitrary polygonal target detection method, electronic equipment and storage medium
WO2022017129A1 (en) Target object detection method and apparatus, electronic device, and storage medium
Mukhaimar et al. Comparative analysis of 3D shape recognition in the presence of data inaccuracies
Ko et al. Stereo matching using census transform of adaptive window sizes with gradient images
WO2022017134A1 (en) Method and apparatus for processing point cloud data, and electronic device and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase (Ref document number: 2022514581; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20227007394; Country of ref document: KR; Kind code of ref document: A)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21845387; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21845387; Country of ref document: EP; Kind code of ref document: A1)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10.07.2023))