CN114445802A - Point cloud processing method and device and vehicle


Info

Publication number
CN114445802A
Authority
CN
China
Prior art keywords
three-dimensional point cloud
points
filtered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210110210.0A
Other languages
Chinese (zh)
Inventor
鞠波
叶晓青
谭啸
孙昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210110210.0A
Publication of CN114445802A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques

Abstract

The present disclosure provides a point cloud processing method and apparatus, and a vehicle, relating to the field of artificial intelligence, in particular to computer vision, 3D vision, and deep learning. The method includes: acquiring a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud; performing semantic segmentation on the two-dimensional image to obtain a category label for each pixel point, the category label representing the category of the obstacle corresponding to that pixel point; and filtering the three-dimensional point cloud according to the category labels to obtain a filtered three-dimensional point cloud. This avoids the limited coverage of outlier removal in the related art and improves filtering comprehensiveness, avoids the low accuracy of foreground-background segmentation in the related art, and improves the accuracy, reliability, and robustness of the filtering.

Description

Point cloud processing method and device and vehicle
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular to computer vision, 3D vision, and deep learning, applicable to smart city and intelligent transportation scenarios; specifically, it relates to a point cloud processing method and apparatus, and a vehicle.
Background
An automatic driving system is a system that can operate a vehicle automatically and safely, without any active human operation, through the cooperation of artificial intelligence, visual computing, radar, monitoring devices, and a global positioning system.
The radar can acquire point clouds, and the automatic driving system can detect target objects such as obstacles based on the point clouds. To improve the safety and reliability of vehicle driving, the point clouds generally need to be filtered. A commonly used filtering method in the related art is: determining outliers in the point cloud and removing the outliers from the point cloud.
However, this method rests on the assumption that noise consists of outliers and can remove only some of the noise points, so the reliability of the filtering is low.
Disclosure of Invention
The present disclosure provides a point cloud processing method, apparatus, and vehicle for improving the filtering reliability of three-dimensional point clouds.
According to a first aspect of the present disclosure, there is provided a point cloud processing method, the method comprising:
acquiring a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud;
performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, wherein the category label is used for representing the category of an obstacle corresponding to the pixel point;
and filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
According to a second aspect of the present disclosure, there is provided a point cloud processing apparatus, the apparatus comprising:
an acquisition unit, configured to acquire a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud;
the segmentation unit is used for performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, wherein the category label is used for representing the category of an obstacle corresponding to the pixel point;
and the filtering unit is used for filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method according to the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product, comprising: a computer program stored in a readable storage medium, from which at least one processor of an electronic device can read the computer program, the at least one processor executing the computer program to cause the electronic device to perform the method of the first aspect.
According to a sixth aspect of the present disclosure, there is provided a vehicle, comprising: the apparatus of the second aspect.
According to the point cloud processing method and apparatus and the vehicle of the present disclosure, category labels of pixel points are determined and the three-dimensional point cloud is filtered based on those category labels. This avoids the low filtering comprehensiveness caused by merely removing outliers in the related art, improving comprehensiveness; it also avoids the low accuracy caused by foreground-background segmentation in the related art, improving the accuracy and reliability of the filtering as well as its robustness.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a point cloud processing method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 5 is a scene diagram of a point cloud processing method in which embodiments of the present disclosure may be implemented;
FIG. 6 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 7 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 8 is a schematic diagram according to a sixth embodiment of the present disclosure;
fig. 9 is a block diagram of an electronic device for implementing a point cloud processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
An automatic driving system is a system for automatically and safely operating a vehicle; that is, based on the automatic driving system, safe and automatic driving of the vehicle can be realized. The automatic driving system is realized by the cooperation of different devices or equipment, such as radar, which plays a key role in the automatic driving system.
Radar can be divided into different types, such as laser radar (lidar) and millimeter-wave radar. By means of radar, the automatic driving system can accurately perform real-time three-dimensional (3D) modeling of the environment where the vehicle is located, especially in scenarios where visual computing fails (such as fog, night, or white vehicles), improving the safety with which the automatic driving system controls the vehicle; it can also accurately perceive 3D target objects, such as the positions, sizes, and postures of vehicles and pedestrians. That is, by means of radar, the target detection task of detecting a target object can be achieved.
For example, the radar may collect a point cloud (the point cloud refers to a point data set on the appearance surface of the target object), and the automatic driving system may perform target detection based on the point cloud, so as to obtain the position, size, posture, and the like of the target object.
However, due to weather, environment and other reasons, noise (noise data or abnormal data) which is not related to target detection may exist in the point cloud acquired by the radar, and therefore, the noise needs to be removed, that is, the point cloud needs to be filtered.
In some embodiments, the point cloud may be filtered by "removing outliers," where outliers refer to extremely large and/or extremely small values far from the general values. For example:
For each point in the point cloud, calculate the average distance from that point to its K nearest points, obtaining an average distance for each point. The average distances of all points in the point cloud form a Gaussian (normal) distribution; a standard deviation (σ) is determined from a preset mean and variance, and extremely large and/or small values are removed from the point cloud accordingly, for example by rejecting points beyond 3σ.
The mean and the variance may be determined based on a demand, a history, an experiment, and the like, which is not limited in this embodiment.
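For concreteness, this related-art removal can be sketched as follows; the use of NumPy/SciPy and the parameter defaults are illustrative assumptions, not part of the original description:

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_statistical_outliers(points: np.ndarray, k: int = 8,
                                n_sigma: float = 3.0) -> np.ndarray:
    """Remove points whose mean distance to their k nearest neighbors is extreme.

    points: (N, 3) array of 3D coordinates.
    """
    tree = cKDTree(points)
    # Query k+1 neighbors because each point's nearest neighbor is itself.
    dists, _ = tree.query(points, k=k + 1)
    mean_dists = dists[:, 1:].mean(axis=1)             # per-point mean k-NN distance
    mu, sigma = mean_dists.mean(), mean_dists.std()    # the Gaussian described above
    keep = np.abs(mean_dists - mu) <= n_sigma * sigma  # reject points beyond 3 sigma
    return points[keep]
```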
However, when the point cloud is filtered in this way, the noise is assumed to consist of outliers; relatively speaking, only some local noise can be removed, while noise from large areas of irrelevant background cannot be covered well.
In other embodiments, the point cloud may be filtered in a "foreground and background segmentation" manner, for example:
determining the foreground point cloud and the background point cloud within the point cloud, and removing the background point cloud from the point cloud to realize the filtering of the point cloud.
The point cloud can be segmented by adopting a segmentation network model to obtain a foreground point cloud and a background point cloud in the point cloud.
However, when the point cloud is filtered in this way, especially when the difference between the foreground and the background is small, the accuracy of the filtering process is low.
In some embodiments, to counter noise in the point cloud, generalization capability may be enhanced through data augmentation; that is, the point cloud is not filtered, but is instead augmented to resist the noise.
Obviously, this approach does not actually remove the noise from the point cloud, and when target detection is performed on the augmented point cloud, detection accuracy may be low because the noise is poorly suppressed.
In order to avoid at least one of the above technical problems, the inventors of the present disclosure have made creative efforts to obtain the inventive concept of the present disclosure: the method comprises the steps of obtaining a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud, determining a class label of a pixel in the two-dimensional image (namely an obstacle class corresponding to the pixel), and filtering the three-dimensional point cloud based on the class label.
Based on this inventive concept, the present disclosure provides a point cloud processing method and apparatus, and a vehicle, applied to the field of artificial intelligence, in particular computer vision, 3D vision, and deep learning, and usable in smart city and intelligent transportation scenarios, so as to achieve reliable and effective point cloud filtering.
Fig. 1 is a schematic diagram according to a first embodiment of the disclosure, and as shown in fig. 1, a method of point cloud processing of an embodiment of the disclosure includes:
s101: and acquiring the three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud.
For example, the executing subject of the present embodiment may be a point cloud processing device, and the point cloud processing device may be a computer, a server, an on-board terminal, a processor, a chip, and the like disposed in a vehicle, and the present embodiment is not limited thereto.
In some embodiments, a radar (e.g., a lidar as described in the above examples) may be disposed on the vehicle, and accordingly, the radar may acquire a three-dimensional point cloud (i.e., a 3D point cloud) and transmit the three-dimensional point cloud to the point cloud processing device.
Similarly, the vehicle may further be provided with an image acquisition device (such as a camera), and correspondingly, the image acquisition device may acquire a two-dimensional image (i.e., an RGB image) and transmit the two-dimensional image to the point cloud processing device.
It should be noted that the two-dimensional image is a two-dimensional image corresponding to the three-dimensional point cloud, that is, the target object represented by the two-dimensional image and the target object represented by the three-dimensional point cloud are the same target object.
For example, during the driving of the vehicle, the radar can collect a three-dimensional point cloud of the scene in front of the vehicle, and the image acquisition device can likewise collect a two-dimensional image of the scene in front of the vehicle. The three-dimensional point cloud represents the appearance surface of the target object as a set of point data; the two-dimensional image represents the appearance surface of the target object as pixel points.
S102: and performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image.
The category label is used for representing the category of the obstacle corresponding to the pixel point.
Semantic segmentation refers to segmenting the two-dimensional image according to its semantics, labeling each pixel point in the two-dimensional image with an obstacle category, so as to obtain the obstacle category corresponding to each pixel point.
Illustratively, the two-dimensional image includes a plurality of pixel points; different pixel points may represent points on different target objects or points on the same target object. For each pixel point, the corresponding obstacle category can be determined, that is, the obstacle category of the target object to which the pixel point belongs.
The categories of obstacles include, but are not limited to: vehicles (e.g., other vehicles as target objects), pedestrians, and walls.
S103: and filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
That is, the three-dimensional point cloud is filtered according to obstacle category. Filtering the three-dimensional point cloud by obstacle category merges semantics into the filtering, i.e., the three-dimensional point cloud is filtered on a semantic basis. This makes the filtering relatively more reliable, avoids omissions in the filtering, and also prevents reliable, valid data from being filtered out, thereby improving the accuracy and comprehensiveness of the filtering.
Based on the above analysis, the embodiment of the present disclosure provides a point cloud processing method, including: the method comprises the steps of obtaining a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud, performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, wherein the category label is used for representing the category of an obstacle corresponding to the pixel point, and performing filtering processing on the three-dimensional point cloud according to the category label to obtain a filtered three-dimensional point cloud.
Fig. 2 is a schematic diagram according to a second embodiment of the disclosure, and as shown in fig. 2, the point cloud processing method of the embodiment of the disclosure includes:
s201: and acquiring the three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud.
It should be understood that, in order to avoid redundant explanation, the technical features of the present embodiment that are the same as those of the above-described embodiment are not described again in this embodiment.
S202: and performing semantic segmentation processing on the two-dimensional image based on a pre-trained semantic segmentation model to obtain a category label of a pixel point in the two-dimensional image.
The semantic segmentation model is generated based on a sample data set, and the sample data set comprises a plurality of sample two-dimensional images.
Similarly, the number of the sample two-dimensional images may be determined based on requirements, history, experiments, and the like, and the embodiment is not limited.
For example, for scenes with relatively high semantic segmentation processing requirements, the number of sample two-dimensional images may be relatively large; conversely, for scenes with relatively low semantic segmentation processing requirements, the number of sample two-dimensional images may be relatively small.
The sample two-dimensional images may be images acquired while vehicles travel on the road, and a base network is trained with the sample data set to generate the semantic segmentation model.
As shown in fig. 3, after the two-dimensional image is acquired, the two-dimensional image may be input to a semantic segmentation model and a category label may be output.
In this embodiment, by training the semantic segmentation model in advance to determine the category label of the pixel point in the two-dimensional image based on the semantic segmentation model, the efficiency and accuracy of determining the category label can be improved, and further, when the three-dimensional point cloud is filtered based on the category label, the technical effects of reliability and effectiveness of filtering processing are improved.
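As a hedged sketch of this step, the snippet below runs a pre-trained segmentation network over an RGB image to obtain a per-pixel category label; DeepLabV3 from torchvision is only an illustrative stand-in, since the disclosure does not name a concrete architecture:

```python
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

# DeepLabV3 stands in for the pre-trained semantic segmentation model;
# the disclosure leaves the architecture unspecified.
model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def pixel_category_labels(image) -> torch.Tensor:
    """Return an (H, W) tensor of category labels, one per pixel of the RGB image."""
    x = preprocess(image).unsqueeze(0)       # (1, 3, H, W)
    with torch.no_grad():
        logits = model(x)["out"]             # (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0)   # category id per pixel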
S203: and converting the three-dimensional point cloud into a two-dimensional point cloud with a first coordinate system where the two-dimensional image is located as a reference.
The "first" in the first coordinate system is used for distinguishing from a second coordinate system hereinafter, and the first coordinate system and the second coordinate system are coordinate systems based on different reference systems. The first coordinate system is a coordinate system based on a reference system where the two-dimensional image is located, and the second coordinate system is a coordinate system based on a reference system where the three-dimensional point cloud is located.
With reference to the foregoing embodiment, if the two-dimensional image is acquired based on the image acquisition device, the first coordinate system may be a camera coordinate system; the three-dimensional point cloud may be acquired based on radar, and then the second coordinate system may be a radar coordinate system.
Accordingly, this step can be understood as: converting the three-dimensional coordinates of each point in the three-dimensional point cloud based on the radar coordinate system into points of two-dimensional coordinates based on the camera coordinate system, thereby obtaining the two-dimensional point cloud based on the camera coordinate system.
In some embodiments, S203 may include: and projecting the three-dimensional point cloud to a first coordinate system based on the projection matrix to obtain the two-dimensional point cloud.
The projection matrix is used for representing the coordinate conversion relationship between the first coordinate system and the second coordinate system, the second coordinate system being the coordinate system where the three-dimensional point cloud is located.
As shown in fig. 3, after the three-dimensional point cloud is acquired, coordinate conversion is performed to obtain a two-dimensional point cloud.
In combination with the above analysis, in one example, a projection matrix may be constructed in advance, so that when point cloud filtering is required, filtering processing is performed on the three-dimensional point cloud based on the constructed projection matrix.
For example: acquire the installation position information of the image acquisition device on the vehicle, and determine the camera coordinate system according to it; acquire the installation position information of the radar on the vehicle, and determine the radar coordinate system according to it; and determine the coordinate deviation between the camera coordinate system and the radar coordinate system, from which the coordinate conversion relationship between the two coordinate systems is determined.
Accordingly, the coordinate based on the radar coordinate system may be converted into the coordinate based on the camera coordinate system, or the coordinate based on the camera coordinate system may be converted into the coordinate based on the radar coordinate system.
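A minimal sketch of this construction, under the usual pinhole-camera assumption (K, R, and t are assumed calibration inputs, not values given by the disclosure):

```python
import numpy as np

def build_projection_matrix(K: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Compose a 3x4 projection matrix from camera intrinsics K (3x3) and the
    radar-to-camera extrinsic rotation R (3x3) and translation t (3,)."""
    Rt = np.hstack([R, t.reshape(3, 1)])  # (3, 4) extrinsic matrix [R | t]
    return K @ Rt                         # maps homogeneous radar coords to pixels
```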
In another example, when point cloud filtering is required, a projection matrix may be constructed to filter the three-dimensional point cloud based on the constructed projection matrix.
In this embodiment, converting the three-dimensional point cloud into the two-dimensional point cloud via the projection matrix improves the effectiveness and reliability of the conversion. In particular, when the projection matrix is constructed in advance, the conversion is more efficient and the resources of constructing a projection matrix each time are saved.
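The projection itself can then be sketched as below, assuming a 3x4 projection matrix P such as the one composed above:

```python
import numpy as np

def project_to_camera(points_3d: np.ndarray, P: np.ndarray):
    """Project a 3D point cloud (radar frame) into 2D pixel coordinates.

    points_3d: (N, 3) coordinates in the second (radar) coordinate system.
    P: (3, 4) projection matrix encoding the radar-to-camera conversion.
    Returns (uv, in_front): the (M, 2) projected points and the mask of the
    points that lie in front of the image plane.
    """
    homo = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # (N, 4)
    uvw = homo @ P.T                                             # (N, 3)
    depth = uvw[:, 2]
    in_front = depth > 0                 # discard points behind the camera
    uv = uvw[in_front, :2] / depth[in_front, None]  # perspective division
    return uv, in_front
```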
S204: and filtering the two-dimensional point cloud according to the category label to obtain the filtered two-dimensional point cloud.
That is to say, the filtering of the three-dimensional point cloud can be implemented via the filtering of the two-dimensional point cloud. Since the category labels correspond to pixel points in the two-dimensional image, and the two-dimensional point cloud and the two-dimensional image are based on the same coordinate system, filtering the two-dimensional point cloud based on the category labels is reliable and effective, which in turn makes the filtering of the three-dimensional point cloud comprehensive and reliable.
In some embodiments, S204 may include the steps of:
the first step is as follows: and determining the corresponding relation between the pixel point and each two-dimensional point in the two-dimensional point cloud.
A pixel point and a two-dimensional point that have a corresponding relationship are the same point on the same obstacle.
The correspondence between pixel points and two-dimensional points may be determined by matching, such as similarity comparison; in particular, it may be determined by comparing coordinate similarity.
Illustratively, there are multiple pixel points, and each pixel point and each two-dimensional point has coordinates; the pixel point and two-dimensional point in the closest distance relationship are determined according to their respective coordinates, and that pixel point and two-dimensional point are determined to be the same point on the same obstacle.
For example, the number of the pixel points is N, the number of the two-dimensional points is M, each pixel point has a coordinate, each two-dimensional point also has a coordinate, for each pixel point in the pixel points, the distance between the coordinate of the pixel point and the coordinate of each two-dimensional point in the M two-dimensional points is calculated, the minimum distance is obtained from each calculated distance, and the pixel point and the two-dimensional point corresponding to the minimum distance are determined as the same point on the same obstacle.
In this embodiment, since the two-dimensional point cloud and the two-dimensional image are in the same coordinate system, the distance between the coordinates of the two-dimensional point and the coordinates of the pixel point is compared to determine the corresponding relationship between the two-dimensional point and the pixel point, so that the corresponding relationship has the technical effects of higher accuracy and reliability.
In some embodiments, the two-dimensional points may be preprocessed before determining the correspondence between the two-dimensional points and the pixel points. The preprocessing may be "rounding processing" or "interpolation processing".
The rounding process is exemplarily described as follows:
and if the coordinates of the two-dimensional points are non-integer coordinates, converting the coordinates of the two-dimensional points which are non-integer coordinates into integer coordinates.
For example, it is determined whether the coordinates of any two-dimensional point are integer coordinates, that is, whether the value of the coordinates of the two-dimensional point is an integer value, and if not, that is, if the coordinates of the two-dimensional point are decimal coordinate values, the decimal coordinate values are converted into integer coordinate values, that is, the integer coordinates of the two-dimensional point are obtained.
When the three-dimensional point cloud is converted into the two-dimensional point cloud, the coordinates of some two-dimensional points may be decimal values, whereas the coordinates of pixel points are usually integers. Therefore, to improve the reliability of matching, the non-integer coordinates of two-dimensional points can be converted to integer values before being matched against pixel coordinates, i.e., before distances are calculated. This improves calculation accuracy, so that the determined correspondence between two-dimensional points and pixel points is more effective and reliable.
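Because pixels lie on a regular integer grid, rounding a projected coordinate to the nearest integer pixel realizes exactly this minimum-distance match; a sketch follows, in which the (u, v) ordering and the clipping behavior at the image border are assumptions:

```python
import numpy as np

def labels_for_2d_points(uv: np.ndarray, label_map: np.ndarray) -> np.ndarray:
    """Transfer per-pixel category labels to projected 2D points.

    uv: (M, 2) projected, possibly non-integer, (u, v) point coordinates.
    label_map: (H, W) array of per-pixel category ids from semantic segmentation.
    """
    h, w = label_map.shape
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)  # u -> column index
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)  # v -> row index
    return label_map[rows, cols]
```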
The following is exemplarily described by taking "interpolation processing" as an example:
and performing two-dimensional point interpolation processing on the two-dimensional point cloud according to the corresponding coordinates of each two-dimensional point, namely inserting more two-dimensional points into the two-dimensional point cloud.
Wherein, an interpolation parameter can be determined based on the sparsity degree between the two-dimensional points, so as to insert more two-dimensional points in the two-dimensional point cloud based on the interpolation parameter. For example, the greater the degree of sparsity, the relatively greater the number of two-dimensional points inserted; conversely, the smaller the degree of sparsity, the smaller the number of two-dimensional points inserted.
Similarly, in this embodiment, interpolation gives the two-dimensional point cloud more two-dimensional points, so that when distances are calculated, two-dimensional points whose coordinates are closer to the pixel points can be found, again making the determined correspondence between two-dimensional points and pixel points more effective and reliable.
The second step is as follows: and removing the two-dimensional points in the two-dimensional point cloud according to the category labels and the corresponding relation to obtain the filtered two-dimensional point cloud.
Because a pixel point and a two-dimensional point in a corresponding relationship are the same point on the same obstacle, the category label of a pixel point is equivalent to the category attribute of the two-dimensional point corresponding to that pixel point.
Illustratively, if the pixel point 1 and the two-dimensional point 1 are the same point on the same obstacle, that is, the pixel point 1 and the two-dimensional point 1 have a corresponding relationship, and correspondingly, the category attribute of the pixel point 1 is the category attribute of the two-dimensional point 1.
For example, if the same obstacle is a pedestrian, the category attribute of pixel point 1 is pedestrian, and the category attribute of two-dimensional point 1 is also pedestrian.
Therefore, when the two-dimensional points are filtered by combining the category labels and the corresponding relations, the filtering process can have the technical effects of higher reliability and accuracy.
In some embodiments, the second step may comprise the sub-steps of:
the first substep: and determining the removal probability corresponding to each two-dimensional point according to the class label and the corresponding relation.
Illustratively, a mapping relationship between category labels and removal probabilities may be constructed. Following the above embodiments: in the mapping relationship, the category label "vehicle" may correspond to a removal probability of 0.1; "pedestrian" to a removal probability of 0.2; "wall" to a removal probability of 0.9; and so on, not exhaustively listed here.
It should be understood that, in the above mapping relationship, the correspondence between the category label and the removal probability is only used for exemplary illustration, and is not to be understood as a limitation to the mapping relationship, and the mapping relationship may be determined based on a requirement, a history, and a test, and the embodiment is not limited thereto.
For example, a candidate probability data set may be configured in advance, and experiments may be performed in combination with radar attribute information (e.g., the number of scan lines of the radar) to determine the removal probabilities from the candidate probability data set.
The second sub-step: and removing each two-dimensional point according to the removal probability to obtain the filtered two-dimensional point cloud.
As shown in fig. 3, the point cloud processing apparatus includes a Random Die Out module; the category labels and the two-dimensional point cloud are input to the Random Die Out module, and the filtered two-dimensional point cloud is output.
Combining the above analysis, the category label of a pixel point can be understood as the category label of the two-dimensional point corresponding to that pixel point. Accordingly, the removal probability of a two-dimensional point can be determined from the category label, and the two-dimensional point is removed according to that probability, which improves the effectiveness and reliability of the removal processing.
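A minimal sketch of such a random die-out step follows; the probability table merely echoes the illustrative 0.1/0.2/0.9 values above and would be tuned in practice:

```python
import numpy as np

# Illustrative category-to-removal-probability mapping (values from the
# example above; the actual mapping is left open by the disclosure).
REMOVAL_PROB = {"vehicle": 0.1, "pedestrian": 0.2, "wall": 0.9}

def random_die_out(uv: np.ndarray, point_labels, rng=None):
    """Randomly remove each 2D point with the removal probability of its category.

    Returns the filtered 2D points and the boolean keep-mask.
    """
    rng = rng or np.random.default_rng()
    p = np.array([REMOVAL_PROB.get(label, 0.0) for label in point_labels])
    keep = rng.random(len(uv)) >= p   # each point survives with probability 1 - p
    return uv[keep], keep
```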
S205: and converting the filtered two-dimensional point cloud into a filtered three-dimensional point cloud with a second coordinate system of the three-dimensional point cloud as a reference.
As shown in fig. 3, the filtered two-dimensional point cloud output by the random extinction module is subjected to coordinate transformation to obtain a filtered three-dimensional point cloud.
Similarly, the filtered two-dimensional point cloud can be converted into the filtered three-dimensional point cloud in a projection matrix-based manner, and the implementation principle of the method is the same as that of converting the three-dimensional point cloud into the two-dimensional point cloud, and is not repeated here.
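Chaining the earlier sketches shows one way this inverse conversion can be realized without algebraically inverting the projection (which would require per-pixel depth); points_3d, P, label_map, and CLASS_NAMES are assumed inputs:

```python
# CLASS_NAMES is a hypothetical id -> name table matching the segmentation
# model's output classes. Every surviving 2D point traces back to exactly one
# original 3D point, so the inverse conversion reduces to index masking.
uv, in_front = project_to_camera(points_3d, P)
class_ids = labels_for_2d_points(uv, label_map)
point_labels = [CLASS_NAMES[int(i)] for i in class_ids]
_, keep = random_die_out(uv, point_labels)
filtered_points_3d = points_3d[in_front][keep]
```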
Fig. 4 is a schematic diagram according to a third embodiment of the present disclosure, and as shown in fig. 4, the point cloud processing method of the embodiment of the present disclosure includes:
s401: and acquiring the three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud.
Similarly, in order to avoid redundant description, the technical features of the present embodiment that are the same as those of the above embodiments are not described again in this embodiment.
S402: and performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image.
The category label is used for representing the category of the obstacle corresponding to the pixel point.
S403: and filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
S404: and carrying out target detection processing on the filtered three-dimensional point cloud to obtain obstacle information corresponding to the filtered three-dimensional point cloud.
In some embodiments, the filtered three-dimensional point cloud may be subjected to target detection processing based on a pre-trained target detection network model, so as to obtain obstacle information corresponding to the filtered three-dimensional point cloud.
Similarly, the target detection network model may also be obtained based on sample data training, and this embodiment of the specific method for training the target detection network model is not limited.
As shown in fig. 3, the filtered three-dimensional point cloud is input to the target detection network model, and a detection result (i.e., obstacle information) is output. The detection result may be a result of the obstacle being a pedestrian, a vehicle, or the like.
Combining the above analysis, the filtered three-dimensional point cloud has higher reliability and accuracy, the irrelevant noise points having been filtered out, so the obstacle information determined based on the filtered three-dimensional point cloud has higher accuracy and reliability.
It should be understood that the execution body of S404 may be the same as or different from the execution bodies of S401-S403. For example, taking the execution subject of S404 and the execution subjects of S401-S403 as different execution subjects as an example:
the executing body of the step S404 may be a detecting device, both the detecting device and the point cloud processing device are disposed in the vehicle, the point cloud processing device transmits the filtered three-dimensional point cloud to the detecting device, and the detecting device determines the obstacle information based on the filtered three-dimensional point cloud.
S405: and controlling automatic driving of the vehicle according to the obstacle information.
Similarly, the executing body of S405 may be a control device, and the control device may receive the obstacle information transmitted by the detection device and control the automatic driving of the vehicle according to the obstacle information.
For example, after the control device receives the obstacle information, the driving strategy of the vehicle may be adjusted or determined according to the obstacle information to control the automatic driving of the vehicle based on the driving strategy.
For example, as shown in fig. 5, if the obstacle information represents the related information (such as speed and position) of the obstacle vehicle B in front of the vehicle a, the control device may control the vehicle a to travel in a lane change manner, may control the vehicle a to travel at a reduced speed, and the like, which are not listed here.
Similarly, in this embodiment, since the filtered three-dimensional point cloud has higher reliability and accuracy, the irrelevant noise points having been filtered out, the obstacle information determined based on the filtered three-dimensional point cloud has higher accuracy and reliability; therefore, when the automatic driving of the vehicle is controlled based on the obstacle information, the safety of vehicle control can be improved.
Fig. 6 is a schematic diagram according to a fourth embodiment of the present disclosure. As shown in fig. 6, a point cloud processing apparatus 600 of the embodiment of the present disclosure includes:
an obtaining unit 601 is configured to obtain a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud.
The segmentation unit 602 is configured to perform semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, where the category label is used to represent a category of an obstacle corresponding to the pixel point.
And the filtering unit 603 is configured to filter the three-dimensional point cloud according to the category label to obtain a filtered three-dimensional point cloud.
Fig. 7 is a schematic diagram according to a fifth embodiment of the present disclosure, and as shown in fig. 7, a point cloud processing apparatus 700 of the embodiment of the present disclosure includes:
an obtaining unit 701 is configured to obtain a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud.
The segmentation unit 702 is configured to perform semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, where the category label is used to represent a category of an obstacle corresponding to the pixel point.
In some embodiments, the segmentation unit 702 is configured to perform semantic segmentation processing on the two-dimensional image based on a pre-trained semantic segmentation model to obtain a category label of a pixel point in the two-dimensional image, where the semantic segmentation model is generated based on a sample data set, and the sample data set includes a plurality of sample two-dimensional images.
And the filtering unit 703 is configured to filter the three-dimensional point cloud according to the category label to obtain a filtered three-dimensional point cloud.
As can be seen in fig. 7, in some embodiments, the filtering unit 703 includes:
the first converting subunit 7031 is configured to convert the three-dimensional point cloud into a two-dimensional point cloud with a first coordinate system as a reference where the two-dimensional image is located.
In some embodiments, the first converting subunit 7031 is configured to project the three-dimensional point cloud to a first coordinate system based on a projection matrix to obtain a two-dimensional point cloud, where the projection matrix is used to represent a coordinate conversion relationship between the first coordinate system and a second coordinate system, the second coordinate system being the coordinate system where the three-dimensional point cloud is located.
And the filtering subunit 7032 is configured to filter the two-dimensional point cloud according to the category label to obtain a filtered two-dimensional point cloud.
In some embodiments, filtering subunit 7032 includes:
and the determining module is used for determining the corresponding relation between the pixel point and each two-dimensional point in the two-dimensional point cloud, wherein the pixel point with the corresponding relation and the two-dimensional point are the same point on the same obstacle.
In some embodiments, the number of the pixel points is multiple, and each pixel point and each two-dimensional point have coordinates; the determining module is used for determining the pixel points and the two-dimensional points with the closest distance relationship according to the respective corresponding coordinates of the pixel points and the respective corresponding coordinates of the two-dimensional points, and determining the pixel points and the two-dimensional points with the closest distance relationship as the same points on the same obstacle.
And the removing module is used for removing the two-dimensional points in the two-dimensional point cloud according to the category labels and the corresponding relation to obtain the filtered two-dimensional point cloud.
In some embodiments, a removal module comprises:
and the determining submodule is used for determining the removal probability corresponding to each two-dimensional point according to the category label and the corresponding relation.
In some embodiments, the number of the pixel points is multiple; and the determining submodule is used for determining the removal probability of the two-dimensional point which has a corresponding relation with each pixel point according to the type label of each pixel point.
And the removing submodule is used for removing each two-dimensional point according to the removing probability to obtain the filtered two-dimensional point cloud.
In some embodiments, the two-dimensional points have coordinates; the filtering subunit further comprises:
and the conversion module is used for converting the coordinates of the two-dimensional points which are non-integer coordinates into integer coordinates if the coordinates of the two-dimensional points are the non-integer coordinates.
In other embodiments, the two-dimensional points have coordinates; the filtering subunit further comprises:
and the interpolation module is used for carrying out two-dimensional point interpolation processing on the two-dimensional point cloud according to the corresponding coordinates of each two-dimensional point.
A second converting subunit 7033, configured to convert the filtered two-dimensional point cloud into a filtered three-dimensional point cloud with the second coordinate system of the three-dimensional point cloud as a reference.
And the detection unit 704 is configured to perform target detection processing on the filtered three-dimensional point cloud to obtain obstacle information corresponding to the filtered three-dimensional point cloud.
A control unit 705 for controlling the automatic driving of the vehicle according to the obstacle information.
According to another aspect of the embodiments of the present disclosure, there is also provided a vehicle including the apparatus according to any one of the embodiments, such as the apparatus shown in fig. 6 or fig. 7.
In some embodiments, a radar and image capture device are provided on the vehicle, wherein,
and the radar is used for acquiring the three-dimensional point cloud.
And the image acquisition device is used for acquiring a two-dimensional image corresponding to the three-dimensional point cloud.
Fig. 8 is a schematic diagram according to a sixth embodiment of the present disclosure, and as shown in fig. 8, an electronic device 800 in the present disclosure may include: a processor 801 and a memory 802.
A memory 802 for storing programs; the memory 802 may include volatile memory (RAM), such as static random access memory (SRAM) or double data rate synchronous dynamic random access memory (DDR SDRAM); the memory may also include non-volatile memory, such as flash memory. The memory 802 is used to store computer programs (e.g., applications or functional modules that implement the methods described above), computer instructions, and the like, which may be stored in one or more memories 802 in a partitioned manner and can be called by the processor 801.
A processor 801 for executing the computer program stored in the memory 802 to implement the steps of the method according to the above embodiments.
Reference may be made in particular to the description relating to the preceding method embodiment.
The processor 801 and the memory 802 may be separate structures or may be integrated structures integrated together. When the processor 801 and the memory 802 are separate structures, the memory 802 and the processor 801 may be coupled by a bus 803.
The electronic device of this embodiment may execute the technical solution in the method, and the specific implementation process and technical principle are the same, which are not described herein again.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of the personal information of users involved (such as the position information of pedestrians) all comply with relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising: a computer program, stored in a readable storage medium, from which at least one processor of the electronic device can read the computer program, the at least one processor executing the computer program causing the electronic device to perform the solution provided by any of the embodiments described above.
FIG. 9 illustrates a schematic block diagram of an example electronic device 900 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the device 900 includes a computing unit 901, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 902 or a computer program loaded from a storage unit 908 into a random access memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
A number of components in the device 900 are connected to the I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, and the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, optical disk, or the like; and a communication unit 909 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 901 performs the respective methods and processes described above, such as the point cloud processing method. For example, in some embodiments, the point cloud processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the point cloud processing method described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the point cloud processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system and overcomes the defects of difficult management and weak service scalability in traditional physical host and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that the various flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, so long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; no limitation is imposed herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (29)

1. A method of point cloud processing, the method comprising:
acquiring a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud;
performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, wherein the category label is used for representing the category of an obstacle corresponding to the pixel point;
and filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
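By way of a non-limiting illustration of claim 1, the following Python sketch filters a three-dimensional point cloud against a per-pixel label map produced by semantic segmentation. The projection step, the label values, and the set of removable categories are assumptions for the example, not part of the claim.

    import numpy as np

    def filter_point_cloud(points_3d, label_image, projection_matrix, removable_labels):
        # points_3d: (N, 3) array in the 3-D (second) coordinate system.
        # label_image: (H, W) integer array of per-pixel category labels.
        # projection_matrix: (3, 4) matrix from the 3-D frame to pixel coordinates.
        # removable_labels: categories treated as noise (e.g. dust, exhaust);
        #   which categories are removable is an assumption, not claimed.
        homogeneous = np.hstack([points_3d, np.ones((len(points_3d), 1))])
        uvw = homogeneous @ projection_matrix.T
        uv = uvw[:, :2] / uvw[:, 2:3]                    # perspective divide
        u = np.rint(uv[:, 0]).astype(int)
        v = np.rint(uv[:, 1]).astype(int)

        h, w = label_image.shape
        in_view = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        keep = np.ones(len(points_3d), dtype=bool)
        idx = np.flatnonzero(in_view)
        labels = label_image[v[idx], u[idx]]             # label of each point's pixel
        keep[idx[np.isin(labels, list(removable_labels))]] = False
        return points_3d[keep]

A call such as filter_point_cloud(cloud, seg_labels, P, {7, 8}) would then return the filtered three-dimensional point cloud; points that project outside the image are kept unchanged in this sketch.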
2. The method of claim 1, wherein filtering the three-dimensional point cloud according to the category label to obtain a filtered three-dimensional point cloud comprises:
converting the three-dimensional point cloud into a two-dimensional point cloud with a first coordinate system where the two-dimensional image is located as a reference;
and filtering the two-dimensional point cloud according to the category label to obtain a filtered two-dimensional point cloud, and converting the filtered two-dimensional point cloud into a filtered three-dimensional point cloud with a second coordinate system where the three-dimensional point cloud is located as a reference.
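A minimal sketch of the round trip in claim 2: because the projection into the first coordinate system discards depth, the example retains the original index of each three-dimensional point, so converting the filtered two-dimensional point cloud back to the second coordinate system is an index lookup rather than an inverse projection. The keep_in_2d callback is an assumed interface, not from the patent.

    import numpy as np

    def filter_via_image_frame(points_3d, P, keep_in_2d):
        # keep_in_2d: assumed callable taking (M, 2) pixel coordinates of the
        # in-view points and returning a boolean keep-mask of length M.
        homogeneous = np.hstack([points_3d, np.ones((len(points_3d), 1))])
        uvw = homogeneous @ P.T
        in_front = uvw[:, 2] > 0                  # only points in front of the camera
        uv = uvw[in_front, :2] / uvw[in_front, 2:3]
        survivors_2d = keep_in_2d(uv)
        # "Back-conversion" to the second (3-D) coordinate system: each
        # surviving 2-D point indexes its originating 3-D point directly.
        return points_3d[np.flatnonzero(in_front)[survivors_2d]]

Keeping the index map sidesteps inverting the projection matrix, which is rank-deficient and has no unique inverse.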
3. The method of claim 2, wherein filtering the two-dimensional point cloud according to the category label to obtain a filtered two-dimensional point cloud comprises:
determining a corresponding relation between the pixel point and each two-dimensional point in the two-dimensional point cloud, wherein the pixel point and the two-dimensional point with the corresponding relation are the same point on the same obstacle;
and removing the two-dimensional points in the two-dimensional point cloud according to the category labels and the corresponding relation to obtain the filtered two-dimensional point cloud.
4. The method of claim 3, wherein removing each two-dimensional point in the two-dimensional point cloud according to the category label and the correspondence to obtain the filtered two-dimensional point cloud comprises:
determining the removal probability corresponding to each two-dimensional point according to the category label and the corresponding relation;
and removing each two-dimensional point according to the removal probability to obtain the filtered two-dimensional point cloud.
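Claims 4 and 5 assign each two-dimensional point a removal probability derived from its pixel's category label. A hedged sketch follows, with made-up categories and probabilities (the claims fix neither):

    import numpy as np

    # Assumed per-category removal probabilities; illustrative values only.
    REMOVAL_PROBABILITY = {
        0: 0.0,   # vehicle: always keep
        1: 0.0,   # pedestrian: always keep
        7: 0.9,   # water mist / exhaust: remove with high probability
        8: 1.0,   # rain or dust noise: always remove
    }

    def remove_by_probability(point_labels, seed=0):
        # point_labels: category label of the pixel corresponding to each
        # two-dimensional point. Returns a keep-mask (True = keep the point).
        rng = np.random.default_rng(seed)
        probs = np.array([REMOVAL_PROBABILITY.get(label, 0.0) for label in point_labels])
        return rng.random(len(point_labels)) >= probs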
5. The method of claim 4, wherein the number of the pixel points is plural; determining the removal probability corresponding to each two-dimensional point according to the category label and the corresponding relation, wherein the removal probability comprises the following steps:
and determining the removal probability of the two-dimensional point corresponding to each pixel point according to the category label of each pixel point.
6. The method of any of claims 3-5, wherein the two-dimensional points have coordinates; before determining the correspondence between the pixel points and the two-dimensional points in the two-dimensional point cloud, the method further comprises:
and if the coordinates of the two-dimensional points are non-integer coordinates, converting the coordinates of the two-dimensional points which are non-integer coordinates into integer coordinates.
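One plausible realisation of claim 6 rounds non-integer projected coordinates so they can index the label image; clipping to the image bounds is an extra safeguard the claim does not mention:

    import numpy as np

    def to_integer_coordinates(uv, height, width):
        # Rounding to the nearest integer is one assumed choice; flooring
        # would be an equally valid reading of the claim.
        uv_int = np.rint(uv).astype(np.int64)
        uv_int[:, 0] = np.clip(uv_int[:, 0], 0, width - 1)
        uv_int[:, 1] = np.clip(uv_int[:, 1], 0, height - 1)
        return uv_int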
7. The method of any of claims 3-5, wherein the two-dimensional points have coordinates; before determining the correspondence between the pixel points and the two-dimensional points in the two-dimensional point cloud, the method further comprises:
and carrying out two-dimensional point interpolation processing on the two-dimensional point cloud according to the corresponding coordinates of each two-dimensional point.
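Claim 7's interpolation can be read as densifying the sparse projected cloud over the pixel grid. The sketch below uses linear interpolation via scipy.interpolate.griddata, which is one assumed choice among several:

    import numpy as np
    from scipy.interpolate import griddata

    def densify_projected_cloud(uv, depth, step=2.0):
        # uv: (N, 2) projected coordinates; depth: (N,) values carried by the points.
        u_axis = np.arange(uv[:, 0].min(), uv[:, 0].max(), step)
        v_axis = np.arange(uv[:, 1].min(), uv[:, 1].max(), step)
        uu, vv = np.meshgrid(u_axis, v_axis)
        dense_depth = griddata(uv, depth, (uu, vv), method="linear")
        dense_uv = np.column_stack([uu.ravel(), vv.ravel()])
        valid = ~np.isnan(dense_depth.ravel())   # NaN outside the convex hull
        return dense_uv[valid], dense_depth.ravel()[valid]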
8. The method according to any one of claims 3-7, wherein the number of the pixel points is plural, each pixel point and each two-dimensional point having coordinates; determining a corresponding relationship between the pixel point and each two-dimensional point in the two-dimensional point cloud, including:
and determining the pixel points and the two-dimensional points with the closest distance relationship according to the respective coordinates of the pixel points and the respective coordinates of the two-dimensional points, and determining the pixel points and the two-dimensional points with the closest distance relationship as the same points on the same obstacle.
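A sketch of the nearest-distance matching in claim 8, using a k-d tree purely as a speed-up over brute-force distance computation (the data structure is an implementation choice, not claimed):

    import numpy as np
    from scipy.spatial import cKDTree

    def match_points_to_pixels(pixel_coords, point_coords):
        # pixel_coords: (P, 2) pixel centres; point_coords: (M, 2) projected
        # 2-D points. The nearest pixel-point pair is treated as the same
        # point on the same obstacle.
        tree = cKDTree(pixel_coords)
        distances, nearest_pixel = tree.query(point_coords)
        return nearest_pixel, distances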
9. The method of any of claims 2-8, wherein converting the three-dimensional point cloud to a two-dimensional point cloud referenced to a first coordinate system in which the two-dimensional image is located comprises:
and projecting the three-dimensional point cloud to the first coordinate system based on a projection matrix to obtain the two-dimensional point cloud, wherein the projection matrix is used for representing a coordinate conversion relation between the first coordinate system and a second coordinate system, and the second coordinate system is a coordinate system where the three-dimensional point cloud is located.
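The projection matrix of claim 9 is conventionally the product of camera intrinsics and the extrinsic transform between the two coordinate systems (a standard pinhole-camera factorisation; the numeric values below are placeholders, not calibration data):

    import numpy as np

    # Assumed pinhole intrinsics and lidar-to-camera extrinsics; real values
    # come from sensor calibration.
    K = np.array([[1000.0,    0.0, 640.0],
                  [   0.0, 1000.0, 360.0],
                  [   0.0,    0.0,   1.0]])
    R = np.eye(3)                           # rotation: second frame -> first frame
    t = np.array([[0.2], [0.0], [0.1]])     # translation between the two frames

    P = K @ np.hstack([R, t])               # the (3, 4) projection matrix

    point = np.array([2.0, 1.0, 10.0, 1.0])  # one homogeneous 3-D point
    u, v, w = P @ point
    print(u / w, v / w)                      # its pixel coordinates in the first frame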
10. The method according to any one of claims 1 to 9, wherein performing semantic segmentation processing on the two-dimensional image to obtain a class label of a pixel point in the two-dimensional image comprises:
performing semantic segmentation processing on the two-dimensional image based on a pre-trained semantic segmentation model to obtain a category label of a pixel point in the two-dimensional image, wherein the semantic segmentation model is generated based on a sample data set, and the sample data set comprises a plurality of sample two-dimensional images.
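Claim 10 leaves the segmentation architecture open; the sketch below stands in a public torchvision model as one plausible pre-trained semantic segmentation model:

    import torch
    from torchvision.models.segmentation import deeplabv3_resnet50

    # DeepLabV3 is only a plausible stand-in; the patent does not name the
    # architecture or the training data of its segmentation model.
    model = deeplabv3_resnet50(weights="DEFAULT").eval()

    image = torch.rand(1, 3, 360, 640)          # a dummy normalized RGB frame
    with torch.no_grad():
        logits = model(image)["out"]            # (1, num_classes, H, W)
    category_labels = logits.argmax(dim=1)[0]   # per-pixel category label map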
11. The method of any of claims 1-10, after filtering the three-dimensional point cloud according to the category label to obtain a filtered three-dimensional point cloud, further comprising:
and carrying out target detection processing on the filtered three-dimensional point cloud to obtain obstacle information corresponding to the filtered three-dimensional point cloud.
12. The method of claim 11, after performing target detection processing on the filtered three-dimensional point cloud to obtain obstacle information corresponding to the filtered three-dimensional point cloud, further comprising:
and controlling automatic driving of the vehicle according to the obstacle information.
13. A point cloud processing apparatus, the apparatus comprising:
the device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a three-dimensional point cloud and a two-dimensional image corresponding to the three-dimensional point cloud;
the segmentation unit is used for performing semantic segmentation processing on the two-dimensional image to obtain a category label of a pixel point in the two-dimensional image, wherein the category label is used for representing the category of an obstacle corresponding to the pixel point;
and the filtering unit is used for filtering the three-dimensional point cloud according to the category label to obtain the filtered three-dimensional point cloud.
14. The apparatus of claim 13, wherein the filter unit comprises:
the first conversion subunit is used for converting the three-dimensional point cloud into a two-dimensional point cloud with a first coordinate system of the two-dimensional image as a reference;
the filtering subunit is used for filtering the two-dimensional point cloud according to the category label to obtain a filtered two-dimensional point cloud;
and the second conversion subunit is used for converting the filtered two-dimensional point cloud into a filtered three-dimensional point cloud with a second coordinate system of the three-dimensional point cloud as a reference.
15. The apparatus of claim 14, wherein the filtering subunit comprises:
the determining module is used for determining the corresponding relation between the pixel point and each two-dimensional point in the two-dimensional point cloud, wherein the pixel point with the corresponding relation and the two-dimensional point are the same point on the same obstacle;
and the removing module is used for removing the two-dimensional points in the two-dimensional point cloud according to the category labels and the corresponding relation to obtain the filtered two-dimensional point cloud.
16. The apparatus of claim 15, wherein the removal module comprises:
the determining submodule is used for determining the removal probability corresponding to each two-dimensional point according to the category label and the corresponding relation;
and the removing submodule is used for removing each two-dimensional point according to the removing probability to obtain the filtered two-dimensional point cloud.
17. The apparatus of claim 16, wherein the number of pixel points is plural; and the determining submodule is used for determining the removal probability of the two-dimensional point which has a corresponding relation with each pixel point according to the category label of each pixel point.
18. The apparatus of any one of claims 15-17, wherein the two-dimensional points have coordinates; the filtering sub-unit further comprises:
and the conversion module is used for converting the coordinates of the two-dimensional points which are non-integer coordinates into integer coordinates if the coordinates of the two-dimensional points are the non-integer coordinates.
19. The apparatus of any one of claims 15-17, wherein the two-dimensional points have coordinates; the filtering sub-unit further comprises:
and the interpolation module is used for carrying out two-dimensional point interpolation processing on the two-dimensional point cloud according to the respective corresponding coordinates of each two-dimensional point.
20. The apparatus according to any one of claims 15-19, wherein the number of the pixel points is plural, each pixel point and each two-dimensional point having coordinates; the determining module is used for determining the pixel points and the two-dimensional points with the closest distance relationship according to the respective corresponding coordinates of the pixel points and the respective corresponding coordinates of the two-dimensional points, and determining the pixel points and the two-dimensional points with the closest distance relationship as the same points on the same obstacle.
21. The apparatus according to any one of claims 14-20, wherein the first transformation subunit is configured to project the three-dimensional point cloud to the first coordinate system based on a projection matrix to obtain the two-dimensional point cloud, wherein the projection matrix is configured to represent a coordinate transformation relationship between the first coordinate system and a second coordinate system, and the second coordinate system is a coordinate system in which the three-dimensional point cloud is located.
22. The apparatus according to any one of claims 13 to 21, wherein the segmentation unit is configured to perform semantic segmentation processing on the two-dimensional image based on a pre-trained semantic segmentation model to obtain a class label of a pixel point in the two-dimensional image, where the semantic segmentation model is generated based on a sample data set, and the sample data set includes a plurality of sample two-dimensional images.
23. The apparatus of any one of claims 13-22, the apparatus further comprising:
and the detection unit is used for carrying out target detection processing on the filtered three-dimensional point cloud to obtain obstacle information corresponding to the filtered three-dimensional point cloud.
24. The apparatus of claim 23, the apparatus further comprising:
and the control unit is used for controlling the automatic driving of the vehicle according to the obstacle information.
25. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-12.
26. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-12.
27. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method of any one of claims 1 to 12.
28. A vehicle, comprising: the apparatus of any one of claims 13-24.
29. The vehicle of claim 28, wherein the vehicle further comprises:
the radar is used for acquiring a three-dimensional point cloud;
and the image acquisition device is used for acquiring a two-dimensional image corresponding to the three-dimensional point cloud.
CN202210110210.0A 2022-01-29 2022-01-29 Point cloud processing method and device and vehicle Pending CN114445802A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210110210.0A CN114445802A (en) 2022-01-29 2022-01-29 Point cloud processing method and device and vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210110210.0A CN114445802A (en) 2022-01-29 2022-01-29 Point cloud processing method and device and vehicle

Publications (1)

Publication Number Publication Date
CN114445802A true CN114445802A (en) 2022-05-06

Family

ID=81372253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210110210.0A Pending CN114445802A (en) 2022-01-29 2022-01-29 Point cloud processing method and device and vehicle

Country Status (1)

Country Link
CN (1) CN114445802A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115330966A (en) * 2022-08-15 2022-11-11 北京城市网邻信息技术有限公司 Method, system, device and storage medium for generating house type graph

Similar Documents

Publication Publication Date Title
CN112966599B (en) Training method of key point recognition model, key point recognition method and device
CN113538963A (en) Method, apparatus, device and storage medium for outputting information
CN114445802A (en) Point cloud processing method and device and vehicle
CN114549961B (en) Target object detection method, device, equipment and storage medium
CN113989300A (en) Lane line segmentation method and device, electronic equipment and storage medium
CN113920273B (en) Image processing method, device, electronic equipment and storage medium
CN116469073A (en) Target identification method, device, electronic equipment, medium and automatic driving vehicle
CN116152702A (en) Point cloud label acquisition method and device, electronic equipment and automatic driving vehicle
CN114708498A (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN114005098A (en) Method and device for detecting lane line information of high-precision map and electronic equipment
CN114429631A (en) Three-dimensional object detection method, device, equipment and storage medium
CN116363615B (en) Data fusion method, device, vehicle and storage medium
CN115049895B (en) Image attribute identification method, attribute identification model training method and device
CN117315406B (en) Sample image processing method, device and equipment
CN117647852B (en) Weather state detection method and device, electronic equipment and storage medium
CN114882461B (en) Equipment environment recognition method and device, electronic equipment and automatic driving vehicle
CN117710459A (en) Method, device and computer program product for determining three-dimensional information
CN114359513A (en) Method and device for determining position of obstacle and electronic equipment
CN116052113A (en) Point cloud data processing method and device, intelligent chip, storage medium and vehicle
CN117710937A (en) Method and device for identifying short obstacle, electronic equipment and storage medium
KR20230006628A (en) method and device for processing image, electronic equipment, storage medium and computer program
CN117853614A (en) Method and device for detecting change condition of high-precision map element and vehicle
CN114004864A (en) Object tracking method, related device and computer program product
CN113989762A (en) Method, apparatus, device, medium and product for identifying lane lines
CN116597213A (en) Target detection method, training device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination