CN114556419A - Three-dimensional point cloud segmentation method and device and movable platform


Info

Publication number
CN114556419A
Authority
CN
China
Prior art keywords
point cloud
dimensional point
motion
movable platform
hypothesis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080070567.XA
Other languages
Chinese (zh)
Inventor
李星河
葛宏斌
邱凡
Current Assignee
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd filed Critical SZ DJI Technology Co Ltd
Publication of CN114556419A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 7/136 Segmentation; Edge detection involving thresholding
    • G06T 7/20 Analysis of motion
    • G06T 7/215 Motion-based segmentation
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds

Abstract

The embodiments of the present disclosure provide a three-dimensional point cloud segmentation method and apparatus, and a movable platform, for performing point cloud segmentation on a three-dimensional point cloud collected by the movable platform. The method includes: for each motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point cloud collected by the movable platform into a preset coordinate system and acquiring the projection density corresponding to that motion hypothesis; determining a matching motion hypothesis from the plurality of motion hypotheses based on their corresponding projection densities; and performing point cloud segmentation on a first three-dimensional point cloud among the plurality of frames based on the matching motion hypothesis.

Description

Three-dimensional point cloud segmentation method and device and movable platform
Technical Field
The disclosure relates to the technical field of computer vision, in particular to a three-dimensional point cloud segmentation method and device and a movable platform.
Background
During travel of the movable platform, a path planning (planning) module on the platform can perform decision planning for its driving state (for example, pose and speed). For the planning module to complete this decision planning, a point cloud acquisition device on the movable platform must collect a three-dimensional point cloud of the surrounding environment and perform point cloud segmentation to distinguish the ground and obstacles in the three-dimensional point cloud, and further distinguish dynamic objects from static objects among the obstacles. Point cloud segmentation is therefore an important link in decision planning of the driving state of the movable platform.
In a conventional point cloud segmentation method, point clouds are generally recognized to determine the categories to which they belong; the point clouds that may move are then determined from those categories and tracked, so that moving point clouds are distinguished from static ones. However, this approach performs point cloud segmentation with low reliability.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a three-dimensional point cloud segmentation method and apparatus, and a movable platform, so as to reliably perform point cloud segmentation on three-dimensional point clouds of various objects.
According to a first aspect of the embodiments of the present disclosure, there is provided a three-dimensional point cloud segmentation method for performing point cloud segmentation on a three-dimensional point cloud acquired by a movable platform, the method including: for each motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point cloud collected by the movable platform into a preset coordinate system and acquiring the projection density corresponding to that motion hypothesis; determining a matching motion hypothesis from the plurality of motion hypotheses based on their corresponding projection densities; and performing point cloud segmentation on a first three-dimensional point cloud among the plurality of frames based on the matching motion hypothesis.
According to a second aspect of the embodiments of the present disclosure, there is provided a three-dimensional point cloud segmentation apparatus including a processor, configured to perform point cloud segmentation on a three-dimensional point cloud acquired by a movable platform, the processor being configured to perform the following steps: for each motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point cloud collected by the movable platform into a preset coordinate system and acquiring the projection density corresponding to that motion hypothesis; determining a matching motion hypothesis from the plurality of motion hypotheses based on their corresponding projection densities; and performing point cloud segmentation on a first three-dimensional point cloud among the plurality of frames based on the matching motion hypothesis.
According to a third aspect of embodiments of the present disclosure, there is provided a movable platform comprising: a housing; the point cloud acquisition device is arranged on the shell and used for acquiring three-dimensional point cloud; and a three-dimensional point cloud segmentation device arranged in the shell and used for executing the method of any embodiment of the disclosure.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method according to any of the embodiments of the present disclosure.
By applying the scheme of the disclosed embodiments, motion hypotheses are established for the three-dimensional point cloud, and the collected multi-frame three-dimensional point cloud is projected into a preset coordinate system based on each motion hypothesis; that is, the motion process of the three-dimensional point cloud is simulated under each motion hypothesis. Whether an established motion hypothesis matches the real motion of the three-dimensional point cloud is then judged from the projection density corresponding to that hypothesis, the matching motion hypothesis is determined accordingly, and point cloud segmentation is performed based on it. Because this method neither needs to recognize the category to which the three-dimensional point cloud belongs nor needs to be driven by training data, it can perform point cloud segmentation on three-dimensional point clouds of any form, which improves the reliability of point cloud segmentation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of a point cloud segmentation process of some embodiments.
Fig. 2 is a schematic diagram of a decision planning process during travel of a movable platform according to some embodiments.
Fig. 3 is a flowchart of a point cloud segmentation method according to an embodiment of the present disclosure.
Fig. 4 is a schematic diagram of a motion hypothesis of an embodiment of the disclosure.
Fig. 5A is a schematic diagram of a grid weight graph of an embodiment of the disclosure.
Fig. 5B is a schematic diagram of a mask diagram of an embodiment of the present disclosure.
Fig. 6 is an overall flow diagram of a point cloud segmentation process of an embodiment of the present disclosure.
Fig. 7 is a schematic diagram of a point cloud segmentation apparatus of an embodiment of the present disclosure.
Fig. 8 is a schematic view of a movable platform of an embodiment of the disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.
In the driving process of the movable platform, decision planning can be carried out on the driving state of the movable platform through a path planning (planning) module on the movable platform. The point cloud segmentation is an important link for decision planning of the driving state of the movable platform. Fig. 1 is a schematic diagram of a point cloud segmentation process according to some embodiments. In step 101, a three-dimensional point cloud may be collected by a point cloud collection device on a movable platform, and then, in step 102, for the movable platform (e.g., an unmanned vehicle) traveling on the ground, ground segmentation may be performed on the collected three-dimensional point cloud, i.e., three-dimensional points in the three-dimensional point cloud are segmented into ground points and non-ground points. For other types of movable platforms (e.g., movable robots), the acquired three-dimensional point cloud may be segmented to segment three-dimensional points in the three-dimensional point cloud into points on the movable platform travel surface and points not on the movable platform travel surface. For convenience of description, the following description will be made with a travel surface as a ground surface. In step 103, if a three-dimensional point is a ground point, step 104 is executed to add a ground point label to the three-dimensional point, otherwise step 105 is executed to perform dynamic and static segmentation on the three-dimensional point, that is, the three-dimensional point is segmented into a static point which is static and a dynamic point which moves. In step 106, if a three-dimensional point is a static point, step 107 is performed, a static point tag is added to the three-dimensional point, otherwise step 108 is performed, a dynamic point tag is added to the three-dimensional point, and the three-dimensional point cloud with the tag is output to a downstream module in step 109. 
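The labeling flow of steps 101 through 109 can be sketched as a simple function; the `is_ground` and `is_static` predicates are hypothetical stand-ins for the ground-segmentation and dynamic/static-segmentation stages described above, not the patent's actual implementation:

```python
def label_point_cloud(points, is_ground, is_static):
    """Label each 3-D point following steps 101-109: ground points get a
    'ground' label; non-ground points are further split into 'static' and
    'dynamic'. The two predicates are hypothetical placeholders for the
    ground-segmentation and dynamic/static-segmentation stages."""
    labeled = []
    for p in points:
        if is_ground(p):
            labeled.append((p, "ground"))      # steps 103-104
        elif is_static(p):
            labeled.append((p, "static"))      # steps 105-107
        else:
            labeled.append((p, "dynamic"))     # step 108
    return labeled                             # step 109: output to downstream
```

A downstream planning module would consume the `(point, label)` pairs in place of the labeled point cloud of step 109.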
All or part of the three-dimensional points in the three-dimensional point cloud can be labeled. The label may include at least one of a first label for characterizing whether the three-dimensional point is a ground point and a second label for characterizing whether the three-dimensional point is a static point, and may further include a label for characterizing other information of the three-dimensional point.
The downstream module may be a planning module on the movable platform, such as an Electronic Control Unit (ECU) or a Central Processing Unit (CPU). After receiving the labeled three-dimensional point cloud, the planning module can perform decision planning for the driving state of the movable platform based on the labels of the three-dimensional points. The driving state may include at least one of the pose and the speed of the movable platform. Fig. 2 is a schematic diagram of a decision planning process according to some embodiments. In step 201 and step 202, the planning module may receive the three-dimensional point cloud and read the tags carried in it. In step 203, it may be determined from the tag whether a three-dimensional point in the point cloud is a point on the movable platform's travel surface (e.g., the ground). Taking ground points as an example: if yes, step 204 is executed to identify the three-dimensional points belonging to the lane line among the ground points and determine the attitude of the movable platform according to the direction of the lane line, so that the movable platform travels along that direction. If it is a non-ground point, step 205 is executed to determine whether the non-ground point is a static point. If so, step 206 is performed to determine the pose of the movable platform from the position of the static point. For example, whether the static point lies on a pre-planned driving path is judged, and if so, the path is re-planned so that the movable platform avoids colliding with the static point. If the non-ground point is a dynamic point, step 207 is performed to determine at least one of the attitude and the velocity of the movable platform based on the orientation and the velocity of the dynamic point.
For example, if the dynamic point is on a driving path planned in advance by the movable platform and the moving speed of the dynamic point is less than or equal to the moving speed of the movable platform, the movable platform is controlled to decelerate, or the attitude of the movable platform is adjusted, so that the movable platform bypasses the dynamic point. Also for example, the movable platform may be controlled to travel at the same speed as the dynamic point.
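A minimal sketch of the decision logic in steps 203 through 207; the action names and the simplified interface are illustrative assumptions (a real planning module would consider far more state than this):

```python
def plan_action(label, on_path, obstacle_speed=None, ego_speed=None):
    """Decision sketch following steps 203-207. Ground points steer along
    the lane; a static point on the planned path triggers replanning; a
    slower dynamic point on the path triggers deceleration or a detour.
    All return values are hypothetical action names."""
    if label == "ground":
        return "follow_lane"                   # step 204
    if label == "static":
        return "replan_path" if on_path else "keep_course"   # step 206
    # dynamic point, step 207
    if on_path and obstacle_speed is not None and obstacle_speed <= ego_speed:
        return "decelerate_or_detour"
    return "keep_course"
```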
Therefore, point cloud segmentation is an important link in decision planning of the driving state of the movable platform, and accurate point cloud segmentation is conducive to accurate decision planning. In a conventional point cloud segmentation method, point clouds are generally detected and recognized to determine the categories to which they belong; the point clouds that are likely to move are then determined from those categories and tracked, so that moving point clouds are distinguished from static ones. This is called the detection-based point cloud segmentation approach.
Currently, there are two main methods for detection-based point cloud segmentation. One method performs target detection in image space: the detection result is expressed as a two-dimensional bounding box, and the three-dimensional point cloud is projected onto the image to judge whether it falls inside that two-dimensional bounding box. The other performs target detection in point cloud space: the detection result is expressed as a three-dimensional bounding box, and whether the three-dimensional point cloud lies inside the detected three-dimensional bounding box is judged directly in three-dimensional space. However, both approaches are data-driven: a detection model must be trained on a training set, and target detection is performed through that model. When objects outside the training set are encountered, such as special-shaped vehicles, the detection model often fails, which affects the reliability of point cloud segmentation. In addition, the image-based detection mode additionally depends on images, so it is unsuitable for a lidar-only intelligent sensor; and once the camera and the lidar are mounted far apart, occluded objects produce foreground-background deviation in the projection, a problem that is especially obvious at close range. The accuracy of the point-cloud-space-based detection mode is generally lower than that of the image-based mode, especially in distant regions where the point cloud is sparse. In summary, conventional point cloud segmentation methods have low reliability.
Based on this, the present disclosure provides a three-dimensional point cloud segmentation method for performing point cloud segmentation on a three-dimensional point cloud acquired by a movable platform, as shown in fig. 3, the method includes:
step 301: based on a motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point clouds collected by the movable platform to a preset coordinate system, and acquiring a projection density corresponding to the motion hypothesis;
step 302: determining a matching motion hypothesis from the plurality of motion hypotheses based on projection densities corresponding to the plurality of motion hypotheses;
step 303: and performing point cloud segmentation on a first three-dimensional point cloud in the plurality of frames of three-dimensional point clouds based on the matching motion hypothesis.
The present disclosure utilizes multiple hypothesis tracking (MHT), which establishes a tree of potential tracking hypotheses for each candidate target and then computes the probability of each track to select the most likely combination of tracks. The method relies only on the three-dimensional point cloud for point cloud segmentation and does not need to detect the category of the point cloud, so it neither depends on image target detection nor needs to align with the origin of an image coordinate system to reduce occlusion deviation. Because it does not rely on a data-driven method, there is no risk of missing special-shaped objects outside a data training set. And because its requirement on point cloud density is low, relatively accurate point cloud segmentation can be achieved even in distant, sparse point cloud regions.
The method can process each three-dimensional point in the three-dimensional point cloud collected by the point cloud collection device to carry out point cloud segmentation on each three-dimensional point, and can also carry out pre-segmentation (also called ground segmentation) on the collected three-dimensional point cloud to determine the three-dimensional points on the traveling road surface of the movable platform and the three-dimensional points outside the traveling road surface of the movable platform, and then carry out point cloud segmentation on the three-dimensional points outside the traveling road surface of the movable platform. Wherein the driving road surface can be the ground on which a vehicle drives or the glass plane on which a mobile robot drives. The pre-segmentation can be realized by using RANSAC ground model fitting and the like, which is not limited by the present disclosure. For the latter case, after the pre-segmentation, a first label may be added to each three-dimensional point in the three-dimensional point clouds to indicate three-dimensional points in each frame of the three-dimensional point clouds, which are outside the traveling road surface of the movable platform, and then only the three-dimensional points in each frame of the three-dimensional point clouds, which carry the first label, are projected to a preset coordinate system. For three-dimensional points that do not carry the first label, no processing may be performed.
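The pre-segmentation step can be sketched as follows, under the assumption that the travel-surface plane n·p + d = 0 has already been fitted (for example by RANSAC, which is not shown here); points farther than a threshold from the plane receive the first label, and only those points are passed on for projection. Names and the distance threshold are illustrative:

```python
import math

def pre_segment(points, plane_normal, plane_d, dist_thresh=0.2):
    """Pre-segmentation sketch: mark points farther than `dist_thresh`
    from the fitted travel-surface plane (n . p + d = 0) with the first
    label, so that only off-surface points are projected later. The plane
    parameters would come from e.g. a RANSAC ground fit (not shown)."""
    nx, ny, nz = plane_normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    labeled = []
    for (x, y, z) in points:
        dist = abs(nx * x + ny * y + nz * z + plane_d) / norm
        labeled.append(((x, y, z), dist > dist_thresh))   # True = first label
    # only points carrying the first label are projected downstream
    return [p for p, off_surface in labeled if off_surface]
```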
In some embodiments, because the point cloud collection device collects three-dimensional point clouds at a high frequency, the motion of three-dimensional points between two adjacent frames is not significant enough, so the multiple frames of original three-dimensional point cloud collected by the movable platform can be frequency-divided to obtain the frames that actually need point cloud segmentation. On one hand, this improves the motion significance of the three-dimensional points in each segmented frame; on the other hand, it reduces computational consumption and saves system resources. For example, frequency division by two may be employed: the timestamp of each frame of three-dimensional point cloud is taken modulo 200 ms; if the result is 0, the frame is point cloud segmented, and if not, it is skipped. Since motion attributes cannot be determined from a single frame of point cloud, it is necessary to determine in a time series which point clouds are moving. Therefore, the frames that need point cloud segmentation are added to a point cloud queue, and a matching motion hypothesis is determined for the multiple frames in the queue so as to perform point cloud segmentation. Further, to improve the significance of the observed evidence, the point cloud segmentation process of the present disclosure may be executed only after the point cloud queue has accumulated for a certain period (e.g., 3 seconds); if it has not, accumulation continues.
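The frequency-division and queue-accumulation logic might look like the following sketch, using the 200 ms modulo test and 3-second accumulation window mentioned above (the class and method names are illustrative, not from the patent):

```python
from collections import deque

class FrameSelector:
    """Frequency-division sketch: keep only frames whose timestamp (ms) is
    a multiple of `period_ms`, and accumulate them in a queue until at
    least `window_s` seconds of data are present (200 ms / 3 s per the
    figures in the text)."""
    def __init__(self, period_ms=200, window_s=3.0):
        self.period_ms = period_ms
        self.window_s = window_s
        self.queue = deque()

    def offer(self, timestamp_ms, frame):
        if timestamp_ms % self.period_ms != 0:
            return False                 # frame skipped, not segmented
        self.queue.append((timestamp_ms, frame))
        # drop frames older than the accumulation window
        while self.queue[-1][0] - self.queue[0][0] > self.window_s * 1000:
            self.queue.popleft()
        return True

    def ready(self):
        """True once the queue spans the full accumulation window."""
        if len(self.queue) < 2:
            return False
        return self.queue[-1][0] - self.queue[0][0] >= self.window_s * 1000
```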
In step 301, a three-dimensional point cloud may be acquired by a point cloud acquisition device (e.g., a lidar, a vision sensor, etc.) on a movable platform. The movable platform can be an unmanned vehicle, an unmanned aerial vehicle, an unmanned ship, a movable robot and the like. The preset coordinate system may be a current vehicle body coordinate system of the movable platform, and the coordinate system takes a current position of the movable platform as a coordinate origin. Alternatively, the predetermined coordinate system may be a world coordinate system or another coordinate system selected in advance.
The motion hypothesis model is used to make assumptions about the motion velocity of the movable platform; one motion hypothesis model may include one or more motion hypotheses, and each motion hypothesis may correspond to one motion velocity vector, that is, different motion hypotheses may differ in speed and/or direction. Fig. 4 is a schematic diagram of a motion hypothesis model according to some embodiments. Each arrowed ray represents a motion hypothesis: the length of the ray represents the magnitude of the velocity, and the direction of the ray represents the direction of the velocity. For example, rays 401 to 404 in the first quadrant represent a velocity direction of 0° to 90° with a positive magnitude; rays 405 to 409 in the second quadrant represent a velocity direction of -90° to 0° with a positive magnitude; rays in the third quadrant (not shown) represent a velocity direction of -90° to 0° with a negative magnitude; ray 410 in the fourth quadrant represents a velocity direction of 0° to 90° with a negative magnitude. Ray 407 and ray 408 have the same direction but different lengths, so they represent two motion hypotheses with the same velocity direction but different speeds. Where the movable platform is a vehicle moving on flat ground, the vertical velocity is taken to be 0 in all hypotheses. Where the movable platform is a movable robot, an unmanned aerial vehicle, or the like, the vertical motion speed may be nonzero.
Those skilled in the art will understand that the motion hypothesis model in the above embodiments is merely an exemplary illustration, and in practical applications, the number of motion hypotheses included in the motion hypothesis model and the corresponding speed direction and size of each motion hypothesis may be determined according to practical needs (e.g., accuracy requirement of point cloud segmentation, system computation force, etc.), and the disclosure does not limit this. In some embodiments, at least one of the range of motion velocities and the range of motion directions for each motion hypothesis in the motion hypothesis model may be determined based on characteristics of the environment in which the movable platform is located. The environmental characteristics may include characteristics for characterizing the type of environment (e.g., urban environment, highway environment), for characterizing the ambient lighting conditions (e.g., day, night), and/or for characterizing the ambient climate (e.g., sunny day, fog, snow storms). The environmental characteristics may be determined based on road semantic information collected by the movable platform, location information of the movable platform, information received by the movable platform, and the like. In case the environment characteristics of the environment in which the movable platform is located are different, the velocity and/or direction of the motion hypotheses comprised in the motion hypothesis model employed may also be different. For example, in harsh weather environments such as fog, snow storms, etc., the assumed speed of motion is generally small. Also for example, in a highway environment, the range of speeds is generally small. In other embodiments, the range of speeds and directions of the motion hypothesis may also be determined according to the type of movable platform (e.g., vehicle, drone, movable robot, etc.).
In some embodiments, the motion speed of each motion hypothesis in the motion hypothesis model along the travel direction of the movable platform is in the range [-40 m/s, 40 m/s], and the motion speed perpendicular to the travel direction is in the range [-10 m/s, 10 m/s]; in other embodiments, the motion direction of each motion hypothesis is within [-90°, 90°). These ranges cover most motion scenes when the movable platform is a vehicle; of course, different motion hypotheses can be adopted in different scenes.
It should be noted that, in practical applications, the driving process of the movable platform may be divided into several segments according to time, and when the time corresponding to each segment is short enough (for example, less than or equal to 3 seconds), the moving process of the movable platform in each time segment may be regarded as a uniform linear motion. Under the uniform linear motion model, the lateral velocity and the longitudinal velocity of the movable platform are sampled at certain time intervals, so as to obtain the motion hypothesis model shown in fig. 4. In other cases, the motion process of the movable platform can also be simulated by adopting other motion hypothesis models such as a uniform acceleration motion model and a uniform deceleration motion model.
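Sampling the lateral and longitudinal velocities of a uniform linear motion model at fixed intervals yields a hypothesis grid like the one below. The speed ranges follow the values given earlier; the step sizes are illustrative assumptions, since the text does not specify a sampling interval:

```python
def build_motion_hypotheses(v_long_max=40.0, v_lat_max=10.0,
                            long_step=5.0, lat_step=2.5):
    """Sample a constant-velocity motion-hypothesis grid: longitudinal
    speed in [-40, 40] m/s, lateral speed in [-10, 10] m/s (per the ranges
    in the text). Step sizes are illustrative. Vertical speed is fixed to
    0, as for a ground vehicle."""
    hypotheses = []
    v_long = -v_long_max
    while v_long <= v_long_max:
        v_lat = -v_lat_max
        while v_lat <= v_lat_max:
            hypotheses.append((v_long, v_lat, 0.0))
            v_lat += lat_step
        v_long += long_step
    return hypotheses
```

With the default steps this gives 17 longitudinal times 9 lateral samples, i.e. 153 hypotheses, including the stationary hypothesis (0, 0, 0).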
After the motion hypothesis model is established, based on a motion hypothesis in the motion hypothesis model, the multiple frames of three-dimensional point clouds collected by the movable platform are projected to a preset coordinate system, and a projection density corresponding to the motion hypothesis is obtained. Further, the multi-frame three-dimensional point cloud can be projected to a preset coordinate system based on the motion hypothesis and the pose of the movable platform when the movable platform collects each frame of three-dimensional point cloud in the multi-frame three-dimensional point cloud:
$$P_{i-n,k} = \mathrm{odom}_i^{-1}\,\mathrm{odom}_{i-n}\,P_{i-n} + n\,\Delta t\,v_k$$

wherein $P_{i-n,k}$ represents the position of the projected point of the $(i-n)$-th frame of three-dimensional point cloud under the $k$-th motion hypothesis; $P_{i-n}$ is the position of a three-dimensional point in the $(i-n)$-th frame of three-dimensional point cloud; the preset coordinate system is the vehicle-body coordinate system at the time the $i$-th frame of three-dimensional point cloud is collected; $\mathrm{odom}_{i-n}$ and $\mathrm{odom}_i$ respectively represent the pose of the movable platform when the $(i-n)$-th and the $i$-th frames of three-dimensional point cloud are collected; $\Delta t$ represents the time interval between adjacent frames, so that $n\,\Delta t$ is the time elapsed between collecting the $(i-n)$-th frame and collecting the $i$-th frame; $v_k$ represents the velocity corresponding to the $k$-th motion hypothesis; and the superscript $-1$ represents the matrix inversion operation. The above process first transforms $P_{i-n}$ into the world coordinate system through $\mathrm{odom}_{i-n}$, then converts the world-coordinate points into the preset coordinate system through $\mathrm{odom}_i^{-1}$, and finally compensates the self-motion of the movable platform through $n\,\Delta t\,v_k$, thereby obtaining the position of the projected point. When the movable platform is a vehicle, a message synchronization mechanism commonly used in vehicle-mounted systems can be used to synchronously receive three-dimensional point clouds and odometry data (i.e., pose data of the vehicle) with the same timestamp, which provides position and pose references for projection among multiple frames of three-dimensional point clouds and compensates the deviation caused by the vehicle's own motion. The above transformation can be performed separately for all three-dimensional point cloud frames and all motion hypotheses, thereby completing the injection of the motion hypotheses.
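The transformation just described can be sketched as follows, under the assumption of 4×4 homogeneous pose matrices; the function and argument names are hypothetical, introduced only for illustration.

```python
import numpy as np

def project_under_hypothesis(points, odom_hist, odom_cur, n, dt, v_h):
    """Project one historical frame (frame i-n) into the body frame at
    frame i under motion hypothesis v_h.
    points: (N, 3) array in the historical body frame.
    odom_hist, odom_cur: 4x4 homogeneous poses odom_{i-n} and odom_i.
    Illustrative sketch of the compensation described above."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    # odom_i^{-1} @ odom_{i-n} maps the historical body frame into the
    # current body frame via the world frame.
    moved = (np.linalg.inv(odom_cur) @ odom_hist @ homo.T).T[:, :3]
    # Advance each point by the hypothesized object velocity over n*dt.
    return moved + n * dt * np.asarray(v_h)
```

For a static scene and identical poses, only the hypothesized object velocity shifts the points.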
Ideally, if a motion hypothesis is identical to the true motion of the movable platform, all of the projected points coincide at the same location. In practice, because the motion of the movable platform is not perfectly uniform linear motion, and because of noise and differences between the motion hypothesis and the actual motion of the movable platform, there may be some deviation among the projected points; but as long as the motion hypothesis is close enough to the actual motion, the deviation among the projected points should be small, that is, the projected points lie close together. It is therefore possible to determine from the projection density whether a motion hypothesis matches the motion of the movable platform. Thus, in step 302, a matching motion hypothesis may be determined from the plurality of motion hypotheses based on the projection densities corresponding to the plurality of motion hypotheses.
Specifically, the motion hypothesis with the largest projection density may be determined as the matching motion hypothesis. Further, it may be judged whether the difference between the largest projection density and every other projection density is greater than a preset value; if so, the motion hypothesis corresponding to the largest projection density is determined as the matching motion hypothesis. Further still, it may be judged whether the largest projection density is greater than a preset projection density threshold; if the largest projection density is greater than the threshold and its difference from every other projection density is greater than the preset value, the motion hypothesis corresponding to the largest projection density is determined as the matching motion hypothesis. In this way, the significance of the selected motion hypothesis can be improved, and therefore the accuracy and reliability of point cloud segmentation are improved. If the difference between the largest projection density and at least one other projection density is not greater than the preset value, or the largest projection density is not greater than the preset projection density threshold, it is determined that no matching motion hypothesis exists.
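The selection rule above can be sketched as follows; the function name and the margin/threshold values are hypothetical examples of the "preset value" and "preset projection density threshold" mentioned in the text.

```python
def pick_matching_hypothesis(densities, margin=2, min_density=3):
    """Return the index of the matching motion hypothesis, or None.
    densities: per-hypothesis projection density for one grid cell.
    margin and min_density are illustrative preset values."""
    best = max(range(len(densities)), key=lambda k: densities[k])
    best_d = densities[best]
    if best_d <= min_density:
        return None  # largest density not significant enough
    others = [d for k, d in enumerate(densities) if k != best]
    if any(best_d - d <= margin for d in others):
        return None  # not clearly separated from every other hypothesis
    return best
```

Requiring a clear margin over every competitor is what gives the chosen hypothesis its significance.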
The following describes how the projection density is acquired, taking motion hypothesis A as an example; the projection densities corresponding to the other motion hypotheses can be acquired in the same manner. The preset coordinate system may be divided in advance into a plurality of grids, and the areas and/or shapes of the grids may be the same or different. For example, for convenience of processing, the preset coordinate system may be divided in advance into a plurality of rectangular grids of the same size. Then, the projection density of motion hypothesis A within each grid is acquired. As shown in fig. 5A, each square represents a grid, each number in a square represents the number of frames of three-dimensional point cloud that have projected points in that grid, and the graph shown in fig. 5A is referred to as a grid weight graph. The projection density may be determined based on the ratio of the number of frames of three-dimensional point cloud that have projected points within the grid to the area of the grid. Since all the grids have the same area, the matching motion hypothesis can be determined directly from the number of frames of three-dimensional point cloud that have projected points within the grid. It should be noted that one frame of three-dimensional point cloud may have multiple projected points in a grid; as long as a frame has at least one projected point in the grid, the frame count for that grid is increased by 1, regardless of how many of its points fall there.
For example, the number of projection points of the three-dimensional point cloud 1, the three-dimensional point cloud 2, and the three-dimensional point cloud 3 in the grid 1 is 1, 3, and 0, respectively, and then the number of frames of the three-dimensional point cloud with the projection points in the grid 1 is recorded as 2. By counting the number of frames instead of the number of points of projection points falling into the grid, errors caused by uneven distribution of the number of three-dimensional points in different scanned regions can be reduced.
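The frame-counting rule in this example can be sketched as follows; the function name and the grid cell size are assumptions made for illustration.

```python
from collections import defaultdict

def grid_weight_map(projected_frames, cell=0.5):
    """Count, per grid cell, how many FRAMES have at least one
    projected point in that cell (not how many points).
    projected_frames: list of (x, y) point lists, one per frame.
    Cell size is an illustrative assumption."""
    weights = defaultdict(int)
    for pts in projected_frames:
        # Deduplicate within the frame: one frame adds at most 1 per cell.
        cells = {(int(x // cell), int(y // cell)) for x, y in pts}
        for c in cells:
            weights[c] += 1
    return weights
```

Reproducing the example above: frame 1 places two points in grid 1 and frame 2 places one, but the frame count for that grid is 2, not 3.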
A grid weight map may be generated for each motion hypothesis; assuming there are H motion hypotheses, H grid weight maps are generated. For each motion hypothesis h, all three-dimensional point cloud frames in the queue are converted to the current frame according to hypothesis h, and the number of historical frames falling into each grid is counted. If hypothesis h is close to the real motion pattern, the historical point clouds injected under it overlap with high probability, and the corresponding grid weight is high; conversely, if hypothesis h differs greatly from the real motion pattern, the historical point clouds injected under it overlap with low probability, and the corresponding weight is low.
Since three-dimensional points in a neighboring area generally share similar motion patterns, if a matching motion hypothesis exists for a grid, that hypothesis can be taken as the matching motion hypothesis of every three-dimensional point projected into the grid.
Then, a mask (mask) map can be generated based on the matching motion hypothesis, the mask map and the grid weight map have the same size, and each grid in the mask map comprises a grid parameter for recording the matching motion hypothesis of the grid. As shown in fig. 5B, h1 to h4 in the diagram respectively represent the matching motion hypotheses of the corresponding grids, and null represents that there is no matching motion hypothesis for the grids.
In step 303, point cloud segmentation may be performed on a first three-dimensional point cloud of the plurality of frames of three-dimensional point clouds, that is, it is determined whether a three-dimensional point in the first three-dimensional point cloud is a dynamic point or a static point. The dynamic point represents a three-dimensional point whose movement speed is not 0, and the static point represents a three-dimensional point whose movement speed is 0. If the speed of the matching motion hypothesis in one grid is 0, dividing the three-dimensional points projected to the grid in the first three-dimensional point cloud into static points. If the speed of the matching motion hypothesis in one grid is not 0, dividing the three-dimensional points projected to the grid in the first three-dimensional point cloud into dynamic points. The first three-dimensional point cloud may comprise a portion or all of the plurality of frames of three-dimensional point clouds. If there is no matching motion hypothesis in a grid, the three-dimensional points projected into the grid in the first three-dimensional point cloud may be segmented into three-dimensional points with unknown attributes.
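The per-grid segmentation rule just described can be sketched as follows; the function name, cell size, and zero-speed tolerance are illustrative assumptions.

```python
def label_points(points_2d, mask, cell=0.5, eps=1e-3):
    """Label each point of the first three-dimensional point cloud by
    looking up the matching motion hypothesis of its grid cell.
    mask: dict mapping a grid cell to a hypothesis velocity (vx, vy);
    cells absent from the mask have no matching hypothesis.
    Illustrative sketch only."""
    labels = []
    for x, y in points_2d:
        v = mask.get((int(x // cell), int(y // cell)))
        if v is None:
            labels.append('unknown')          # no matching hypothesis
        elif abs(v[0]) < eps and abs(v[1]) < eps:
            labels.append('static')           # matching speed is 0
        else:
            labels.append('dynamic')          # matching speed is not 0
    return labels
```

Points falling into cells whose matching hypothesis has zero velocity become static points; nonzero velocity yields dynamic points; cells without a match yield unknown-attribute points.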
Based on the point cloud segmentation result, a label can be attached to each three-dimensional point in the first three-dimensional point cloud. The label may include at least one of a number, a letter, and a symbol. Taking numeric labels as an example, a dynamic point may be represented by 1, a static point by 0, and a three-dimensional point with unknown attribute by 01.
In practical application, the point cloud segmentation result can be used for a planning unit on the movable platform to plan the driving state of the movable platform. For example, the planning unit may determine the movement speed of the obstacle on the travel path based on the tag obtained from the point cloud segmentation result, thereby deciding whether the speed and attitude of the movable platform need to be controlled to avoid the obstacle. The point cloud segmentation result can also be output to a multimedia system on the movable platform, such as a display screen, a voice playing system, and the like, and used for outputting multimedia prompt information to a user.
Fig. 6 is a general flowchart of a point cloud segmentation method according to an embodiment of the present disclosure.
In step 601, a multi-motion hypothesis may be generated from a scene such as a highway, a city street, etc.
In step 602, a frame of three-dimensional point cloud and pose data of the movable platform at the time of acquiring the frame of three-dimensional point cloud may be synchronously received.
In step 603, points on the movable platform travel surface (e.g., ground) may be removed from the received three-dimensional point cloud.
In step 604, it may be determined whether to use the frame of three-dimensional point cloud according to the frequency-division setting: for example, if the timestamp of the frame modulo 200 ms equals 0, the frame is used and step 605 is executed; otherwise, the frame is not used and step 606 is executed.
In step 605, the three-dimensional point cloud may be added to a point cloud queue.
In step 606, it is determined whether the point cloud queue has reached a minimum computable length. If so, go to step 607, otherwise return to step 602.
In step 607, H motion hypotheses may be injected for the current queue.
In step 608, the hypotheses may be verified and a mask map generated. That is, the three-dimensional points of the current frame are projected to the preset coordinate system using each motion hypothesis, whether a matching motion hypothesis exists is judged for each grid in the preset coordinate system, and a mask map is generated based on the matching motion hypothesis of each grid.
In step 609, the three-dimensional point cloud may be traversed and the motion mask map may be queried, thereby generating a label for each three-dimensional point in the three-dimensional point cloud for indicating that each three-dimensional point is a dynamic point or a static point.
In step 610, the tagged three-dimensional point cloud may be output to downstream modules, such as a planning module and a multimedia system of the mobile platform.
In step 611, it is determined whether the program is finished. If the procedure is not finished, return to step 602 to continue the point cloud segmentation.
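The frequency-division test used in step 604 can be sketched as a one-line predicate; the function name is hypothetical and the 200 ms period is the example value used above.

```python
def use_frame(timestamp_ms, period_ms=200):
    """Frequency-division test of step 604: keep a frame only when its
    timestamp modulo the period equals 0 (200 ms is the example period)."""
    return timestamp_ms % period_ms == 0
```

Frames failing the test are simply skipped rather than queued, which thins the point cloud queue to the configured processing rate.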
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
The embodiment of the present disclosure further provides a point cloud segmentation apparatus, which includes a processor, where the processor is configured to execute the following steps:
based on a motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point clouds collected by the movable platform to a preset coordinate system, and acquiring a projection density corresponding to the motion hypothesis;
determining a matching motion hypothesis from the plurality of motion hypotheses based on projection densities corresponding to the plurality of motion hypotheses;
and performing point cloud segmentation on a first three-dimensional point cloud in the plurality of frames of three-dimensional point clouds based on the matching motion hypothesis.
In some embodiments, different motion hypotheses correspond to different motion speeds and/or motion directions.
In some embodiments, the range of motion velocities and/or the range of motion directions for the motion hypothesis are determined based on environmental characteristics of the environment in which the movable platform is located.
In some embodiments, the motion speed of the motion hypothesis in the travel direction of the movable platform is in the range of [-40 m/s, 40 m/s], and the motion speed in the direction perpendicular to the travel direction is in the range of [-10 m/s, 10 m/s].
In some embodiments, the motion direction of the motion hypothesis is in the range of [-90°, 90°].
In some embodiments, the processor is configured to: and projecting the multi-frame three-dimensional point cloud to a preset coordinate system based on the motion hypothesis and the pose of the movable platform when acquiring each frame of three-dimensional point cloud in the multi-frame three-dimensional point cloud.
In some embodiments, each frame of the plurality of frames of three-dimensional point clouds includes a first label for indicating a three-dimensional point of the each frame of three-dimensional point clouds that is outside the travel surface of the movable platform; the processor is configured to: and projecting the three-dimensional points carrying the first label in each frame of three-dimensional point cloud to a preset coordinate system.
In some embodiments, the processor is further configured to: acquiring a plurality of frames of original three-dimensional point clouds collected by the movable platform; and carrying out frequency division processing on the multiple frames of original three-dimensional point clouds to obtain the multiple frames of three-dimensional point clouds.
In some embodiments, the processor is configured to: and if the difference between the maximum projection density and any other projection density is larger than a preset value, determining the motion hypothesis corresponding to the maximum projection density as a matching motion hypothesis.
In some embodiments, the processor is configured to: determining that the matching motion hypothesis does not exist if a difference between the maximum weight and at least one other weight is not greater than a preset value.
In some embodiments, the processor is configured to: the corresponding projection densities of the motion hypotheses within the respective grids are obtained.
In some embodiments, the processor is further configured to: and if the matched motion hypothesis exists in the grid, determining the matched motion hypothesis of the grid as the matched motion hypothesis of the three-dimensional point corresponding to each point in the grid.
In some embodiments, the processor is configured to: after the multi-frame three-dimensional point cloud is projected to a preset coordinate system based on the movement hypothesis, acquiring the frame number of the three-dimensional point cloud with the projection points in the grid; determining a ratio of the number of frames to an area of the grid as a corresponding projected density of the motion hypothesis within the grid.
In some embodiments, the processor is configured to: if the speed of the matched motion hypothesis in one grid is 0, segmenting the three-dimensional points projected to the grid in the first three-dimensional point cloud into static points; and/or if the speed of the matching motion hypothesis in one grid is not 0, dividing the three-dimensional points projected into the grid in the first three-dimensional point cloud into dynamic points.
In some embodiments, the processor is configured to: if no matching motion hypothesis exists in one grid, segmenting the three-dimensional points projected into the grid in the first three-dimensional point cloud into points with unknown attributes.
In some embodiments, the processor is further configured to: and marking each three-dimensional point in the first three-dimensional point cloud based on the point cloud segmentation result.
In some embodiments, the three-dimensional point cloud is acquired based on a vision sensor or a lidar mounted on the movable platform; and/or point cloud segmentation results obtained by performing point cloud segmentation on the first three-dimensional point cloud are used for planning the driving state of the movable platform by a planning unit on the movable platform.
For specific embodiments of the method executed by the processor in the point cloud segmentation apparatus according to the embodiments of the present disclosure, reference may be made to the foregoing method embodiments, and details are not repeated here.
Fig. 7 is a schematic diagram illustrating a more specific hardware structure of a data processing apparatus according to an embodiment of the present disclosure, where the apparatus may include: a processor 701, a memory 702, an input/output interface 703, a communication interface 704, and a bus 705. Wherein the processor 701, the memory 702, the input/output interface 703 and the communication interface 704 are communicatively connected to each other within the device via a bus 705.
The processor 701 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present specification.
The Memory 702 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 702 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 702 and called to be executed by the processor 701.
The input/output interface 703 is used for connecting an input/output module to realize information input and output. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 704 is used for connecting a communication module (not shown in the figure) to realize communication interaction between the device and other devices. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 705 includes a pathway for communicating information between various components of the device, such as processor 701, memory 702, input/output interface 703, and communication interface 704.
It should be noted that although the above-mentioned device only shows the processor 701, the memory 702, the input/output interface 703, the communication interface 704 and the bus 705, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
As shown in fig. 8, embodiments of the present disclosure also provide a movable platform 800, including a housing 801; the point cloud acquisition device 802 is arranged on the shell 801 and is used for acquiring three-dimensional point cloud; and a three-dimensional point cloud segmentation device 803, disposed in the housing 801, for performing the method according to any embodiment of the present disclosure. Wherein, the movable platform 800 can be an unmanned aerial vehicle, an unmanned ship, a movable robot, etc., and the point cloud collection device 802 can be a vision sensor (such as a binocular vision sensor, a trinocular vision sensor, etc.) or a laser radar.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps performed by the second processing unit in the method according to any of the foregoing embodiments.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
Various technical features in the above embodiments may be arbitrarily combined as long as there is no conflict or contradiction in the combination between the features, but the combination is limited by the space and is not described one by one, and therefore, any combination of various technical features in the above embodiments also belongs to the scope of the present disclosure.
Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The above description is only exemplary of the present disclosure and should not be taken as limiting the disclosure, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (34)

1. A three-dimensional point cloud segmentation method is used for performing point cloud segmentation on a three-dimensional point cloud acquired by a movable platform, and comprises the following steps:
based on a motion hypothesis in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point clouds collected by the movable platform to a preset coordinate system, and acquiring a projection density corresponding to the motion hypothesis;
determining a matching motion hypothesis from the plurality of motion hypotheses based on projection densities corresponding to the plurality of motion hypotheses;
and performing point cloud segmentation on a first three-dimensional point cloud in the plurality of frames of three-dimensional point clouds based on the matching motion hypothesis.
2. Method according to claim 1, characterized in that different movement hypotheses correspond to different movement speeds and/or movement directions.
3. The method of claim 2, wherein the range of motion velocities and/or the range of motion directions for the motion hypothesis are determined based on environmental characteristics of an environment in which the movable platform is located.
4. A method according to claim 3, characterized in that said movement assumes a movement speed in the travelling direction of the movable platform in the range of [ -40m/s, 40m/s ], a movement speed in the direction perpendicular to the travelling direction of the movable platform in the range of [ -10m/s, 10m/s ]; and/or
The assumed direction of movement of the movement is in the range of [-90°, 90°].
5. The method of claim 1, wherein the projecting the plurality of frames of three-dimensional point clouds collected by the movable platform under a preset coordinate system based on the motion hypothesis in the pre-established motion hypothesis model comprises:
and projecting the multi-frame three-dimensional point cloud to a preset coordinate system based on the motion hypothesis and the pose of the movable platform when acquiring each frame of three-dimensional point cloud in the multi-frame three-dimensional point cloud.
6. The method according to claim 1, wherein each frame of the plurality of frames of three-dimensional point clouds comprises a first label for indicating three-dimensional points of the each frame of three-dimensional point clouds which belong to the outside of the traveling road surface of the movable platform; the projecting the multi-frame three-dimensional point cloud collected by the movable platform to a preset coordinate system comprises the following steps:
and projecting the three-dimensional points carrying the first label in each frame of three-dimensional point cloud to a preset coordinate system.
7. The method of claim 1, further comprising:
acquiring a plurality of frames of original three-dimensional point clouds collected by the movable platform;
and carrying out frequency division processing on the multiple frames of original three-dimensional point clouds to obtain the multiple frames of three-dimensional point clouds.
8. The method of claim 1, wherein determining a matching motion hypothesis from the plurality of motion hypotheses based on projection densities corresponding to the plurality of motion hypotheses comprises:
and if the difference between the maximum projection density and any other projection density is larger than a preset value, determining the motion hypothesis corresponding to the maximum projection density as a matching motion hypothesis.
9. The method of claim 8, wherein determining the matching motion hypothesis based on the projection densities corresponding to the plurality of motion hypotheses comprises:
Determining that the matching motion hypothesis does not exist if a difference between the maximum projection density and at least one other projection density is not greater than a preset value.
10. The method of claim 1, wherein the preset coordinate system comprises a plurality of grids; the acquiring of the projection density corresponding to the motion hypothesis includes:
the corresponding projection densities of the motion hypotheses within the respective grids are obtained.
11. The method of claim 10, further comprising:
and if the matching motion hypothesis exists in the grid, determining the matching motion hypothesis of the grid as the matching motion hypothesis of the three-dimensional point corresponding to each point in the grid.
12. The method of claim 10, wherein the corresponding projected density of the motion hypotheses within the grid is determined based on:
after the multi-frame three-dimensional point cloud is projected to a preset coordinate system based on the movement hypothesis, acquiring the frame number of the three-dimensional point cloud with the projection points in the grid;
determining a ratio of the number of frames to an area of the grid as a corresponding projected density of the motion hypothesis within the grid.
13. The method of claim 10, wherein said point cloud segmentation of a first three-dimensional point cloud of the plurality of three-dimensional point clouds based on the matching motion hypothesis comprises:
if the speed of the matched motion hypothesis in one grid is 0, segmenting the three-dimensional points projected to the grid in the first three-dimensional point cloud into static points; and/or
And if the speed of the matching motion hypothesis in one grid is not 0, segmenting the three-dimensional points projected to the grid in the first three-dimensional point cloud into dynamic points.
14. The method of claim 13, wherein the point cloud segmentation of a first three-dimensional point cloud of the plurality of three-dimensional point clouds based on the matching motion hypothesis, further comprising:
if no matching motion hypothesis exists in one grid, segmenting the three-dimensional points projected into the grid in the first three-dimensional point cloud into points with unknown attributes.
15. The method of claim 1, wherein after point cloud segmentation of a first three-dimensional point cloud of the plurality of frames of three-dimensional point clouds based on the matching motion hypothesis, the method further comprises:
and marking each three-dimensional point in the first three-dimensional point cloud based on the point cloud segmentation result.
16. The method of claim 1, wherein the three-dimensional point cloud is acquired by a vision sensor or a lidar mounted on the movable platform; and/or
a point cloud segmentation result obtained by performing point cloud segmentation on the first three-dimensional point cloud is used by a planning unit on the movable platform to plan a driving state of the movable platform.
17. A three-dimensional point cloud segmentation apparatus, comprising a processor, wherein the apparatus is configured to perform point cloud segmentation on a three-dimensional point cloud acquired by a movable platform, and the processor is configured to perform the following steps:
based on a plurality of motion hypotheses in a pre-established motion hypothesis model, projecting a plurality of frames of three-dimensional point clouds collected by the movable platform to a preset coordinate system, and acquiring projection densities corresponding to the motion hypotheses;
determining a matching motion hypothesis from the plurality of motion hypotheses based on projection densities corresponding to the plurality of motion hypotheses;
and performing point cloud segmentation on a first three-dimensional point cloud in the plurality of frames of three-dimensional point clouds based on the matching motion hypothesis.
18. The apparatus of claim 17, wherein different motion hypotheses correspond to different motion speeds and/or motion directions.
19. The apparatus of claim 18, wherein the range of motion speeds and/or the range of motion directions of the motion hypotheses is determined based on environmental characteristics of the environment in which the movable platform is located.
20. The apparatus of claim 19, wherein the motion speed of the motion hypothesis in the travel direction of the movable platform is within the range of [-40 m/s, 40 m/s], and the motion speed in the direction perpendicular to the travel direction of the movable platform is within the range of [-10 m/s, 10 m/s]; and/or
the motion direction of the motion hypothesis is within the range of [-90°, 90°].
21. The apparatus of claim 17, wherein the processor is configured to:
project the plurality of frames of three-dimensional point clouds to the preset coordinate system based on the motion hypothesis and the pose of the movable platform at the time each frame of the plurality of frames of three-dimensional point clouds is acquired.
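The pose-and-hypothesis projection of claim 21 can be sketched in two dimensions: each frame is first rigidly transformed into the preset coordinate system using the platform pose at its acquisition time, then shifted by the hypothesized velocity times the time offset to the reference frame. Under the correct hypothesis, projections of the same moving point from all frames coincide. Function and variable names are ours, not the patent's:

```python
import math

def project_frame(points, pose, hyp_velocity, dt):
    """Project one frame of 2D points into the preset (world) frame.

    points: list of (x, y) in the sensor frame at acquisition time.
    pose: (tx, ty, yaw) of the platform in the preset frame at that time.
    hyp_velocity: hypothesized (vx, vy) of the scene points in the preset frame.
    dt: time from this frame to the reference frame.
    """
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    vx, vy = hyp_velocity
    out = []
    for x, y in points:
        wx = c * x - s * y + tx               # rigid transform into the preset frame
        wy = s * x + c * y + ty
        out.append((wx + vx * dt, wy + vy * dt))  # shift by the motion hypothesis
    return out
```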
22. The apparatus of claim 17, wherein each frame of the plurality of frames of three-dimensional point clouds carries a first tag indicating the three-dimensional points in that frame that lie outside the travel surface of the movable platform; and the processor is configured to:
project the three-dimensional points carrying the first tag in each frame of the three-dimensional point clouds to the preset coordinate system.
23. The apparatus of claim 17, wherein the processor is further configured to:
acquire a plurality of frames of original three-dimensional point clouds collected by the movable platform; and
perform frequency-division processing on the plurality of frames of original three-dimensional point clouds to obtain the plurality of frames of three-dimensional point clouds.
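The frequency-division processing of claim 23 amounts to thinning the original point cloud stream to a lower frame rate; the claim does not spell out the method, so the every-Nth-frame selection below is our illustrative reading, with names of our choosing:

```python
def frequency_divide(original_frames, divisor=2):
    """Keep every `divisor`-th frame of the original point cloud stream,
    reducing its frame rate by that factor (one possible reading of the
    frequency-division processing in claim 23)."""
    return original_frames[::divisor]
```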
24. The apparatus of claim 17, wherein the processor is configured to:
if the difference between the maximum projection density and each of the other projection densities is greater than a preset value, determine the motion hypothesis corresponding to the maximum projection density as the matching motion hypothesis.
25. The apparatus of claim 24, wherein the processor is configured to:
determine that no matching motion hypothesis exists if the difference between the maximum projection density and at least one other projection density is not greater than the preset value.
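Claims 24 and 25 together describe a margin test: a hypothesis matches only when its projection density exceeds every competitor's by more than a preset value, and otherwise no match is reported. A sketch (names are illustrative):

```python
def select_matching_hypothesis(densities, margin):
    """densities: {hypothesis_id: projection density} for one grid cell.

    Returns the hypothesis whose density exceeds every other density by more
    than `margin` (claim 24), or None when the margin is not met against at
    least one other hypothesis (claim 25).
    """
    best = max(densities, key=densities.get)
    best_density = densities[best]
    for hyp, density in densities.items():
        if hyp != best and best_density - density <= margin:
            return None  # margin not met: ambiguous cell, no matching hypothesis
    return best
```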
26. The apparatus of claim 17, wherein the processor is configured to:
acquire the projection density corresponding to the motion hypothesis within each grid.
27. The apparatus of claim 26, wherein the processor is further configured to:
if the matching motion hypothesis exists in a grid, determine the matching motion hypothesis of the grid as the matching motion hypothesis of the three-dimensional points whose projection points fall within the grid.
28. The apparatus of claim 26, wherein the processor is configured to:
after the plurality of frames of three-dimensional point clouds are projected to the preset coordinate system based on the motion hypothesis, acquire the number of frames of three-dimensional point clouds having projection points within the grid; and
determine the ratio of the number of frames to the area of the grid as the projection density corresponding to the motion hypothesis within the grid.
29. The apparatus of claim 26, wherein the processor is configured to:
if the speed of the matching motion hypothesis in a grid is 0, segment the three-dimensional points of the first three-dimensional point cloud projected into the grid into static points; and/or
if the speed of the matching motion hypothesis in a grid is not 0, segment the three-dimensional points of the first three-dimensional point cloud projected into the grid into dynamic points.
30. The apparatus of claim 29, wherein the processor is configured to:
if no matching motion hypothesis exists in a grid, segment the three-dimensional points of the first three-dimensional point cloud projected into the grid into points with unknown attributes.
31. The apparatus of claim 17, wherein the processor is further configured to:
mark each three-dimensional point in the first three-dimensional point cloud based on a point cloud segmentation result.
32. The apparatus of claim 17, wherein the three-dimensional point cloud is acquired by a vision sensor or a lidar mounted on the movable platform; and/or
a point cloud segmentation result obtained by performing point cloud segmentation on the first three-dimensional point cloud is used by a planning unit on the movable platform to plan a driving state of the movable platform.
33. A movable platform, comprising:
a housing;
a point cloud acquisition device, disposed on the housing and configured to acquire a three-dimensional point cloud; and
a three-dimensional point cloud segmentation apparatus, disposed within the housing and configured to perform the method of any one of claims 1 to 16.
34. A computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method of any one of claims 1 to 16.
CN202080070567.XA 2020-12-15 2020-12-15 Three-dimensional point cloud segmentation method and device and movable platform Pending CN114556419A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/136546 WO2022126380A1 (en) 2020-12-15 2020-12-15 Three-dimensional point cloud segmentation method and apparatus, and movable platform

Publications (1)

Publication Number Publication Date
CN114556419A true CN114556419A (en) 2022-05-27

Family

ID=81667890

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080070567.XA Pending CN114556419A (en) 2020-12-15 2020-12-15 Three-dimensional point cloud segmentation method and device and movable platform

Country Status (2)

Country Link
CN (1) CN114556419A (en)
WO (1) WO2022126380A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117475398B (en) * 2023-12-26 2024-02-23 苏州元脑智能科技有限公司 Ground segmentation optimization method and device based on voxel sampling

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8831290B2 (en) * 2012-08-01 2014-09-09 Mitsubishi Electric Research Laboratories, Inc. Method and system for determining poses of vehicle-mounted cameras for in-road obstacle detection
US9412040B2 (en) * 2013-12-04 2016-08-09 Mitsubishi Electric Research Laboratories, Inc. Method for extracting planes from 3D point cloud sensor data
CN107918753B (en) * 2016-10-10 2019-02-22 腾讯科技(深圳)有限公司 Processing Method of Point-clouds and device
CN111699410A (en) * 2019-05-29 2020-09-22 深圳市大疆创新科技有限公司 Point cloud processing method, device and computer readable storage medium

Also Published As

Publication number Publication date
WO2022126380A1 (en) 2022-06-23

Similar Documents

Publication Publication Date Title
Sun et al. Proximity based automatic data annotation for autonomous driving
CN112212874B (en) Vehicle track prediction method and device, electronic equipment and computer readable medium
JP2019527832A (en) System and method for accurate localization and mapping
CN111582189B (en) Traffic signal lamp identification method and device, vehicle-mounted control terminal and motor vehicle
Jebamikyous et al. Autonomous vehicles perception (avp) using deep learning: Modeling, assessment, and challenges
CN113379805A (en) Multi-information resource fusion processing method for traffic nodes
Wei et al. Survey of connected automated vehicle perception mode: from autonomy to interaction
CN113640822B (en) High-precision map construction method based on non-map element filtering
CN111860227A (en) Method, apparatus, and computer storage medium for training trajectory planning model
WO2022141116A1 (en) Three-dimensional point cloud segmentation method and apparatus, and movable platform
CN110688943A (en) Method and device for automatically acquiring image sample based on actual driving data
CN114821507A (en) Multi-sensor fusion vehicle-road cooperative sensing method for automatic driving
CN114325634A (en) Method for extracting passable area in high-robustness field environment based on laser radar
CN112257668A (en) Main and auxiliary road judging method and device, electronic equipment and storage medium
Gressenbuch et al. Mona: The munich motion dataset of natural driving
WO2022099620A1 (en) Three-dimensional point cloud segmentation method and apparatus, and mobile platform
CN114556419A (en) Three-dimensional point cloud segmentation method and device and movable platform
JP2024019629A (en) Prediction device, prediction method, program and vehicle control system
CN112859110A (en) Positioning and navigation method based on three-dimensional laser radar
CN114581748A (en) Multi-agent perception fusion system based on machine learning and implementation method thereof
Pang et al. FLAME: Feature-likelihood based mapping and localization for autonomous vehicles
CN111338336B (en) Automatic driving method and device
EP3944137A1 (en) Positioning method and positioning apparatus
CN116382308B (en) Intelligent mobile machinery autonomous path finding and obstacle avoiding method, device, equipment and medium
Vincze et al. Automatic label injection into local infrastructure LiDAR point cloud for training data set generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination