CN115994933A - Partial point cloud registration method based on consistency learning - Google Patents

Partial point cloud registration method based on consistency learning

Info

Publication number
CN115994933A
CN115994933A (application CN202310095301.6A)
Authority
CN
China
Prior art keywords
point cloud
point
feature
corresponding relation
partial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310095301.6A
Other languages
Chinese (zh)
Inventor
秦红星 (Qin Hongxing)
谭博元 (Tan Boyuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202310095301.6A priority Critical patent/CN115994933A/en
Publication of CN115994933A publication Critical patent/CN115994933A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a partial point cloud registration method based on consistency learning, belonging to the field of computer vision, which comprises the following steps: S1: collecting a source point cloud P, a target point cloud Q and the true transformation from an actual scene, constructing a data set, preprocessing the data set, and dividing it into a training set and a test set; S2: constructing a partial point cloud registration model comprising a point consistency learning module, a correspondence consistency learning module and a rigid transformation estimation module, where the point consistency learning module first extracts the local feature of each point, then estimates the corresponding overlap score, and finally removes the points of the non-overlapping region according to the overlap score of each point; S3: training the partial point cloud registration model with the training set and testing it with the test set; S4: initializing the source point cloud and target point cloud to be registered, inputting them into the trained partial point cloud registration model, and outputting the rigid transformation between the source point cloud and the target point cloud.

Description

Partial point cloud registration method based on consistency learning
Technical Field
The invention belongs to the field of computer vision, and relates to a partial point cloud registration method based on consistency learning.
Background
Point cloud registration is one of the fundamental tasks in the field of computer vision and is widely used in simultaneous localization and mapping (SLAM), autonomous driving, 3D reconstruction, and other fields. The goal of point cloud registration is to find the rigid transformation (a rotation matrix and a translation vector) between two 3D point clouds (a source point cloud and a target point cloud) that aligns them. Unfortunately, point cloud registration in partially overlapping scenes is considerably more challenging because of the negative impact of the non-overlapping regions of the point clouds.
In recent years, many approaches have been proposed to solve the point cloud registration problem. Iterative Closest Point (ICP) is one of the most classical algorithms: it alternates between finding, for each source point, the closest point in the other point cloud and solving the optimal transformation with SVD, updating the source point cloud until the algorithm converges. However, ICP tends to fall into local optima and is sensitive to noise, which limits its application. Go-ICP, Symmetric-ICP, Generalized-ICP and related variants have been proposed to address these challenges. More recently, deep-learning-based methods have made great progress. Deep Closest Point (DCP) uses nearest neighbors in feature space instead of Euclidean space. IDAM uses a convolutional neural network to directly predict the mapping matrix. DeepGMR uses a Gaussian mixture model to estimate the optimal transformation. However, none of these methods is designed for registration in partially overlapping scenes, so their performance there is poor. PRNet addresses partial-to-partial point cloud registration by extracting keypoints from the point clouds and using them to construct a mapping matrix. RPMNet uses Sinkhorn regularization to compute the mapping matrix. DeepBBS improves registration performance in partially overlapping scenes by using a soft mutual-nearest-neighbor constraint.
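By way of illustration, a minimal sketch of the classical ICP loop described above (closest-point search followed by an SVD-based transformation solve, repeated until convergence) is given below; the NumPy/SciPy implementation, the fixed iteration count and the helper name are illustrative choices, not part of any of the cited methods.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, tgt, n_iters=50):
    """Minimal point-to-point ICP; src (N,3) and tgt (M,3) are NumPy arrays."""
    R_total, t_total = np.eye(3), np.zeros(3)
    tree = cKDTree(tgt)
    cur = src.copy()
    for _ in range(n_iters):
        # Step 1: for every source point, find the closest target point.
        _, idx = tree.query(cur)
        matched = tgt[idx]
        # Step 2: solve the optimal rigid transform with SVD (Kabsch) and update the source.
        mu_s, mu_t = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```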
Most current deep-learning-based methods divide point cloud registration into three parts: extracting point features, building correspondences by feature matching, and estimating the transformation parameters. However, the negative impact of the non-overlapping regions in partially overlapping scenes prevents these methods from learning an effective feature space.
Moreover, most current methods cannot effectively identify and remove wrong correspondences. When the proportion of wrong correspondences is too large, the estimated transformation deviates too much from the true transformation and registration fails.
Disclosure of Invention
In view of the above, the present invention aims to provide a partial point cloud registration method based on consistency learning. To counter the negative effect of the non-overlapping regions in a partially overlapping scene, it predicts an overlap score for each point using a point-level consistency constraint and removes the influence of non-overlapping points according to the overlap scores; to counter the problem of too many wrong correspondences, it identifies wrong correspondences by exploring local-to-global context information between correspondences and removes them.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a partial point cloud registration method based on consistency learning comprises the following steps:
S1: collecting a source point cloud P, a target point cloud Q and the true transformation from an actual scene, constructing a data set, preprocessing the data set, and dividing it into a training set and a test set;
S2: constructing a partial point cloud registration model comprising a point consistency learning module, a correspondence consistency learning module and a rigid transformation estimation module; the point consistency learning module first extracts the local feature of each point, then estimates the corresponding overlap score, and finally removes the points of the non-overlapping region according to the overlap score of each point;
S3: training the partial point cloud registration model with the training set and testing it with the test set;
S4: initializing the source point cloud and target point cloud to be registered, inputting them into the trained partial point cloud registration model, and outputting the rigid transformation between the source point cloud and the target point cloud.
Further, preprocessing the data set includes removing outliers from the point cloud, voxel downsampling, and regularization.
Further, the specific calculation steps of the point consistency learning module are as follows:
A1: extracting local point features from each point $p_i \in P$ of the source point cloud and each point $q_j \in Q$ of the target point cloud using a graph neural network;
A2: predicting the overlap score of each point;
A3: points whose overlap scores rank below the top 70% of the point cloud size are regarded as points of the non-overlapping region, and the points of the non-overlapping region are removed.
Further, the step A1 specifically comprises the following steps:
A11: for each point $p_i \in P$, first perform a k-nearest-neighbor search to obtain its k-neighbor point set $\mathcal{N}(p_i)$;
A12: the differences between the 3D coordinates of each point $p_i \in P$ and the 3D coordinates of its neighborhood points form the initial input feature:
$e_i = \{\, p_i^j - p_i \mid p_i^j \in \mathcal{N}(p_i) \,\}$
A13: a graph neural network $f_g(\cdot)$ maps $e_i$ to a high-dimensional feature space, and the resulting feature is denoted $F_i^P$:
$F_i^P = f_g(e_i)$
wherein the graph neural network $f_g(\cdot)$ is internally divided into four layers, each expressed as:
$F_i^{(l+1)} = \mathrm{MaxPooling}\big(\{\, \mathrm{MLP}([F_i^{(l)}, F_j^{(l)}]) \mid j \in \mathcal{N}(i) \,\}\big)$
where $F_i^{(l+1)}$ is the output feature and also the input feature of the next layer; MLP is a multi-layer perceptron; MaxPooling denotes a max-pooling operation; $\mathcal{N}(i)$ is the k-neighbor set of $F_i^{(l)}$; $[\cdot,\cdot]$ denotes concatenation along the channel direction of the features;
A14: the same method is applied to each point $q_j \in Q$ of the target point cloud, and the obtained feature is denoted $F_j^Q = f_g(e_j)$, where $e_j$ is formed from the differences between the 3D coordinates of each point of the target point cloud Q and the 3D coordinates of its neighborhood points;
A15: the local point features $F^P$ and $F^Q$ are enhanced with self-attention and cross-attention modules, and the enhanced features are denoted $C^P$ and $C^Q$.
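By way of illustration, the following is a minimal PyTorch sketch of the k-NN graph feature extractor described in steps A11 to A13: each point's feature is concatenated with its neighbors' features, passed through a shared MLP, and max-pooled over the neighborhood, with four such layers stacked. The channel sizes, the exact form of the concatenation and the reuse of coordinate-space neighbors across layers are assumptions made for this example, not details fixed by the invention.

```python
import torch
import torch.nn as nn

def knn_indices(x, k):
    """x: (N, C) points or features; returns (N, k) indices of the k nearest neighbors."""
    dist = torch.cdist(x, x)                      # (N, N) pairwise distances
    return dist.topk(k, largest=False).indices    # (N, k)

class GraphConvLayer(nn.Module):
    """One layer of f_g: concatenate [point feature, neighbor feature], shared MLP, max-pool."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim), nn.ReLU())

    def forward(self, feat, idx):
        # feat: (N, C); idx: (N, k) neighbor indices
        neigh = feat[idx]                               # (N, k, C) neighbor features
        center = feat.unsqueeze(1).expand_as(neigh)     # (N, k, C) repeated center features
        edge = torch.cat([center, neigh], dim=-1)       # concatenation along the channel direction
        return self.mlp(edge).max(dim=1).values         # max-pool over the neighborhood

# Usage: raw coordinates stand in for the initial input features here (step A12 uses
# coordinate differences); four layers are stacked as described for f_g.
points = torch.rand(1024, 3)
idx = knn_indices(points, k=20)
layers = nn.ModuleList([GraphConvLayer(3, 64), GraphConvLayer(64, 64),
                        GraphConvLayer(64, 128), GraphConvLayer(128, 256)])
feat = points
for layer in layers:
    feat = layer(feat, idx)        # final shape (1024, 256): per-point local features
```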
Further, the step A2 specifically comprises the following steps:
A21: a max-pooling operation is applied to the extracted $C^P$ and $C^Q$ to obtain the global features of the whole point clouds, $G^P = \mathrm{maxpooling}(C^P)$ and $G^Q = \mathrm{maxpooling}(C^Q)$;
A22: the global features of the two point clouds and the feature of each point are concatenated and input into a multi-layer perceptron to predict the overlap score of each point, giving $O^P$ and $O^Q$:
$O^P = \mathrm{MLP}([C^P, \delta(G^P), \delta(G^Q)]), \qquad O^Q = \mathrm{MLP}([C^Q, \delta(G^Q), \delta(G^P)])$
where MLP is a multi-layer perceptron whose output dimension is N × 1, N being the number of points in the point cloud; $\delta(\cdot)$ is the repeat operation along the feature channel direction; $[x, y, z]$ denotes concatenation of the features x, y and z in the channel dimension; $O^P$ and $O^Q$ denote the overlap-score sets of point cloud P and point cloud Q, respectively.
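The overlap-score prediction of steps A21 and A22 can be sketched as follows: the repeat operation δ(·) is realized by broadcasting the global features to every point before concatenation. The feature dimensions, the order of concatenation and the sigmoid output are illustrative assumptions.

```python
import torch
import torch.nn as nn

class OverlapScoreHead(nn.Module):
    """Predict a per-point overlap score from per-point and global features (step A22)."""
    def __init__(self, feat_dim):
        super().__init__()
        # input = [per-point feature, repeated global feature of P, repeated global feature of Q]
        self.mlp = nn.Sequential(
            nn.Linear(3 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid())    # N x 1 overlap scores in [0, 1]

    def forward(self, c_own, g_own, g_other):
        n = c_own.shape[0]
        rep_own = g_own.unsqueeze(0).expand(n, -1)      # delta(.): repeat along the points
        rep_other = g_other.unsqueeze(0).expand(n, -1)
        return self.mlp(torch.cat([c_own, rep_own, rep_other], dim=-1)).squeeze(-1)

# Usage: global features are the max-pooled per-point features (step A21).
c_p, c_q = torch.rand(1024, 256), torch.rand(1024, 256)   # enhanced features C^P, C^Q
g_p, g_q = c_p.max(dim=0).values, c_q.max(dim=0).values   # global features G^P, G^Q
head = OverlapScoreHead(256)
o_p = head(c_p, g_p, g_q)      # overlap scores O^P of the source point cloud
o_q = head(c_q, g_q, g_p)      # overlap scores O^Q of the target point cloud
```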
Further, the specific calculation steps of the correspondence consistency learning module are as follows:
B1: define a mapping matrix M:
$M_{ij} = \frac{\exp(\langle c_i^P, c_j^Q \rangle)}{\sum_{k} \exp(\langle c_i^P, c_k^Q \rangle)}$
where $M_{ij}$ is the element in the i-th row and j-th column of the mapping matrix M; $c_i^P$ and $c_j^Q$ are the features of point $p_i$ and point $q_j$ after enhancement by the self-attention and cross-attention modules; $\exp(\langle c_i^P, c_j^Q \rangle)$ denotes the inner product of the feature vectors amplified by an exponential function and can be regarded as an intermediate variable of the calculation.
For a point $p_r$ in the source point cloud P, its corresponding point $q_r$ in the target point cloud Q is determined from the r-th row of the mapping matrix M, thereby obtaining an initial correspondence set $C = \{(p_r, q_r)\}$;
B2: exploring local-to-global context information between the correspondences to identify wrong correspondences;
B3: removing the wrong correspondences.
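A sketch of the mapping matrix of step B1 and of the resulting initial correspondences follows. Because the exact rule for picking the corresponding point from a row of M is not reproduced above, both a soft (probability-weighted) and a hard (best-match) variant are shown; both are assumptions for illustration.

```python
import torch

def mapping_matrix(c_p, c_q):
    """M[i, j]: inner product of enhanced features, normalized row-wise with softmax (step B1)."""
    scores = c_p @ c_q.t()                # inner products <c_i^P, c_j^Q>
    return torch.softmax(scores, dim=1)   # exponential amplification + row normalization

c_p, c_q = torch.rand(800, 256), torch.rand(850, 256)   # enhanced features of P and Q
q_points = torch.rand(850, 3)                           # target point coordinates
M = mapping_matrix(c_p, c_q)

# Variant 1 (assumption): soft corresponding point, a probability-weighted average.
q_soft = M @ q_points                    # (800, 3)

# Variant 2 (assumption): hard corresponding point, the column with the largest weight.
q_hard = q_points[M.argmax(dim=1)]       # (800, 3)
```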
Further, the step B2 specifically comprises:
B21: first, compute the initial correspondence feature set $H_C = \{ h_r \}$, where $h_r$ denotes the initial feature of the correspondence $c_r = (p_r, q_r) \in C$; $[\cdot,\cdot]$ denotes concatenation of features in the channel dimension; the feature contains a distance-aware correspondence term computed from the distances between $c_r$ and its j-th neighboring correspondence, with $\|\cdot\|$ denoting the modulus (length) of a vector;
B22: input the initial correspondence features into a multi-scale attention module to explore local-to-global context information; the multi-scale attention module comprises three layers with identical operations, each layer operating as follows:
the feature $h_r$ is input into a neural network to explore local consistency, and the output feature is denoted $Y_r^{(1)}$;
the local feature $Y_r^{(1)}$ is input into an inner-product attention module, which outputs the feature $F_r^{(1)}$; the offset feature $OF_r^{(1)} = Y_r^{(1)} - F_r^{(1)}$ is then computed and input into a multi-layer perceptron to obtain the final feature of the layer;
B23: the features of the three layers are max-pooled, and all the obtained features are concatenated and input into a multi-layer perceptron to predict the weight of each correspondence, $W = \mathrm{MLP}([F^{(1)}, F^{(2)}, F^{(3)}])$, where $F^{(1)}$, $F^{(2)}$ and $F^{(3)}$ are the pooled features of the three layers; the MLP is a multi-layer perceptron whose final output dimension is |C| × 1, |C| being the size of the correspondence set C.
In the step B3, whether a correspondence is retained is determined according to the weight W; the final correspondence set is denoted $\hat{C}$ and its corresponding weights are denoted $\hat{W}$.
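One possible reading of the per-layer operation of step B22 (a local-consistency network, an inner-product attention module, an offset feature, then a multi-layer perceptron) is sketched below in the style of an offset-attention block. The residual combination, the attention dimensions and the omission of the intermediate max-pooling in the weight head are assumptions, since the exact formulas are not reproduced above.

```python
import torch
import torch.nn as nn

class OffsetAttentionLayer(nn.Module):
    """One layer of the multi-scale attention module (step B22): local network,
    inner-product attention, offset feature, MLP."""
    def __init__(self, dim):
        super().__init__()
        self.local = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())   # local-consistency network
        self.q_proj, self.k_proj, self.v_proj = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h):
        # h: (|C|, dim) correspondence features
        y = self.local(h)                                                     # Y^(l)
        attn = torch.softmax(self.q_proj(y) @ self.k_proj(y).t() / y.shape[-1] ** 0.5, dim=-1)
        f = attn @ self.v_proj(y)                                             # F^(l): attention output
        offset = y - f                                                        # OF^(l) = Y^(l) - F^(l)
        return y + self.mlp(offset)                                           # assumed residual combination

# Per-correspondence features of the three layers are concatenated and mapped to one
# weight per correspondence (step B23); the intermediate max-pooling is omitted here.
layers = nn.ModuleList([OffsetAttentionLayer(128) for _ in range(3)])
weight_head = nn.Sequential(nn.Linear(3 * 128, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
h = torch.rand(400, 128)                 # 400 initial correspondence features
feats = []
for layer in layers:
    h = layer(h)
    feats.append(h)
w = weight_head(torch.cat(feats, dim=-1)).squeeze(-1)   # (400,) one weight per correspondence
```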
Further, the calculation steps of the rigid transformation estimation module are as follows:
according to the obtained correspondence set $\hat{C}$ and weights $\hat{W}$, the final transformation is obtained by solving the following objective function:
$(R_e, t_e) = \underset{R, t}{\arg\min} \sum_{(p_i, q_i) \in \hat{C}} \hat{w}_i \, \| R\, p_i + t - q_i \|^2$
where $(R_e, t_e)$ are the finally solved transformation parameters, $R_e$ being the rotation matrix and $t_e$ the translation vector; $\hat{w}_i$ is the weight corresponding to the correspondence $(p_i, q_i)$;
the objective function is solved by a weighted singular value decomposition method.
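The objective above has the standard weighted Procrustes (weighted Kabsch) closed-form solution via SVD; a NumPy sketch is given below for illustration.

```python
import numpy as np

def weighted_svd_transform(p, q, w):
    """Solve min_{R,t} sum_i w_i * ||R p_i + t - q_i||^2 in closed form (weighted Kabsch)."""
    w = w / w.sum()
    mu_p = (w[:, None] * p).sum(0)                      # weighted centroids
    mu_q = (w[:, None] * q).sum(0)
    H = (p - mu_p).T @ (w[:, None] * (q - mu_q))        # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_q - R @ mu_p
    return R, t

# Usage with the retained correspondences and their weights:
p_hat, q_hat, w_hat = np.random.rand(200, 3), np.random.rand(200, 3), np.random.rand(200)
R_e, t_e = weighted_svd_transform(p_hat, q_hat, w_hat)
```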
Further, in the step S3, when the partial point cloud registration model is trained with the training set, the loss function used is divided into three parts: a mapping-matrix loss, a point prediction loss and a correspondence prediction loss, where the point prediction loss and the correspondence prediction loss are binary cross-entropy losses;
the mapping-matrix loss $L_1$ is:
$L_1 = -\frac{1}{N} \sum_{i=1}^{N} a_i \log M_{i j^*}, \qquad j^* = \underset{j}{\arg\min}\, \| R^* p_i + t^* - q_j \|_2$
where M is the mapping matrix; $a_i$ is a parameter that determines whether the current term is computed: $a_i = 1$ if the condition associated with point $p_i$ is satisfied and $a_i = 0$ otherwise; $R^*$ and $t^*$ are the true rotation matrix and translation vector, respectively; $p_i$ and $q_j$ are points of the source point cloud P and the target point cloud Q, respectively; $j^*$ denotes the index of the true corresponding point of $p_i$ in the target point cloud Q;
the point prediction losses $L_{21}$ and $L_{22}$ are:
$L_{21} = -\frac{1}{N} \sum_{i=1}^{N} \big[ o_i^{P*} \log o_i^{P} + (1 - o_i^{P*}) \log (1 - o_i^{P}) \big]$
$L_{22} = -\frac{1}{N} \sum_{j=1}^{N} \big[ o_j^{Q*} \log o_j^{Q} + (1 - o_j^{Q*}) \log (1 - o_j^{Q}) \big]$
where N is the size of the point cloud; $o_i^{P}$ and $o_j^{Q}$ are the predicted overlap scores of point $p_i$ and point $q_j$, respectively; $o_i^{P*}$ is the true overlap score of point $p_i$: $o_i^{P*} = 1$ if $p_i$ lies in the overlapping region and $o_i^{P*} = 0$ otherwise; $o_j^{Q*}$ is the true overlap score of point $q_j$: $o_j^{Q*} = 1$ if $q_j$ lies in the overlapping region and $o_j^{Q*} = 0$ otherwise;
the correspondence prediction loss $L_3$ is:
$L_3 = -\frac{1}{|C|} \sum_{i=1}^{|C|} \big[ w_i^{*} \log w_i + (1 - w_i^{*}) \log (1 - w_i) \big]$
where $w_i \in W$ is the predicted weight of the i-th correspondence and $w_i^{*}$ is the true weight of the i-th correspondence;
the final loss is
$L = L_1 + L_{21} + L_{22} + L_3$
where L denotes the final loss.
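A PyTorch sketch of the three loss terms follows. The binary cross-entropy form of the point and correspondence losses is as stated above; the mean reductions and the way the indicator a_i is supplied are assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def mapping_matrix_loss(M, src, tgt, R_gt, t_gt, a):
    """L1: negative log-likelihood of the true corresponding column j* for each source point.
    a is the 0/1 indicator deciding whether a point contributes to the loss."""
    warped = src @ R_gt.t() + t_gt                       # R* p_i + t*
    j_star = torch.cdist(warped, tgt).argmin(dim=1)      # index of the nearest target point
    nll = -torch.log(M[torch.arange(M.shape[0]), j_star] + 1e-12)
    return (a * nll).sum() / a.sum().clamp(min=1)

def overlap_loss(pred, gt):
    """L21 / L22: binary cross-entropy between predicted and true overlap scores."""
    return F.binary_cross_entropy(pred, gt)

def correspondence_loss(w_pred, w_gt):
    """L3: binary cross-entropy between predicted and true correspondence weights."""
    return F.binary_cross_entropy(w_pred, w_gt)

# Final loss: L = L1 + L21 + L22 + L3
```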
The invention has the following beneficial effects. In the point consistency learning stage, the invention fundamentally removes the negative influence of the non-overlapping regions by predicting the overlapping regions of the source and target point clouds, converting partial-to-partial point cloud registration into approximately complete-to-complete registration and greatly reducing the difficulty of the task. In the correspondence consistency learning stage, the method obtains initial correspondences from the similarity between features and then identifies mismatches by exploring local-to-global context information between the initial correspondences, thereby achieving accurate point cloud registration.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in the following preferred detail with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a partial point cloud registration method based on consistency learning;
FIG. 2 is a registration result diagram.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or carried out in other embodiments, and the details of the present description may be modified or varied in various respects without departing from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention by way of example, and the following embodiments and the features in the embodiments may be combined with each other without conflict.
The drawings are for illustrative purposes only; they are schematic rather than physical representations and are not intended to limit the invention. For the purpose of better illustrating embodiments of the invention, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and their descriptions may be omitted.
The same or similar reference numbers in the drawings of embodiments of the invention correspond to the same or similar components. In the description of the present invention, terms such as "upper", "lower", "left", "right", "front" and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. The terms describing positional relationships in the drawings are therefore merely exemplary and should not be construed as limiting the present invention; their specific meanings can be understood by those of ordinary skill in the art according to the specific circumstances.
As shown in FIG. 1, the present invention provides a partial point cloud registration method based on consistency learning. Given a source point cloud P and a target point cloud Q, the rigid transformation R, t between the two point clouds is found through three stages: point consistency learning, correspondence consistency learning, and estimation of the rigid transformation R, t.
Point consistency learning phase:
the point consistency learning first extracts the local features of each point, then estimates the corresponding overlap score, and finally removes the points of the non-overlapping region according to the overlap score of each point.
First, local point features are extracted from each point $p_i \in P$ and each point $q_j \in Q$ with a graph neural network (Graph Neural Network); these features capture the local information of a point and attend to the fine-grained local structure within its neighborhood. Specifically, for each point $p_i \in P$, a k-nearest-neighbor search is first performed to obtain its k-neighbor point set $\mathcal{N}(p_i)$. The differences between the 3D coordinates of each point and the 3D coordinates of its neighboring points then form the initial input feature $e_i$. A graph neural network $f_g$ maps $e_i$ to a high-dimensional feature space, and the resulting feature is denoted $F_i^P$. The points $q_j \in Q$ of the target point cloud are processed with the same operations, and the resulting feature is denoted $F_j^Q$. The similarity between the local point features $F^P$ and $F^Q$ expresses the similarity between the local neighborhoods of the points $p_i$ and $q_j$. However, in order to better predict the overlap score of each point and subsequently construct reliable correspondences, the extracted local features need to be strengthened. Therefore, self-attention and cross-attention modules are used for feature enhancement, and the enhanced features are denoted $C^P$ and $C^Q$, respectively.
To predict the overlap score of each point, i.e. the probability that each point lies in the overlapping region, a max-pooling operation is applied to the extracted $C^P$ and $C^Q$ to obtain the global features of the whole point clouds, $G^P = \mathrm{maxpooling}(C^P)$ and $G^Q = \mathrm{maxpooling}(C^Q)$. The global features of the two point clouds and the feature of each point are then concatenated and input into a multi-layer perceptron (MLP) to predict the overlap score of each point, giving $O^P$ and $O^Q$.
Points whose overlap scores rank below the top 70% of the point cloud size are regarded as points of the non-overlapping region and are finally removed; the remaining points participate in the subsequent operations.
Correspondence consistency learning stage:
through the point consistency learning phase, points located in non-overlapping areas have been removed. The rest point clouds are still marked as P and Q, and the corresponding characteristics are still marked as C P And C Q . In order to obtain the mapping relationship between two point clouds, the inner product of the feature vectors is used to measure the similarity of the features. Defining a mapping matrix M, wherein
Figure BDA0004071476200000076
For one point P in the source point cloud P r Its corresponding point in the target point cloud Q is +.>
Figure BDA0004071476200000077
Thus an initial set of correspondences is obtained>
Figure BDA0004071476200000078
However, the obtained initial correspondence set C contains some mismatches, which would seriously affect the final registration result if they could not be removed. Therefore, the invention identifies wrong correspondences by exploring local-to-global context information between the correspondences, removes them, and finally solves the rigid transformation using the remaining correct correspondences. For each correspondence, the initial correspondence feature $h_r$ is first computed. The feature is then input into a multi-scale attention module to explore local-to-global context information. The multi-scale attention module has three layers; since the operations of the different layers are identical, only the operation of the first layer is described below. The feature $h_r$ is input into a neural network to explore local consistency, and the output feature is $Y_r^{(1)}$. $Y_r^{(1)}$ is then input into an offset non-local module to explore global consistency. Specifically, the local feature $Y_r^{(1)}$ is input into an inner-product attention module, which outputs the feature $F_r^{(1)}$. The offset feature $OF_r^{(1)} = Y_r^{(1)} - F_r^{(1)}$ is then computed and input into a multi-layer perceptron to obtain the final feature of the layer. Finally, max-pooling is applied to the features of the three layers, the features are concatenated and input into a multi-layer perceptron to predict the weight of each correspondence, $W = \mathrm{MLP}([F^{(1)}, F^{(2)}, F^{(3)}])$. Whether a correspondence is retained is determined according to the weight W. In the implementation, the top 50% of the correspondences, ranked by the weight W, are retained for the subsequent transformation solving; the final correspondence set is denoted $\hat{C}$ and its corresponding weights are denoted $\hat{W}$.
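Retaining the top 50% of correspondences by predicted weight amounts to a simple top-k selection, for example:

```python
import torch

def filter_correspondences(p_pts, q_pts, w, keep_ratio=0.5):
    """Keep the correspondences with the largest predicted weights (top 50% by default)."""
    k = max(1, int(keep_ratio * w.shape[0]))
    keep = torch.topk(w, k).indices
    return p_pts[keep], q_pts[keep], w[keep]
```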
Estimating a rigid transformation:
according to the obtained corresponding relation set
Figure BDA00040714762000000716
Weight +.>
Figure BDA00040714762000000717
The final transformation may be obtained by solving the following objective function: />
Figure BDA0004071476200000081
The objective function may be solved by Weighted singular value decomposition (Weighted SVD).
Training part:
training data, namely source point cloud, target point cloud and real transformation are prepared according to an actual scene: if there are a large number of outliers in the point cloud, an operation of removing outliers is required to improve the quality of the point cloud, and using a radius outlier removal (remote_radius_outlier) function of the Open3D tool, points less than 30 points in a given sphere within a radius of 0.05m are deleted. Voxel downsampling (voxel down sample) was performed using Open3D with a voxel size set to 0.025. Then randomly sampling 1024 points from the downsampled source point cloud and target point cloud respectively, and marking the sampled points as P epsilon R 1024×3 And Q.epsilon.R 1024×3 . Finally regularizing P and Q:
Figure BDA0004071476200000082
in 3D space, a random direction transformation T (comprising 0 ° to 360 ° rotations and translations within plus or minus 0.05 unit distances) is generated, which is applied to the target point cloud Q.
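The preprocessing described above can be reproduced with Open3D roughly as follows. The functions remove_radius_outlier and voxel_down_sample are the Open3D calls referred to in the text; the random-rotation construction, the sampling details and the centering-and-scaling form of the regularization are illustrative assumptions.

```python
import numpy as np
import open3d as o3d
from scipy.spatial.transform import Rotation

def preprocess(points):
    """Outlier removal, voxel downsampling, random sampling to 1024 points, regularization."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    # Delete points that have fewer than 30 neighbors inside a 0.05 m radius.
    pcd, _ = pcd.remove_radius_outlier(nb_points=30, radius=0.05)
    # Voxel downsampling with voxel size 0.025.
    pcd = pcd.voxel_down_sample(voxel_size=0.025)
    pts = np.asarray(pcd.points)
    # Randomly sample 1024 points.
    idx = np.random.choice(pts.shape[0], 1024, replace=pts.shape[0] < 1024)
    pts = pts[idx]
    # Regularization: centering and scaling is one plausible choice (assumption).
    pts = pts - pts.mean(axis=0)
    pts = pts / np.linalg.norm(pts, axis=1).max()
    return pts

def random_transform(points, max_trans=0.05):
    """Apply a random rotation (0 to 360 degrees) and a translation within +/- 0.05 units."""
    R = Rotation.random().as_matrix()
    t = np.random.uniform(-max_trans, max_trans, size=3)
    return points @ R.T + t, (R, t)
```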
The point clouds P and Q are input into the partial point cloud registration model of the invention, and the generated mapping matrix M, the overlap scores $O^P$ and $O^Q$, and the correspondence weights W are recorded. The loss value is computed from these outputs and the ground-truth data and is back-propagated to update the parameters. The above describes the operations performed in one epoch.
The total number of epochs in the training part is set to 1000, the optimizer used is Adam, the initial learning rate is 0.0001, and the batch size is set to 8. The loss function used is divided into three parts: the mapping-matrix loss, the point prediction loss and the correspondence prediction loss, where the point prediction loss and the correspondence prediction loss are both binary cross-entropy losses.
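The stated training configuration (Adam optimizer, initial learning rate 0.0001, batch size 8, 1000 epochs) maps onto a standard PyTorch loop; the model, data and loss below are trivial stand-ins so that the skeleton runs, not the actual partial point cloud registration model.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 3)                                      # stand-in for the registration model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)    # Adam, initial learning rate 0.0001
num_epochs, batch_size = 1000, 8

for epoch in range(num_epochs):
    P = torch.rand(batch_size, 1024, 3)     # stand-in for a batch of preprocessed point clouds
    out = model(P)
    loss = out.pow(2).mean()                # stand-in for L = L1 + L21 + L22 + L3
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```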
(1) The mapping-matrix loss $L_1$ is:
$L_1 = -\frac{1}{N} \sum_{i=1}^{N} a_i \log M_{i j^*}, \qquad j^* = \underset{j}{\arg\min}\, \| R^* p_i + t^* - q_j \|_2$
where M is the mapping matrix; $a_i$ is a parameter that determines whether the current term is computed: $a_i = 1$ if the condition associated with point $p_i$ is satisfied and $a_i = 0$ otherwise; $R^*$ and $t^*$ are the true rotation matrix and translation vector, respectively; $p_i$ and $q_j$ are points of the source point cloud P and the target point cloud Q, respectively; $j^*$ denotes the index of the true corresponding point of $p_i$ in the target point cloud Q.
(2) The point prediction losses $L_{21}$ and $L_{22}$ are:
$L_{21} = -\frac{1}{N} \sum_{i=1}^{N} \big[ o_i^{P*} \log o_i^{P} + (1 - o_i^{P*}) \log (1 - o_i^{P}) \big]$
$L_{22} = -\frac{1}{N} \sum_{j=1}^{N} \big[ o_j^{Q*} \log o_j^{Q} + (1 - o_j^{Q*}) \log (1 - o_j^{Q}) \big]$
where N is the size of the point cloud; $o_i^{P}$ and $o_j^{Q}$ are the predicted overlap scores of point $p_i$ and point $q_j$, respectively; $o_i^{P*}$ is the true overlap score of point $p_i$: $o_i^{P*} = 1$ if $p_i$ lies in the overlapping region and $o_i^{P*} = 0$ otherwise; $o_j^{Q*}$ is the true overlap score of point $q_j$: $o_j^{Q*} = 1$ if $q_j$ lies in the overlapping region and $o_j^{Q*} = 0$ otherwise.
(3) The correspondence prediction loss $L_3$ is:
$L_3 = -\frac{1}{|C|} \sum_{i=1}^{|C|} \big[ w_i^{*} \log w_i + (1 - w_i^{*}) \log (1 - w_i) \big]$
where $w_i \in W$ is the predicted weight of the i-th correspondence and $w_i^{*}$ is the true weight of the i-th correspondence.
The final loss L is
$L = L_1 + L_{21} + L_{22} + L_3$
Testing and practical application:
1. Prepare the source point cloud and target point cloud to be registered and preprocess them: remove outliers, perform voxel downsampling and regularization.
2. Load the trained model used to estimate the transformation, denoted $f_\theta$, where θ represents the pre-trained parameters.
3. Input the preprocessed point clouds P and Q into the trained model to solve the transformation $(R, t) = f_\theta(P, Q)$.
4. The previous step finds the transformation (R, t) in the regularized space; it therefore needs to be mapped back to the non-normalized original space to obtain the final result.
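The test-time flow of steps 1 to 4 can be sketched as follows. The de-normalization below assumes that both point clouds are regularized with shared centering-and-scaling parameters, which is an illustrative choice rather than the formula used by the invention; the model interface f_theta(P, Q) is likewise assumed.

```python
import numpy as np

def normalize_pair(P, Q):
    """Assumed regularization: center and scale both point clouds with shared parameters."""
    both = np.concatenate([P, Q], axis=0)
    center = both.mean(axis=0)
    scale = np.linalg.norm(both - center, axis=1).max()
    return (P - center) / scale, (Q - center) / scale, center, scale

def denormalize_transform(R_e, t_e, center, scale):
    """Map (R_e, t_e) from the regularized space back to the original space
    (valid only under the shared centering-and-scaling assumption above)."""
    return R_e, center - R_e @ center + scale * t_e

# Usage sketch (f_theta is the trained model; its interface is assumed):
#   P_n, Q_n, c, s = normalize_pair(P_raw, Q_raw)
#   R_e, t_e = f_theta(P_n, Q_n)
#   R, t = denormalize_transform(R_e, t_e, c, s)
```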
As shown in FIG. 2, the registration result diagrams of this embodiment correspond, in the first three rows, to the results on unseen shapes, unseen categories, and unseen shapes with noise, respectively. "Input" denotes the input source point cloud and target point cloud. "GT" denotes the ground-truth registration result.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (10)

1. A partial point cloud registration method based on consistency learning, characterized by comprising the following steps:
S1: collecting a source point cloud P, a target point cloud Q and the true transformation from an actual scene, constructing a data set, preprocessing the data set, and dividing it into a training set and a test set;
S2: constructing a partial point cloud registration model comprising a point consistency learning module, a correspondence consistency learning module and a rigid transformation estimation module; the point consistency learning module first extracts the local feature of each point, then estimates the corresponding overlap score, and finally removes the points of the non-overlapping region according to the overlap score of each point;
S3: training the partial point cloud registration model with the training set and testing it with the test set;
S4: initializing the source point cloud and target point cloud to be registered, inputting them into the trained partial point cloud registration model, and outputting the rigid transformation between the source point cloud and the target point cloud.
2. The consistency learning-based partial point cloud registration method as claimed in claim 1, wherein: preprocessing the data set, including removing outliers in the point cloud, performing voxel downsampling and regularization.
3. The consistency learning-based partial point cloud registration method as claimed in claim 1, wherein: the specific calculation steps of the point consistency learning module are as follows:
A1: extracting local point features from each point $p_i \in P$ of the source point cloud and each point $q_j \in Q$ of the target point cloud using a graph neural network;
A2: predicting the overlap score of each point;
A3: points whose overlap scores rank below the top 70% of the point cloud size are regarded as points of the non-overlapping region, and the points of the non-overlapping region are removed.
4. The consistency learning-based partial point cloud registration method as claimed in claim 3, wherein the step A1 specifically comprises the following steps:
A11: for each point $p_i \in P$, first perform a k-nearest-neighbor search to obtain its k-neighbor point set $\mathcal{N}(p_i)$;
A12: the differences between the 3D coordinates of each point $p_i \in P$ and the 3D coordinates of its neighborhood points form the initial input feature:
$e_i = \{\, p_i^j - p_i \mid p_i^j \in \mathcal{N}(p_i) \,\}$
A13: a graph neural network $f_g(\cdot)$ maps $e_i$ to a high-dimensional feature space, and the resulting feature is denoted $F_i^P$:
$F_i^P = f_g(e_i)$
wherein the graph neural network $f_g(\cdot)$ is internally divided into four layers, each expressed as:
$F_i^{(l+1)} = \mathrm{MaxPooling}\big(\{\, \mathrm{MLP}([F_i^{(l)}, F_j^{(l)}]) \mid j \in \mathcal{N}(i) \,\}\big)$
where $F_i^{(l+1)}$ is the output feature and also the input feature of the next layer; MLP is a multi-layer perceptron; MaxPooling denotes a max-pooling operation; $\mathcal{N}(i)$ is the k-neighbor set of $F_i^{(l)}$; $[\cdot,\cdot]$ denotes concatenation along the channel direction of the features;
A14: the same method is applied to each point $q_j \in Q$ of the target point cloud, and the obtained feature is denoted $F_j^Q = f_g(e_j)$, where $e_j$ is formed from the differences between the 3D coordinates of each point of the target point cloud Q and the 3D coordinates of its neighborhood points;
A15: the local point features $F^P$ and $F^Q$ are enhanced with self-attention and cross-attention modules, and the enhanced features are denoted $C^P$ and $C^Q$.
5. The consistency learning-based partial point cloud registration method as claimed in claim 3, wherein the step A2 specifically comprises the following steps:
A21: a max-pooling operation is applied to the extracted $C^P$ and $C^Q$ to obtain the global features of the whole point clouds, $G^P = \mathrm{maxpooling}(C^P)$ and $G^Q = \mathrm{maxpooling}(C^Q)$;
A22: the global features of the two point clouds and the feature of each point are concatenated and input into a multi-layer perceptron to predict the overlap score of each point, giving $O^P$ and $O^Q$:
$O^P = \mathrm{MLP}([C^P, \delta(G^P), \delta(G^Q)]), \qquad O^Q = \mathrm{MLP}([C^Q, \delta(G^Q), \delta(G^P)])$
where MLP is a multi-layer perceptron whose output dimension is N × 1, N being the number of points in the point cloud; $\delta(\cdot)$ is the repeat operation along the feature channel direction; $[x, y, z]$ denotes concatenation of the features x, y and z in the channel dimension; $O^P$ and $O^Q$ denote the overlap-score sets of point cloud P and point cloud Q, respectively.
6. The consistency learning-based partial point cloud registration method as claimed in claim 1, wherein the specific calculation steps of the correspondence consistency learning module are as follows:
B1: define a mapping matrix M:
$M_{ij} = \frac{\exp(\langle c_i^P, c_j^Q \rangle)}{\sum_{k} \exp(\langle c_i^P, c_k^Q \rangle)}$
where $M_{ij}$ is the element in the i-th row and j-th column of the mapping matrix M; $c_i^P$ and $c_j^Q$ are the features of point $p_i$ and point $q_j$ after enhancement by the self-attention and cross-attention modules; $\exp(\langle c_i^P, c_j^Q \rangle)$ denotes the inner product of the feature vectors amplified by an exponential function and is taken as an intermediate variable in the calculation of $M_{ij}$;
for a point $p_r$ in the source point cloud P, its corresponding point $q_r$ in the target point cloud Q is determined from the r-th row of the mapping matrix M, thereby obtaining an initial correspondence set $C = \{(p_r, q_r)\}$;
B2: exploring local-to-global context information between the correspondences to identify wrong correspondences;
B3: removing the wrong correspondences.
7. The consistency learning-based partial point cloud registration method as claimed in claim 6, wherein the step B2 specifically comprises:
B21: first, compute the initial correspondence feature set $H_C = \{ h_r \}$, where $h_r$ denotes the initial feature of the correspondence $c_r = (p_r, q_r) \in C$; $[\cdot,\cdot]$ denotes concatenation of features in the channel dimension; the feature contains a distance-aware correspondence term computed from the distances between $c_r$ and its j-th neighboring correspondence, with $\|\cdot\|$ denoting the modulus (length) of a vector;
B22: input the initial correspondence features into a multi-scale attention module to explore local-to-global context information; the multi-scale attention module comprises three layers with identical operations, each layer operating as follows:
the feature $h_r$ is input into a neural network to explore local consistency, and the output feature is denoted $Y_r^{(1)}$;
the local feature $Y_r^{(1)}$ is input into an inner-product attention module, which outputs the feature $F_r^{(1)}$; the offset feature $OF_r^{(1)} = Y_r^{(1)} - F_r^{(1)}$ is then computed and input into a multi-layer perceptron to obtain the final feature of the layer;
B23: the features of the three layers are max-pooled, and all the obtained features are concatenated and input into a multi-layer perceptron to predict the weight of each correspondence, $W = \mathrm{MLP}([F^{(1)}, F^{(2)}, F^{(3)}])$; the MLP is a multi-layer perceptron whose final output dimension is |C| × 1, where |C| is the size of the correspondence set C.
8. The consistency learning-based partial point cloud registration method as claimed in claim 6, wherein: in the step B3, whether a correspondence is retained is determined according to the weight W; the final correspondence set is denoted $\hat{C}$ and its corresponding weights are denoted $\hat{W}$.
9. The consistency learning-based partial point cloud registration method as claimed in claim 1, wherein the calculation steps of the rigid transformation estimation module are as follows:
according to the obtained correspondence set $\hat{C}$ and weights $\hat{W}$, the final transformation is obtained by solving the following objective function:
$(R_e, t_e) = \underset{R, t}{\arg\min} \sum_{(p_i, q_i) \in \hat{C}} \hat{w}_i \, \| R\, p_i + t - q_i \|^2$
where $(R_e, t_e)$ are the finally solved transformation parameters, $R_e$ being the rotation matrix and $t_e$ the translation vector; $\hat{w}_i$ is the weight corresponding to the correspondence $(p_i, q_i)$;
the objective function is solved by a weighted singular value decomposition method.
10. The consistency learning-based partial point cloud registration method as claimed in claim 1, wherein: in the step S3, when the partial point cloud registration model is trained with the training set, the loss function used is divided into three parts: a mapping-matrix loss, a point prediction loss and a correspondence prediction loss, where the point prediction loss and the correspondence prediction loss are binary cross-entropy losses;
the mapping-matrix loss $L_1$ is:
$L_1 = -\frac{1}{N} \sum_{i=1}^{N} a_i \log M_{i j^*}, \qquad j^* = \underset{j}{\arg\min}\, \| R^* p_i + t^* - q_j \|_2$
where M is the mapping matrix; $a_i$ is a parameter that determines whether the current term is computed: $a_i = 1$ if the condition associated with point $p_i$ is satisfied and $a_i = 0$ otherwise; $R^*$ and $t^*$ are the true rotation matrix and translation vector, respectively; $p_i$ and $q_j$ are points of the source point cloud P and the target point cloud Q, respectively; $j^*$ denotes the index of the true corresponding point of $p_i$ in the target point cloud Q;
the point prediction losses $L_{21}$ and $L_{22}$ are:
$L_{21} = -\frac{1}{N} \sum_{i=1}^{N} \big[ o_i^{P*} \log o_i^{P} + (1 - o_i^{P*}) \log (1 - o_i^{P}) \big]$
$L_{22} = -\frac{1}{N} \sum_{j=1}^{N} \big[ o_j^{Q*} \log o_j^{Q} + (1 - o_j^{Q*}) \log (1 - o_j^{Q}) \big]$
where N is the size of the point cloud; $o_i^{P}$ and $o_j^{Q}$ are the predicted overlap scores of point $p_i$ and point $q_j$, respectively; $o_i^{P*}$ is the true overlap score of point $p_i$: $o_i^{P*} = 1$ if $p_i$ lies in the overlapping region and $o_i^{P*} = 0$ otherwise; $o_j^{Q*}$ is the true overlap score of point $q_j$: $o_j^{Q*} = 1$ if $q_j$ lies in the overlapping region and $o_j^{Q*} = 0$ otherwise;
the correspondence prediction loss $L_3$ is:
$L_3 = -\frac{1}{|C|} \sum_{i=1}^{|C|} \big[ w_i^{*} \log w_i + (1 - w_i^{*}) \log (1 - w_i) \big]$
where $w_i \in W$ is the predicted weight of the i-th correspondence and $w_i^{*}$ is the true weight of the i-th correspondence;
the final loss is
$L = L_1 + L_{21} + L_{22} + L_3$
where L denotes the final loss.
CN202310095301.6A 2023-02-10 2023-02-10 Partial point cloud registration method based on consistency learning Pending CN115994933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310095301.6A CN115994933A (en) 2023-02-10 2023-02-10 Partial point cloud registration method based on consistency learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310095301.6A CN115994933A (en) 2023-02-10 2023-02-10 Partial point cloud registration method based on consistency learning

Publications (1)

Publication Number Publication Date
CN115994933A true CN115994933A (en) 2023-04-21

Family

ID=85991840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310095301.6A Pending CN115994933A (en) 2023-02-10 2023-02-10 Partial point cloud registration method based on consistency learning

Country Status (1)

Country Link
CN (1) CN115994933A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523979A (en) * 2023-04-24 2023-08-01 北京长木谷医疗科技股份有限公司 Point cloud registration method and device based on deep learning and electronic equipment
CN116523979B (en) * 2023-04-24 2024-01-30 北京长木谷医疗科技股份有限公司 Point cloud registration method and device based on deep learning and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination