CN115170626A - Unsupervised method for robust point cloud registration based on deep features

Info

Publication number: CN115170626A
Authority: CN (China)
Application number: CN202210847039.1A
Other languages: Chinese (zh)
Inventors: 陈明, 韦升喜, 肖远辉, 田旭, 李祺峰, 吴冬柳
Current and original assignee: Guangxi Normal University
Application filed by Guangxi Normal University; priority to CN202210847039.1A
Prior art keywords: point cloud, point, size, tensor, feature
Legal status: Withdrawn

Classifications

    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images using feature-based methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]


Abstract

The invention discloses an unsupervised method for robust point cloud registration based on deep features, comprising the following steps: 1) acquiring point cloud data; 2) conversion; 3) feature extraction; 4) point cloud registration; 5) training. The method is an unsupervised network: it learns to extract deep features by combining global and local high-level features, trains the registration framework in an unsupervised manner, and requires no expensive computation of point correspondences, giving it substantial advantages in accuracy, robustness to initialization, and computational efficiency.

Description

An unsupervised method for robust point cloud registration based on deep features

Technical Field

The invention relates to the three-dimensional reconstruction and localization of objects, and in particular to an unsupervised method for robust point cloud registration based on deep features.

Background Art

Point cloud registration is the problem of estimating the rigid transformation that aligns two point clouds. It has many applications in fields such as autonomous driving, motion and pose estimation, 3D reconstruction, simultaneous localization and mapping (SLAM), and augmented reality.

Recently, several deep learning (DL) based methods have been proposed to handle large rotation angles (Y. Wang and J. M. Solomon, "Deep closest point: Learning representations for point cloud registration," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019, pp. 3523-3532; Y. Aoki, H. Goforth, R. A. Srivatsan, and S. Lucey, "PointNetLK: Robust & efficient point cloud registration using PointNet," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7163-7172; V. Sarode, X. Li, H. Goforth, Y. Aoki, R. A. Srivatsan, S. Lucey, and H. Choset, "PCRNet: Point cloud registration network using PointNet encoding," arXiv preprint arXiv:1908.07906, 2019; X. Huang, G. Mei, and J. Zhang, "Feature-metric registration: A fast semi-supervised approach for robust point cloud registration without correspondences," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11366-11374).

Roughly speaking, they fall into two categories: supervised methods, which rely on ground-truth correspondences or class labels, and unsupervised methods. Deep Closest Point (DCP) computes the rigid transformation by singular value decomposition (SVD), with correspondences constructed by learning a soft matching map. PointNetLK uses a classical alignment technique, the Lucas-Kanade (LK) algorithm, to align PointNet features, yielding good generalization to shapes not seen during training. However, these methods rely on large amounts of registration label data, which makes them impractical, since producing 3D registration labels is highly labor-intensive. By contrast, achieving registration from unlabeled point cloud data without ground-truth correspondences is a significant challenge. PCRNet alleviates the pose bias seen in PointNetLK by replacing the Lucas-Kanade module with a multilayer perceptron, recovering the transformation parameters directly from the concatenated global descriptors of the source and target point clouds. FMR-Net implements an unsupervised framework via an encoder-decoder task, achieving registration by minimizing a feature-metric projection error. Although these methods demonstrate the advantages of unsupervised learning, they rely mainly on globally represented deep features and ignore locally represented ones; the deep features of the point cloud are therefore not fully exploited for registration, and the registration quality remains imperfect.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the prior art by providing an unsupervised method for robust point cloud registration based on deep features. The method is an unsupervised network that learns to extract deep features by combining global and local high-level features and trains the registration framework in an unsupervised manner; it requires no expensive computation of point correspondences and offers substantial advantages in accuracy, robustness to initialization, and computational efficiency.

The technical scheme for realizing the object of the present invention is as follows:

An unsupervised method for robust point cloud registration based on deep features, comprising the following steps:

1) Acquiring point cloud data: obtain the point cloud data to be registered and uniformly sample 1024 points on the point cloud surface;

2) Conversion: convert the sampled data of the two point clouds into tensors of size 1024×3 and feed them into the deep learning framework (a minimal sketch of these two steps is given below);
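As a concrete reading of steps 1) and 2), the following is a minimal sketch assuming Open3D for mesh loading and uniform surface sampling and PyTorch as the deep learning framework; the file names are hypothetical:

```python
# Minimal sketch of steps 1)-2): uniform surface sampling of 1024 points,
# then conversion to a 1024x3 tensor. Open3D and PyTorch are assumed.
import numpy as np
import open3d as o3d
import torch

def load_as_tensor(path: str, n_points: int = 1024) -> torch.Tensor:
    mesh = o3d.io.read_triangle_mesh(path)
    pcd = mesh.sample_points_uniformly(number_of_points=n_points)  # uniform sampling on the surface
    pts = np.asarray(pcd.points, dtype=np.float32)                 # shape (1024, 3)
    return torch.from_numpy(pts)                                   # tensor of size 1024x3

source = load_as_tensor("source_model.off")      # hypothetical file names
template = load_as_tensor("template_model.off")
```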

3) Feature extraction: the encoder module of the deep learning framework extracts deep point cloud features from the input tensor and finally outputs a one-dimensional tensor representing those features. The specific process is as follows:

3-1) The input 1024×3 tensor passes through an EdgeConv module, which takes each point as a center point, characterizes the edge features between it and each of its neighbors, and then aggregates these features to obtain a new representation of the point; that is, local features of the point cloud are obtained by constructing a neighborhood around each vertex. The specific steps are:

$h_\theta(x_i, x_j) = h_\theta(x_i, x_j - x_i)$  (1),

where $x_i$ belongs to the vertex set $X = \{x_1, \ldots, x_n\} \subseteq \mathbb{R}^F$, with F denoting the dimension of the feature space of the points output by a given layer of the neural network; the result is then fed into a perceptron to obtain the edge features:

$e'_{ijm} = \mathrm{ReLU}\big(\theta_m \cdot (x_j - x_i) + \phi_m \cdot x_i\big)$  (2),

where:

$\Theta = (\theta_1, \ldots, \theta_M, \phi_1, \ldots, \phi_M)$  (3),

Θ being learnable parameters; the features of the adjacent edges are then aggregated:

$x'_i = \mathop{\square}_{j:(i,j)\in\Omega} h_\theta(x_i, x_j)$  (4),

where $\square$ denotes the aggregation operation and Ω denotes the set of point pairs forming the adjacent edges centered on the point $x_i$; the specific aggregation operation is a channel-wise maximum over the neighborhood:

$x'_{im} = \max_{j:(i,j)\in\Omega} e'_{ijm}$  (5),

Finally, the EdgeConv module outputs a tensor of size 6×1024×20, which is fed into the next convolutional layer (a sketch of this construction is given below);
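The sketch below illustrates how the EdgeConv input of step 3-1) can be constructed, assuming k = 20 neighbors (consistent with the stated 6×1024×20 output) and PyTorch tensors; the k-nearest-neighbor search uses the standard squared-distance expansion:

```python
# Sketch of the EdgeConv input of step 3-1): for each point x_i, gather its
# k nearest neighbors x_j and stack (x_i, x_j - x_i) as a 6-channel feature.
import torch

def knn(x: torch.Tensor, k: int = 20) -> torch.Tensor:
    # x: (3, N); pairwise squared distances via the usual expansion
    inner = -2 * torch.matmul(x.t(), x)             # (N, N)
    xx = torch.sum(x ** 2, dim=0, keepdim=True)     # (1, N)
    dist = -xx - inner - xx.t()                     # negative squared distance
    return dist.topk(k=k, dim=-1)[1]                # (N, k) nearest-neighbor indices

def edge_features(x: torch.Tensor, k: int = 20) -> torch.Tensor:
    # Builds h_theta's input (x_i, x_j - x_i) for every edge of the kNN graph.
    idx = knn(x, k)                                 # (N, k)
    neighbors = x.t()[idx]                          # (N, k, 3): the x_j
    center = x.t().unsqueeze(1).expand(-1, k, -1)   # (N, k, 3): the x_i
    feat = torch.cat([center, neighbors - center], dim=2)  # (N, k, 6)
    return feat.permute(2, 0, 1)                    # (6, N, k) = 6x1024x20

points = torch.rand(3, 1024)
print(edge_features(points).shape)                  # torch.Size([6, 1024, 20])
```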

3-2) Five convolutions are applied to the tensor of local features; the first convolution, a layer of 64 two-dimensional 1×1 kernels, outputs a tensor of size 64×1024×20;

3-3) Inspired by the marked improvement that attention mechanisms bring to deep learning, the output of the first layer is fed into a CBAM attention module, which weights the importance of each channel and spatial location; the result then passes in the same way through convolutional layers of sizes 64, 128, and 256;

3-4) The output tensors of the first four layers are concatenated and reshaped to 512×1024×1, then convolved with 512 two-dimensional 1×1 kernels and reshaped to obtain a 512×1024 tensor; a flatten operation reduces this to an output tensor of size 512, yielding a deep feature that contains both local and global descriptors (a condensed encoder sketch is given below);
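A condensed sketch of the encoder of steps 3-2) to 3-4) follows. The CBAM block is stubbed with nn.Identity() (a reference CBAM implementation would be plugged in there), and the two pooling steps, a max over the 20 neighbors before the final convolution and a global max over the 1024 points before flattening, are assumptions introduced to reconcile the stated tensor sizes 512×1024×1, 512×1024, and 512:

```python
# Sketch of the encoder: five 1x1 2D convolutions over the edge features,
# CBAM attention after the first layer, concatenation of the first four
# outputs (64+64+128+256 = 512 channels), and flattening to a 512-dim feature.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, cbam: nn.Module = None):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(6, 64, 1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(64, 64, 1), nn.ReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(64, 128, 1), nn.ReLU())
        self.conv4 = nn.Sequential(nn.Conv2d(128, 256, 1), nn.ReLU())
        self.conv5 = nn.Conv2d(512, 512, 1)          # 512 channels in after concatenation
        self.cbam = cbam or nn.Identity()            # placeholder for the CBAM module

    def forward(self, edge_feat: torch.Tensor) -> torch.Tensor:
        # edge_feat: (B, 6, 1024, 20)
        x1 = self.cbam(self.conv1(edge_feat))        # (B, 64, 1024, 20)
        x2 = self.conv2(x1)                          # (B, 64, 1024, 20)
        x3 = self.conv3(x2)                          # (B, 128, 1024, 20)
        x4 = self.conv4(x3)                          # (B, 256, 1024, 20)
        x = torch.cat([x1, x2, x3, x4], dim=1)       # (B, 512, 1024, 20)
        x = x.max(dim=-1, keepdim=True)[0]           # assumed neighbor pooling -> (B, 512, 1024, 1)
        x = self.conv5(x).squeeze(-1)                # (B, 512, 1024)
        x = x.max(dim=-1)[0]                         # assumed global max-pool -> (B, 512)
        return torch.flatten(x, start_dim=1)         # 512-dim deep feature
```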

4) Point cloud registration: registering the two point clouds amounts to finding a rigid transformation matrix G ∈ SE(3), composed of a rotation matrix R ∈ SO(3) and a translation vector t ∈ R³, by minimizing the objective function F(G(R, t)):

$F(G(R,t)) = \big\| \psi(P_T) - \psi(G \cdot P_S) \big\|_2^2$  (6),

where $\psi: \mathbb{R}^{3 \times N} \rightarrow \mathbb{R}^K$ is the feature extraction function learned by the encoder and K is the feature dimension;

4-1) G(R, t) is regarded as an element of a special Lie group and is expressed via the exponential map as follows:

$G = \exp\Big(\sum_{i=1}^{6} \varepsilon_i T_i\Big), \quad \varepsilon = (\varepsilon_1, \ldots, \varepsilon_6)^{\mathrm{T}}$  (7),

where the $T_i$ are the generators of the exponential map and ε is a Lie algebra element that is mapped onto the transformation matrix G through the exponential;

4-2) The point cloud registration problem is described as $\phi(P_T) = \phi(G \cdot P_S)$; solving for the matrix G means obtaining the derivative of G by differentiating with respect to the Lie algebra ε, so that the optimal transformation matrix G is optimized indirectly, and the feature-space projection error of the two point clouds is minimized, by adjusting ε directly. Inspired by the inverse compositional (IC) formulation, $\phi(P_T) = \phi(G \cdot P_S)$ is first inverse-transformed and its right-hand side linearized:

$\phi(P_S) = \phi(G^{-1} \cdot P_T) \approx \phi(P_T) + J\,\varepsilon$  (8),

4-3) The transformation estimate is computed iteratively in increments ε, estimating ε at each step by running the inverse compositional (IC) algorithm:

$\varepsilon = (J^{\mathrm{T}} J)^{-1} J^{\mathrm{T}} \delta$  (9),

where $\delta = \phi(P_S) - \phi(P_T)$ is the feature-space projection error of the two point clouds and $J = \partial\delta / \partial\varepsilon$ is the Jacobian matrix of the projection error δ with respect to the transformation parameters ε;

4-4) To compute the Jacobian efficiently, a finite-difference gradient is used in place of the conventional stochastic gradient:

$J_i = \dfrac{\phi\big(\exp(-t_i T_i) \cdot P_T\big) - \phi(P_T)}{t_i}$  (10),

where $t_i$ is an infinitesimal perturbation of the transformation parameters during the computation; better results are obtained when $t_i$ is set to a fixed value. Three angular parameters are set for rotation and three perturbation parameters for translation, and $t_i$ is set to 2×10⁻²;

4-5) By iterating formula (9), the transformation matrix is optimized so as to shorten the projected distance in feature space between the source point cloud and the template point cloud:

$\Delta G \cdot P_S \rightarrow P_S$  (11),

where $\Delta G = \exp\big(\sum_i \varepsilon_i T_i\big)$ is the transformation matrix obtained from the parameters ε computed by formula (9); after many iterations, $P_S$ is repeatedly transformed into registration with $P_T$, and the best transformation matrix $G_{est}$ is finally output (see the sketch after equation (12)):

$G_{est} = \Delta G_n \cdot \ldots \cdot \Delta G_1 \cdot \Delta G_0$  (12),
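The loop of steps 4-3) to 4-5) can be sketched as follows; `encode` stands for the learned feature extractor φ (returning a K-dimensional vector) and `exp_se3` for the SE(3) exponential map of equation (7), both assumed to be provided, with $t_i$ fixed at 2×10⁻² as stated above:

```python
# Sketch of the inverse compositional registration loop of steps 4-3) to 4-5).
import torch

T_STEP = 2e-2  # fixed perturbation t_i for the finite-difference Jacobian

def transform(G: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    # Apply a 4x4 rigid transform to an (N, 3) point set.
    return pts @ G[:3, :3].t() + G[:3, 3]

def jacobian(encode, exp_se3, template: torch.Tensor) -> torch.Tensor:
    # Equation (10): finite differences over the six twist parameters
    # (three rotations, three translations).
    f0 = encode(template)                              # (K,)
    cols = []
    for i in range(6):
        t = torch.zeros(6)
        t[i] = T_STEP
        warped = transform(exp_se3(-t), template)      # perturbed template
        cols.append((encode(warped) - f0) / T_STEP)
    return torch.stack(cols, dim=1)                    # (K, 6)

def register(encode, exp_se3, src, tmpl, n_iter=10):
    G_est = torch.eye(4)
    J = jacobian(encode, exp_se3, tmpl)
    pinv = torch.linalg.pinv(J)                        # equals (J^T J)^-1 J^T for full-rank J
    for _ in range(n_iter):
        delta = encode(src) - encode(tmpl)             # feature-space projection error
        eps = pinv @ delta                             # equation (9)
        dG = exp_se3(eps)
        src = transform(dG, src)                       # equation (11)
        G_est = dG @ G_est                             # accumulates equation (12)
    return G_est
```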

5) Training: to allow the deep learning framework to be trained in an unsupervised manner, a decoder consisting of four fully connected layers is introduced; ReLU is chosen as the activation function of the first three layers and tanh as that of the fourth;

5-1) In the unsupervised setting, model training uses the alignment error between the transformed source point cloud and the target point cloud rather than a ground-truth transformation, with a robust loss function:

[Equation (13): the robust loss function; it is rendered as an image in the original,]

where $P_T \in \Theta$ is a set of points sampled from the unit square $[0,1]^2$, x is the point cloud feature, $\theta_i$ is the i-th component of the convolutional layer parameters, and Φ is the original input 3D points.

This method offers high accuracy and computational efficiency, strong robustness to initialization, and requires no expensive computation of point correspondences.

Description of the Drawings

Figure 1 is a schematic diagram of the point cloud registration network architecture in the embodiment;

Figure 2 is a schematic and result view of the three-dimensional point cloud registration of a bottle in the embodiment;

Figure 3 is a schematic and result view of the three-dimensional point cloud registration of a toilet in the embodiment;

Figure 4 is a schematic and result view of the three-dimensional point cloud registration of a noisy mobile phone in the embodiment;

Figure 5 is a schematic and result view of the three-dimensional point cloud registration of a real indoor scene in the embodiment;

Figure 6 is a schematic and result view of the point cloud registration of the Armadillo in Stanford 3DMatch in the embodiment.

Detailed Description of the Embodiments

The invention is described in further detail below with reference to the accompanying drawings and a specific embodiment, which are not intended to limit the invention.

Embodiment:

Referring to Figure 1, the embodiment carries out the unsupervised method for robust point cloud registration based on deep features with steps 1) to 5), including equations (1)-(13), exactly as set forth in the Summary of the Invention above. Two details specific to the embodiment deserve mention: unlike other unsupervised methods that rely solely on global descriptors, step 3-1) continues to attend to local features on top of the unsupervised training scheme; and the numbers of layers of the decoder in step 5) are as shown in Figure 1.

To demonstrate the effectiveness of the deep learning framework, the widely used ModelNet40 is adopted for pre-training. ModelNet40 is a dataset of 12,311 CAD models covering 40 object categories; the framework is trained on the first 20 categories of ModelNet40 and tested on the remaining 20. During training and testing, a ground-truth transformation matrix G_gt is obtained by randomly generating a rigid transformation G, and the target point cloud is generated by applying G to the source point cloud. An axis is chosen arbitrarily; the rotation component of G is initialized in the range [0, 45] degrees and the translation component in the range [-0.5, 0.5]. The mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) between the rotation and translation components of the ground-truth matrix G_gt and the predicted matrix G_est serve as evaluation metrics; smaller values indicate higher registration accuracy. Angles are measured in degrees and registration run time in seconds. The results are shown in the following table:

[Table: comparison of registration errors (MSE, RMSE, MAE of rotation and translation) and run times; rendered as an image in the original.]
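As a concrete reading of this evaluation protocol, the sketch below computes the MSE, RMSE, and MAE of the rotation (in degrees) and translation components between G_gt and G_est; the Euler "xyz" decomposition and the use of SciPy are assumptions:

```python
# Sketch of the evaluation metrics: MSE, RMSE, and MAE over rotation angles
# (degrees) and translation components of 4x4 pose matrices.
import numpy as np
from scipy.spatial.transform import Rotation

def pose_errors(G_est: np.ndarray, G_gt: np.ndarray) -> dict:
    ang_est = Rotation.from_matrix(G_est[:3, :3]).as_euler("xyz", degrees=True)
    ang_gt = Rotation.from_matrix(G_gt[:3, :3]).as_euler("xyz", degrees=True)
    out = {}
    for name, err in (("rot", ang_est - ang_gt), ("trans", G_est[:3, 3] - G_gt[:3, 3])):
        mse = float(np.mean(err ** 2))
        out[name] = {"MSE": mse, "RMSE": mse ** 0.5, "MAE": float(np.mean(np.abs(err)))}
    return out
```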

Specific examples are shown in Figures 2-6.

Claims (1)

1. An unsupervised method for robust point cloud registration based on deep features, characterized in that it comprises steps 1) to 5) as set forth above: 1) acquiring the point cloud data to be registered and uniformly sampling 1024 points on the point cloud surface; 2) converting the sampled data of the two point clouds into 1024×3 tensors and feeding them into the deep learning framework; 3) extracting deep point cloud features with the encoder, i.e., EdgeConv local features per equations (1)-(5), five 1×1 convolutional layers with CBAM attention, and concatenation and flattening into a 512-dimensional deep feature combining local and global descriptors; 4) registering the two point clouds by minimizing the feature-space projection error per equations (6)-(12), iterating the inverse compositional update with a finite-difference Jacobian (t_i = 2×10⁻²) to output the best transformation matrix G_est; 5) training the framework in an unsupervised manner with a decoder of four fully connected layers (ReLU on the first three, tanh on the fourth) and the robust alignment loss of equation (13).
CN202210847039.1A 2022-07-07 2022-07-07 Unsupervised method for robust point cloud registration based on deep features Withdrawn CN115170626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210847039.1A CN115170626A (en) 2022-07-07 Unsupervised method for robust point cloud registration based on deep features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210847039.1A CN115170626A (en) 2022-07-07 Unsupervised method for robust point cloud registration based on deep features

Publications (1)

Publication Number Publication Date
CN115170626A true CN115170626A (en) 2022-10-11

Family

ID=83494865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210847039.1A Withdrawn CN115170626A (en) 2022-07-07 2022-07-07 Unsupervised method for robust point cloud registration based on depth features

Country Status (1)

Country Link
CN (1) CN115170626A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188543A (en) * 2022-12-27 2023-05-30 中国人民解放军61363部队 Point cloud registration method and system based on deep learning unsupervised
CN116188543B (en) * 2022-12-27 2024-03-12 中国人民解放军61363部队 Point cloud registration method and system based on deep learning unsupervised

Similar Documents

Publication Publication Date Title
Wang et al. Category-level 6d object pose estimation via cascaded relation and recurrent reconstruction networks
Ranftl et al. Deep fundamental matrix estimation
Kothari et al. Trumpets: Injective flows for inference and inverse problems
CN106096561B (en) Infrared pedestrian detection method based on image block deep learning features
CN107229757B (en) Video retrieval method based on deep learning and hash coding
CN103295196B (en) Based on the image super-resolution rebuilding method of non local dictionary learning and biregular item
CN112364730B (en) Method and system for automatic classification of hyperspectral objects based on sparse subspace clustering
CN107092859A (en) A kind of depth characteristic extracting method of threedimensional model
Vidanapathirana et al. Spectral geometric verification: Re-ranking point cloud retrieval for metric localization
CN103295197A (en) Image super-resolution rebuilding method based on dictionary learning and bilateral holomorphy
CN106053988A (en) Inverter fault diagnosis system and method based on intelligent analysis
CN107203747B (en) Sparse combined model target tracking method based on self-adaptive selection mechanism
CN112488128A (en) Bezier curve-based detection method for any distorted image line segment
CN102651072A (en) Classification method for three-dimensional human motion data
CN115170626A (en) Unsupervised method for robust point cloud registration based on depth features
Mei Point cloud registration with self-supervised feature learning and beam search
Chen et al. An improved iterative closest point algorithm for rigid point registration
CN110852189A (en) A low-complexity dense crowd analysis method based on deep learning
Liu et al. Loop closure detection based on improved hybrid deep learning architecture
Zhang et al. Dyna-depthformer: Multi-frame transformer for self-supervised depth estimation in dynamic scenes
CN113884025A (en) Additive manufacturing structured light loopback detection method, device, electronic device and storage medium
CN111951287A (en) Two-dimensional code detection and recognition method
Yang et al. 3dpmnet: Plane segmentation and matching for point cloud registration
CN112509018A (en) Quaternion space optimized three-dimensional image registration method
Zhang et al. Vision-based UAV obstacle avoidance algorithm on the embedded platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20221011)