CN115170626A - Unsupervised method for robust point cloud registration based on deep features

Info

Publication number: CN115170626A
Authority: CN (China)
Application number: CN202210847039.1A
Other languages: Chinese (zh)
Inventors: 陈明, 韦升喜, 肖远辉, 田旭, 李祺峰, 吴冬柳
Current and original assignee: Guangxi Normal University
Application filed by Guangxi Normal University; priority to CN202210847039.1A
Prior art keywords: point cloud, point, size, tensor, feature
Legal status: Withdrawn

Classifications

    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images using feature-based methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]


Abstract

The invention discloses an unsupervised method for robust point cloud registration based on deep features, comprising the following steps: 1) acquiring point cloud data; 2) conversion; 3) feature extraction; 4) point cloud registration; 5) training. The method is an unsupervised network: it learns to extract deep features by combining global and local high-level features, trains the registration framework in an unsupervised manner, and requires no expensive computation of point correspondences, giving it substantial advantages in accuracy, robustness to initialization, and computational efficiency.

Description

An unsupervised method for robust point cloud registration based on deep features

Technical Field

The invention relates to the three-dimensional reconstruction and localization of objects, and in particular to an unsupervised method for robust point cloud registration based on deep features.

Background Art

Point cloud registration is the problem of estimating the rigid transformation that aligns two point clouds. It has many applications in fields such as autonomous driving, motion and pose estimation, 3D reconstruction, simultaneous localization and mapping (SLAM), and augmented reality.

Recently, several deep learning (DL) based methods have been proposed to handle large rotation angles (Y. Wang and J. M. Solomon, "Deep closest point: Learning representations for point cloud registration," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019, pp. 3523-3532; Y. Aoki, H. Goforth, R. A. Srivatsan, and S. Lucey, "PointNetLK: Robust & efficient point cloud registration using PointNet," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7163-7172; V. Sarode, X. Li, H. Goforth, Y. Aoki, R. A. Srivatsan, S. Lucey, and H. Choset, "PCRNet: Point cloud registration network using PointNet encoding," arXiv preprint arXiv:1908.07906, 2019; X. Huang, G. Mei, and J. Zhang, "Feature-metric registration: A fast semi-supervised approach for robust point cloud registration without correspondences," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 11366-11374).

Roughly speaking, they fall into two categories: supervised methods, which rely on ground-truth correspondences or class labels, and unsupervised methods. Deep Closest Point (DCP) computes the rigid transformation by singular value decomposition (SVD), with correspondences constructed by learning a soft matching map. PointNetLK uses a classical alignment technique, the Lucas-Kanade (LK) algorithm, to align PointNet features, yielding good generalization to shapes not seen during training. However, these methods rely on large amounts of registration label data, which makes them impractical, since producing 3D registration labels is highly labor-intensive. By contrast, achieving registration from unlabeled point cloud data without ground-truth correspondences is a significant challenge. PCRNet alleviates the pose bias seen in PointNetLK by replacing the Lucas-Kanade module with a multilayer perceptron, recovering the transformation parameters directly from the concatenated global descriptors of the source and target point clouds. FMR-Net implements an unsupervised framework via an encoder-decoder task, achieving registration by minimizing a feature-metric projection error. Although these methods demonstrate the advantages of unsupervised learning, they rely mainly on globally represented deep features and ignore locally represented ones; the deep features of the point cloud are therefore not fully exploited for registration, and the registration quality remains imperfect.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the prior art by providing an unsupervised method for robust point cloud registration based on deep features. The method is an unsupervised network that learns to extract deep features by combining global and local high-level features and trains the registration framework in an unsupervised manner; it requires no expensive computation of point correspondences and offers substantial advantages in accuracy, robustness to initialization, and computational efficiency.

The technical scheme for realizing the object of the present invention is as follows:

An unsupervised method for robust point cloud registration based on deep features, comprising the following steps:

1) Acquiring point cloud data: obtain the point cloud data to be registered and uniformly sample 1024 points on the point cloud surface;

2) Conversion: convert the sampled data of the two point clouds into tensors of size 1024×3 and feed them into the deep learning framework (a minimal sketch of these two steps is given below);
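As a concrete reading of steps 1) and 2), the following is a minimal sketch assuming Open3D for mesh loading and uniform surface sampling and PyTorch as the deep learning framework; the file names are hypothetical:

```python
# Minimal sketch of steps 1)-2): uniform surface sampling of 1024 points,
# then conversion to a 1024x3 tensor. Open3D and PyTorch are assumed.
import numpy as np
import open3d as o3d
import torch

def load_as_tensor(path: str, n_points: int = 1024) -> torch.Tensor:
    mesh = o3d.io.read_triangle_mesh(path)
    pcd = mesh.sample_points_uniformly(number_of_points=n_points)  # uniform sampling on the surface
    pts = np.asarray(pcd.points, dtype=np.float32)                 # shape (1024, 3)
    return torch.from_numpy(pts)                                   # tensor of size 1024x3

source = load_as_tensor("source_model.off")      # hypothetical file names
template = load_as_tensor("template_model.off")
```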

3) Feature extraction: the encoder module of the deep learning framework extracts deep point cloud features from the input tensor and finally outputs a one-dimensional tensor representing those features. The specific process is as follows:

3-1) The input 1024×3 tensor passes through an EdgeConv module, which takes each point as a center point, characterizes the edge features between it and each of its neighbors, and then aggregates these features to obtain a new representation of the point; that is, local features of the point cloud are obtained by constructing a neighborhood around each vertex. The specific steps are:

$h_\theta(x_i, x_j) = h_\theta(x_i, x_j - x_i)$  (1),

where $x_i$ belongs to the vertex set $X = \{x_1, \ldots, x_n\} \subseteq \mathbb{R}^F$, with F denoting the dimension of the feature space of the points output by a given layer of the neural network; the result is then fed into a perceptron to obtain the edge features:

$e'_{ijm} = \mathrm{ReLU}\big(\theta_m \cdot (x_j - x_i) + \phi_m \cdot x_i\big)$  (2),

where:

$\Theta = (\theta_1, \ldots, \theta_M, \phi_1, \ldots, \phi_M)$  (3),

Θ being learnable parameters; the features of the adjacent edges are then aggregated:

$x'_i = \mathop{\square}_{j:(i,j)\in\Omega} h_\theta(x_i, x_j)$  (4),

where $\square$ denotes the aggregation operation and Ω denotes the set of point pairs forming the adjacent edges centered on the point $x_i$; the specific aggregation operation is a channel-wise maximum over the neighborhood:

$x'_{im} = \max_{j:(i,j)\in\Omega} e'_{ijm}$  (5),

Finally, the EdgeConv module outputs a tensor of size 6×1024×20, which is fed into the next convolutional layer (a sketch of this construction is given below);
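The sketch below illustrates how the EdgeConv input of step 3-1) can be constructed, assuming k = 20 neighbors (consistent with the stated 6×1024×20 output) and PyTorch tensors; the k-nearest-neighbor search uses the standard squared-distance expansion:

```python
# Sketch of the EdgeConv input of step 3-1): for each point x_i, gather its
# k nearest neighbors x_j and stack (x_i, x_j - x_i) as a 6-channel feature.
import torch

def knn(x: torch.Tensor, k: int = 20) -> torch.Tensor:
    # x: (3, N); pairwise squared distances via the usual expansion
    inner = -2 * torch.matmul(x.t(), x)             # (N, N)
    xx = torch.sum(x ** 2, dim=0, keepdim=True)     # (1, N)
    dist = -xx - inner - xx.t()                     # negative squared distance
    return dist.topk(k=k, dim=-1)[1]                # (N, k) nearest-neighbor indices

def edge_features(x: torch.Tensor, k: int = 20) -> torch.Tensor:
    # Builds h_theta's input (x_i, x_j - x_i) for every edge of the kNN graph.
    idx = knn(x, k)                                 # (N, k)
    neighbors = x.t()[idx]                          # (N, k, 3): the x_j
    center = x.t().unsqueeze(1).expand(-1, k, -1)   # (N, k, 3): the x_i
    feat = torch.cat([center, neighbors - center], dim=2)  # (N, k, 6)
    return feat.permute(2, 0, 1)                    # (6, N, k) = 6x1024x20

points = torch.rand(3, 1024)
print(edge_features(points).shape)                  # torch.Size([6, 1024, 20])
```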

3-2) Five convolutions are applied to the tensor of local features; the first convolution, a layer of 64 two-dimensional 1×1 kernels, outputs a tensor of size 64×1024×20;

3-3) Inspired by the marked improvement that attention mechanisms bring to deep learning, the output of the first layer is fed into a CBAM attention module, which weights the importance of each channel and spatial location; the result then passes in the same way through convolutional layers of sizes 64, 128, and 256;

3-4) The output tensors of the first four layers are concatenated and reshaped to 512×1024×1, then convolved with 512 two-dimensional 1×1 kernels and reshaped to obtain a 512×1024 tensor; a flatten operation reduces this to an output tensor of size 512, yielding a deep feature that contains both local and global descriptors (a condensed encoder sketch is given below);
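A condensed sketch of the encoder of steps 3-2) to 3-4) follows. The CBAM block is stubbed with nn.Identity() (a reference CBAM implementation would be plugged in there), and the two pooling steps, a max over the 20 neighbors before the final convolution and a global max over the 1024 points before flattening, are assumptions introduced to reconcile the stated tensor sizes 512×1024×1, 512×1024, and 512:

```python
# Sketch of the encoder: five 1x1 2D convolutions over the edge features,
# CBAM attention after the first layer, concatenation of the first four
# outputs (64+64+128+256 = 512 channels), and flattening to a 512-dim feature.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, cbam: nn.Module = None):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(6, 64, 1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(64, 64, 1), nn.ReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(64, 128, 1), nn.ReLU())
        self.conv4 = nn.Sequential(nn.Conv2d(128, 256, 1), nn.ReLU())
        self.conv5 = nn.Conv2d(512, 512, 1)          # 512 channels in after concatenation
        self.cbam = cbam or nn.Identity()            # placeholder for the CBAM module

    def forward(self, edge_feat: torch.Tensor) -> torch.Tensor:
        # edge_feat: (B, 6, 1024, 20)
        x1 = self.cbam(self.conv1(edge_feat))        # (B, 64, 1024, 20)
        x2 = self.conv2(x1)                          # (B, 64, 1024, 20)
        x3 = self.conv3(x2)                          # (B, 128, 1024, 20)
        x4 = self.conv4(x3)                          # (B, 256, 1024, 20)
        x = torch.cat([x1, x2, x3, x4], dim=1)       # (B, 512, 1024, 20)
        x = x.max(dim=-1, keepdim=True)[0]           # assumed neighbor pooling -> (B, 512, 1024, 1)
        x = self.conv5(x).squeeze(-1)                # (B, 512, 1024)
        x = x.max(dim=-1)[0]                         # assumed global max-pool -> (B, 512)
        return torch.flatten(x, start_dim=1)         # 512-dim deep feature
```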

4) Point cloud registration: registering the two point clouds amounts to finding a rigid transformation matrix G ∈ SE(3), composed of a rotation matrix R ∈ SO(3) and a translation vector t ∈ R³, by minimizing the objective function F(G(R, t)):

$F(G(R,t)) = \big\| \psi(P_T) - \psi(G \cdot P_S) \big\|_2^2$  (6),

where $\psi: \mathbb{R}^{3 \times N} \rightarrow \mathbb{R}^K$ is the feature extraction function learned by the encoder and K is the feature dimension;

4-1) G(R, t) is regarded as an element of a special Lie group and is expressed via the exponential map as follows:

$G = \exp\Big(\sum_{i=1}^{6} \varepsilon_i T_i\Big), \quad \varepsilon = (\varepsilon_1, \ldots, \varepsilon_6)^{\mathrm{T}}$  (7),

where the $T_i$ are the generators of the exponential map and ε is a Lie algebra element that is mapped onto the transformation matrix G through the exponential;

4-2) The point cloud registration problem is described as $\phi(P_T) = \phi(G \cdot P_S)$; solving for the matrix G means obtaining the derivative of G by differentiating with respect to the Lie algebra ε, so that the optimal transformation matrix G is optimized indirectly, and the feature-space projection error of the two point clouds is minimized, by adjusting ε directly. Inspired by the inverse compositional (IC) formulation, $\phi(P_T) = \phi(G \cdot P_S)$ is first inverse-transformed and its right-hand side linearized:

$\phi(P_S) = \phi(G^{-1} \cdot P_T) \approx \phi(P_T) + J\,\varepsilon$  (8),

4-3) The transformation estimate is computed iteratively in increments ε, estimating ε at each step by running the inverse compositional (IC) algorithm:

$\varepsilon = (J^{\mathrm{T}} J)^{-1} J^{\mathrm{T}} \delta$  (9),

where $\delta = \phi(P_S) - \phi(P_T)$ is the feature-space projection error of the two point clouds and $J = \partial\delta / \partial\varepsilon$ is the Jacobian matrix of the projection error δ with respect to the transformation parameters ε;

4-4) To compute the Jacobian efficiently, a finite-difference gradient is used in place of the conventional stochastic gradient:

$J_i = \dfrac{\phi\big(\exp(-t_i T_i) \cdot P_T\big) - \phi(P_T)}{t_i}$  (10),

where $t_i$ is an infinitesimal perturbation of the transformation parameters during the computation; better results are obtained when $t_i$ is set to a fixed value. Three angular parameters are set for rotation and three perturbation parameters for translation, and $t_i$ is set to 2×10⁻²;

4-5) By iterating formula (9), the transformation matrix is optimized so as to shorten the projected distance in feature space between the source point cloud and the template point cloud:

$\Delta G \cdot P_S \rightarrow P_S$  (11),

where $\Delta G = \exp\big(\sum_i \varepsilon_i T_i\big)$ is the transformation matrix obtained from the parameters ε computed by formula (9); after many iterations, $P_S$ is repeatedly transformed into registration with $P_T$, and the best transformation matrix $G_{est}$ is finally output (see the sketch after equation (12)):

$G_{est} = \Delta G_n \cdot \ldots \cdot \Delta G_1 \cdot \Delta G_0$  (12),
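The loop of steps 4-3) to 4-5) can be sketched as follows; `encode` stands for the learned feature extractor φ (returning a K-dimensional vector) and `exp_se3` for the SE(3) exponential map of equation (7), both assumed to be provided, with $t_i$ fixed at 2×10⁻² as stated above:

```python
# Sketch of the inverse compositional registration loop of steps 4-3) to 4-5).
import torch

T_STEP = 2e-2  # fixed perturbation t_i for the finite-difference Jacobian

def transform(G: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    # Apply a 4x4 rigid transform to an (N, 3) point set.
    return pts @ G[:3, :3].t() + G[:3, 3]

def jacobian(encode, exp_se3, template: torch.Tensor) -> torch.Tensor:
    # Equation (10): finite differences over the six twist parameters
    # (three rotations, three translations).
    f0 = encode(template)                              # (K,)
    cols = []
    for i in range(6):
        t = torch.zeros(6)
        t[i] = T_STEP
        warped = transform(exp_se3(-t), template)      # perturbed template
        cols.append((encode(warped) - f0) / T_STEP)
    return torch.stack(cols, dim=1)                    # (K, 6)

def register(encode, exp_se3, src, tmpl, n_iter=10):
    G_est = torch.eye(4)
    J = jacobian(encode, exp_se3, tmpl)
    pinv = torch.linalg.pinv(J)                        # equals (J^T J)^-1 J^T for full-rank J
    for _ in range(n_iter):
        delta = encode(src) - encode(tmpl)             # feature-space projection error
        eps = pinv @ delta                             # equation (9)
        dG = exp_se3(eps)
        src = transform(dG, src)                       # equation (11)
        G_est = dG @ G_est                             # accumulates equation (12)
    return G_est
```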

5) Training: to allow the deep learning framework to be trained in an unsupervised manner, a decoder consisting of four fully connected layers is introduced; ReLU is chosen as the activation function of the first three layers and tanh as that of the fourth;

5-1) In the unsupervised setting, model training uses the alignment error between the transformed source point cloud and the target point cloud rather than a ground-truth transformation, with a robust loss function:

[Equation (13): the robust loss function; it is rendered as an image in the original,]

where $P_T \in \Theta$ is a set of points sampled from the unit square $[0,1]^2$, x is the point cloud feature, $\theta_i$ is the i-th component of the convolutional layer parameters, and Φ is the original input 3D points.

This method offers high accuracy and computational efficiency, strong robustness to initialization, and requires no expensive computation of point correspondences.

Description of the Drawings

Figure 1 is a schematic diagram of the point cloud registration network architecture in the embodiment;

Figure 2 is a schematic and result view of the three-dimensional point cloud registration of a bottle in the embodiment;

Figure 3 is a schematic and result view of the three-dimensional point cloud registration of a toilet in the embodiment;

Figure 4 is a schematic and result view of the three-dimensional point cloud registration of a noisy mobile phone in the embodiment;

Figure 5 is a schematic and result view of the three-dimensional point cloud registration of a real indoor scene in the embodiment;

Figure 6 is a schematic and result view of the point cloud registration of the Armadillo in Stanford 3DMatch in the embodiment.

Detailed Description of the Embodiments

The invention is described in further detail below with reference to the accompanying drawings and a specific embodiment, which are not intended to limit the invention.

Embodiment:

Referring to Figure 1, the embodiment carries out the unsupervised method for robust point cloud registration based on deep features with steps 1) to 5), including equations (1)-(13), exactly as set forth in the Summary of the Invention above. Two details specific to the embodiment deserve mention: unlike other unsupervised methods that rely solely on global descriptors, step 3-1) continues to attend to local features on top of the unsupervised training scheme; and the numbers of layers of the decoder in step 5) are as shown in Figure 1.

To demonstrate the effectiveness of the deep learning framework, the widely used ModelNet40 is adopted for pre-training. ModelNet40 is a dataset of 12,311 CAD models covering 40 object categories; the framework is trained on the first 20 categories of ModelNet40 and tested on the remaining 20. During training and testing, a ground-truth transformation matrix G_gt is obtained by randomly generating a rigid transformation G, and the target point cloud is generated by applying G to the source point cloud. An axis is chosen arbitrarily; the rotation component of G is initialized in the range [0, 45] degrees and the translation component in the range [-0.5, 0.5]. The mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) between the rotation and translation components of the ground-truth matrix G_gt and the predicted matrix G_est serve as evaluation metrics; smaller values indicate higher registration accuracy. Angles are measured in degrees and registration run time in seconds. The results are shown in the following table:

[Table: comparison of registration errors (MSE, RMSE, MAE of rotation and translation) and run times; rendered as an image in the original.]
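As a concrete reading of this evaluation protocol, the sketch below computes the MSE, RMSE, and MAE of the rotation (in degrees) and translation components between G_gt and G_est; the Euler "xyz" decomposition and the use of SciPy are assumptions:

```python
# Sketch of the evaluation metrics: MSE, RMSE, and MAE over rotation angles
# (degrees) and translation components of 4x4 pose matrices.
import numpy as np
from scipy.spatial.transform import Rotation

def pose_errors(G_est: np.ndarray, G_gt: np.ndarray) -> dict:
    ang_est = Rotation.from_matrix(G_est[:3, :3]).as_euler("xyz", degrees=True)
    ang_gt = Rotation.from_matrix(G_gt[:3, :3]).as_euler("xyz", degrees=True)
    out = {}
    for name, err in (("rot", ang_est - ang_gt), ("trans", G_est[:3, 3] - G_gt[:3, 3])):
        mse = float(np.mean(err ** 2))
        out[name] = {"MSE": mse, "RMSE": mse ** 0.5, "MAE": float(np.mean(np.abs(err)))}
    return out
```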

Specific examples are shown in Figures 2-6.

Claims (1)

1. An unsupervised method for robust point cloud registration based on deep features, characterized in that it comprises steps 1) to 5) as set forth above: 1) acquiring the point cloud data to be registered and uniformly sampling 1024 points on the point cloud surface; 2) converting the sampled data of the two point clouds into 1024×3 tensors and feeding them into the deep learning framework; 3) extracting deep point cloud features with the encoder, i.e., EdgeConv local features per equations (1)-(5), five 1×1 convolutional layers with CBAM attention, and concatenation and flattening into a 512-dimensional deep feature combining local and global descriptors; 4) registering the two point clouds by minimizing the feature-space projection error per equations (6)-(12), iterating the inverse compositional update with a finite-difference Jacobian (t_i = 2×10⁻²) to output the best transformation matrix G_est; 5) training the framework in an unsupervised manner with a decoder of four fully connected layers (ReLU on the first three, tanh on the fourth) and the robust alignment loss of equation (13).
CN202210847039.1A 2022-07-07 2022-07-07 Unsupervised method for robust point cloud registration based on deep features Withdrawn CN115170626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210847039.1A CN115170626A (en) 2022-07-07 Unsupervised method for robust point cloud registration based on deep features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210847039.1A CN115170626A (en) 2022-07-07 Unsupervised method for robust point cloud registration based on deep features

Publications (1)

Publication Number Publication Date
CN115170626A true CN115170626A (en) 2022-10-11

Family

ID=83494865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210847039.1A Withdrawn CN115170626A (en) 2022-07-07 2022-07-07 Unsupervised method for robust point cloud registration based on depth features

Country Status (1)

Country Link
CN (1) CN115170626A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188543A (en) * 2022-12-27 2023-05-30 中国人民解放军61363部队 Point cloud registration method and system based on deep learning unsupervised
CN116188543B (en) * 2022-12-27 2024-03-12 中国人民解放军61363部队 Point cloud registration method and system based on deep learning unsupervised

Similar Documents

Publication Publication Date Title
Wang et al. Category-level 6d object pose estimation via cascaded relation and recurrent reconstruction networks
Ranftl et al. Deep fundamental matrix estimation
Kothari et al. Trumpets: Injective flows for inference and inverse problems
CN106096561B (en) Infrared pedestrian detection method based on image block deep learning features
CN107229757B (en) Video retrieval method based on deep learning and hash coding
CN103295196B (en) Based on the image super-resolution rebuilding method of non local dictionary learning and biregular item
CN112364730B (en) Method and system for automatic classification of hyperspectral objects based on sparse subspace clustering
CN107092859A (en) A kind of depth characteristic extracting method of threedimensional model
Vidanapathirana et al. Spectral geometric verification: Re-ranking point cloud retrieval for metric localization
CN103295197A (en) Image super-resolution rebuilding method based on dictionary learning and bilateral holomorphy
CN106053988A (en) Inverter fault diagnosis system and method based on intelligent analysis
CN107203747B (en) Sparse combined model target tracking method based on self-adaptive selection mechanism
CN112488128A (en) Bezier curve-based detection method for any distorted image line segment
CN102651072A (en) Classification method for three-dimensional human motion data
CN115170626A (en) Unsupervised method for robust point cloud registration based on depth features
Mei Point cloud registration with self-supervised feature learning and beam search
Chen et al. An improved iterative closest point algorithm for rigid point registration
CN110852189A (en) A low-complexity dense crowd analysis method based on deep learning
Liu et al. Loop closure detection based on improved hybrid deep learning architecture
Zhang et al. Dyna-depthformer: Multi-frame transformer for self-supervised depth estimation in dynamic scenes
CN113884025A (en) Additive manufacturing structured light loopback detection method, device, electronic device and storage medium
CN111951287A (en) Two-dimensional code detection and recognition method
Yang et al. 3dpmnet: Plane segmentation and matching for point cloud registration
CN112509018A (en) Quaternion space optimized three-dimensional image registration method
Zhang et al. Vision-based UAV obstacle avoidance algorithm on the embedded platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20221011)