CN115937043B - A method of touch-assisted point cloud completion

Publication number: CN115937043B
Authority: CN (China)
Prior art keywords: point cloud, tactile, coordinate system, touch, completion
Legal status: Active
Application number: CN202310009699.7A
Other languages: Chinese (zh)
Other versions: CN115937043A
Inventors: 王琴, 王怀钰, 石键瀚, 李剑
Assignee: Nanjing University of Posts and Telecommunications
Filing date: 2023-01-04
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202310009699.7A
Publication of CN115937043A: 2023-04-07
Publication of CN115937043B (granted): 2023-07-04

Classifications

    • Y02P90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS] (under Y02P, climate change mitigation technologies in the production or processing of goods; Y02P90/00, enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation)

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention belongs to the field of three-dimensional point cloud completion, and in particular relates to a three-dimensional point cloud completion method based on haptic assistance. It addresses the loss of local detail that occurs when a complete point cloud is generated from a missing point cloud captured from a single view, and improves the completion result by exploiting local tactile information. The method mainly comprises the following steps: step 1, initialize a Pybullet simulation environment and acquire a tactile image using a robotic arm fitted with an electric gripper and a tactile sensor; step 2, convert the tactile image into a tactile point cloud and splice it with the missing point cloud; step 3, train the PoinTr network on a dataset; and step 4, input the spliced point cloud into the PoinTr network to obtain the completed point cloud.

Description

A method of touch-assisted point cloud completion

Technical Field

The invention belongs to the field of three-dimensional point cloud completion, and in particular relates to a method of tactile-assisted point cloud completion.

Background Art

Understanding 3D space is critical for humans and machines to understand how to interact with surrounding objects. Point clouds, as an easily acquired form of 3D structural data, have enabled extensive computer vision research on understanding 3D scenes and objects. However, the raw point clouds routinely captured by lidar scanners or RGB-D cameras are inevitably sparse and incomplete due to limitations of sensor resolution, object occlusion, and object surface materials. Benefiting from large-scale point cloud datasets, deep-learning-based point cloud completion methods have attracted increasing research interest, and recent advances in 3D point cloud processing have further promoted research on point cloud completion. The seminal work PointNet applies an MLP independently at each point and aggregates features through a pooling operation to achieve permutation invariance. PointCleanNet, the first learning-based architecture for this task, proposed an encoder-decoder framework and adopted FoldingNet, which maps 2D points onto 3D surfaces by deforming a 2D plane, introducing a coarse-to-fine completion strategy that gradually recovers detail in the missing parts. SeedFormer introduced a new shape representation, Patch Seeds, which captures the overall structure from partial inputs while preserving local region information, and added an upsampling Transformer module to the completion task. PoinTr formulates point cloud completion as a set-to-set translation problem: it represents the point cloud as a set of unordered point groups with positional embeddings, converts the missing point cloud into a set of point proxies, embeds dedicated geometry-aware blocks, and generates the complete point cloud with a Transformer-based encoder-decoder architecture. Owing to the discreteness of point clouds and the unstructured nature of the locally predicted regions, these methods struggle to maintain a good point distribution in local regions and therefore fail to capture local geometric details and structures well, such as smooth regions, sharp edges, and corners.

Touch is another way of perceiving the 3D shape of an object. For robots, most tactile sensors measure the geometry of the contact surface. A robot can reconstruct the shape of an object through multiple touches, combining the position and pose of the sensor at each touch, without being affected by ambiguities caused by the color or material of the object's surface. However, tactile information is limited by the size and scale of the sensor: since each touch only yields information about a local region, many touches and a long time are required to reconstruct the complete shape of an object, so purely tactile point cloud reconstruction is difficult to apply in practice.

Therefore, in the field of 3D point cloud completion, there is a need for a point cloud completion method that exploits both the learning ability of neural networks and the local touch information of objects, so as to improve the efficiency and quality of point cloud completion.

Summary of the Invention

Existing 3D point cloud completion methods have certain limitations. Neural-network-based completion methods take the missing point cloud as network input, and in the task of completing the missing regions they often fail to capture local geometric detail, leading to poor completion and reconstruction of complex objects. Object reconstruction based solely on tactile information, in turn, is limited by the size and scale of the sensor at each touch: since each touch only yields information about a local region, reconstruction is often inefficient. The present invention proposes a 3D point cloud completion method assisted by DIGIT-simulated tactile point clouds. In the Pybullet simulation engine, a robotic arm fitted with an electric gripper and the DIGIT tactile sensor provided by Facebook is used to acquire tactile information on the object surface in a controllable way and to generate a tactile point cloud at the corresponding touch position. At the same time, 3D deep learning and a large-scale 3D shape repository are used to obtain an implicit shape prior of the object during learning. The tactile point cloud carries local geometric information of the object; fusing the local tactile point cloud into the incomplete point cloud and using machine learning to constrain and optimize the global shape compensates for the inability of ordinary learning networks to capture local detail.

The invention provides a method of tactile-assisted point cloud completion, comprising the following steps:

Step 1: initialize the Pybullet simulation environment; use a robotic arm fitted with an electric gripper and a DIGIT tactile sensor to select and touch a missing region of the object point cloud, and acquire the tactile image and pose information of the touched region;

Step 2: convert the tactile image into a preliminary tactile point cloud, transform the preliminary tactile point cloud from the world coordinate system into the target feature coordinate system, and splice it with the missing object point cloud;

Step 3: train the PoinTr network on a dataset;

Step 4: input the point cloud spliced in step 2 into the trained PoinTr network to obtain the completed point cloud.

Further, acquiring the tactile image of the missing region of the object point cloud in step 1 specifically comprises: moving the robotic arm and controlling the closing of the gripper so that it touches the surface of the missing region of the object point cloud, producing a tactile image of the touched region;

The object-surface depth information H ∈ R^(n×m) captured by the DIGIT tactile sensor at that moment is recorded, together with the spatial coordinate x ∈ R^3 and the rotation vector r ∈ R^3 of the DIGIT sensor center, to determine the pose of the touched region in the world coordinate system.

Further, in step 2 the tactile image is converted into a preliminary tactile point cloud, which is then transformed from the world coordinate system into the target feature coordinate system and spliced with the missing object point cloud, specifically comprising the following steps:

Step 2.1: taking the center point of the DIGIT tactile sensor as the center, construct an initialized planar point cloud on the tangent plane at that center point;

Step 2.2: obtain the preliminary tactile point cloud P_c by superimposing the depth information H of the touched region on the initialized planar point cloud. Then convert the rotation vector r of the DIGIT sensor recorded at the time of touch into a rotation matrix R and, combining it with the spatial coordinate x, obtain the tactile point cloud P_t in the world coordinate system:

P_t = P_c * R + x

Step 2.3: splice the tactile point cloud P_t in the world coordinate system with the missing object point cloud and input the result into the PoinTr network to obtain the completed point cloud.

Further, different touch regions of the missing point cloud can be touched in turn, repeating steps 1-4; choosing suitable touch positions helps reconstruct the details of the point cloud.

Beneficial effects: the point cloud completion method exploits local touch information of the object and fuses the local tactile point cloud into the incomplete point cloud, thereby improving both the efficiency and the quality of point cloud completion.

Brief Description of the Drawings

Figure 1 is a flow chart of the method of the present invention.

Figure 2 is a visualization of touching an object and the resulting tactile image in the Pybullet simulation environment.

Figure 3 is the overall framework of the point cloud completion network.

Figure 4 shows the point cloud completion results with one added touch.

Figure 5 shows the point cloud completion results with different numbers of touches.

Figure 6 shows the point cloud completion results with different touch positions.

Detailed Description of the Embodiments

In order to deepen the understanding of the present invention, the invention is described in further detail below with reference to an embodiment. This embodiment is only intended to explain the invention and does not limit its scope of protection.

The tactile-assisted point cloud completion method of the present invention, as shown in Figure 1, comprises the following steps:

Step 1: initialize the Pybullet simulation environment; use a robotic arm fitted with an electric gripper and a DIGIT tactile sensor to select and touch a missing region of the object point cloud, and acquire the tactile image and pose information of the touched region;

In the Pybullet simulation engine, a robotic arm fitted with a two-finger electric gripper is built, and each finger of the gripper carries a DIGIT tactile sensor provided by Facebook for acquiring tactile images. The object is placed at a fixed position in front of the robotic arm, with the electric gripper initially directly above the target object. The robotic arm is then moved and the gripper closed so that it touches the surface of the target object, as shown in Figure 2(a); the motion stops once a clear contact pattern appears in the DIGIT sensor's feedback image. The tactile image of the touched region comes from the depth information of the contact surface provided by the DIGIT tactile sensor;

As shown in Figure 2(b), the depth information H ∈ R^(n×m) obtained from the DIGIT tactile sensor at that moment is recorded, together with the spatial coordinate x ∈ R^3 and the rotation vector r ∈ R^3 of the sensor center, to determine the pose of the touched region in the world coordinate system.
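This touch-until-contact loop can be sketched in PyBullet as follows. It is a minimal illustration rather than the patent's code: the identifiers robot_id, object_id, sensor_link, and gripper_joint, the closing step size, and the use of SciPy to turn the sensor quaternion into a rotation vector are all assumptions about the concrete setup.

```python
import numpy as np
import pybullet as p
from scipy.spatial.transform import Rotation

def touch_and_record_pose(robot_id, object_id, sensor_link, gripper_joint,
                          close_step=1e-3, max_steps=500):
    """Close the gripper until contact is reported, then record the pose of
    the DIGIT sensor center: position x in R^3 and rotation vector r in R^3."""
    target = p.getJointState(robot_id, gripper_joint)[0]
    for _ in range(max_steps):
        target -= close_step  # close the jaws slightly further
        p.setJointMotorControl2(robot_id, gripper_joint,
                                p.POSITION_CONTROL, targetPosition=target)
        p.stepSimulation()
        if p.getContactPoints(bodyA=robot_id, bodyB=object_id,
                              linkIndexA=sensor_link):
            break  # a contact patch has formed on the sensor finger
    pos, quat = p.getLinkState(robot_id, sensor_link)[:2]
    x = np.array(pos)                         # spatial coordinate x
    r = Rotation.from_quat(quat).as_rotvec()  # rotation vector r
    return x, r
```

In the embodiment the stopping criterion is a visible contact pattern in the DIGIT feedback image; the contact query above is a simpler proxy for that condition.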

Step 2: convert the tactile image into a preliminary tactile point cloud, then transform the preliminary tactile point cloud from the world coordinate system into the object coordinate system and splice it with the missing object point cloud. The specific steps are as follows:

Step 2.1: taking the center point of the DIGIT tactile sensor as the center, construct an initialized planar point cloud on the tangent plane at that center point;

Step 2.2: obtain the preliminary tactile point cloud P_c by superimposing the depth information H of the touched region on the initialized planar point cloud; then convert the rotation vector r of the DIGIT sensor recorded at the time of touch into a rotation matrix R and, combining it with the spatial coordinate x, obtain the tactile point cloud P_t in the world coordinate system:

P_t = P_c * R + x

Noise is removed from the preliminary tactile point cloud P_t using a given threshold;
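Steps 2.1-2.2 and the threshold denoising just described can be sketched as follows. The pixel-to-meter scale pixel_size, the depth threshold depth_thresh, and the choice of the sensor z-axis as the plane normal are assumptions; none of these constants are fixed by the description.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def tactile_image_to_world_points(H, x, r, pixel_size=1e-4, depth_thresh=1e-5):
    """Turn an n x m depth image H plus the sensor pose (x, r) into P_t."""
    n, m = H.shape
    # Step 2.1: initialized planar point cloud, centered on the sensor center
    u, v = np.meshgrid(np.arange(m) - m / 2.0, np.arange(n) - n / 2.0)
    plane = np.stack([u * pixel_size, v * pixel_size, np.zeros_like(u)], axis=-1)
    # Step 2.2: superimpose the depth information H along the plane normal -> P_c
    Pc = plane.reshape(-1, 3)
    Pc[:, 2] = H.reshape(-1)
    # Threshold denoising: keep only pixels showing a real indentation
    Pc = Pc[np.abs(Pc[:, 2]) > depth_thresh]
    # Rotation vector r -> rotation matrix R, then P_t = P_c * R + x
    R = Rotation.from_rotvec(r).as_matrix()
    return Pc @ R + x
```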

Step 2.3: use the spatial coordinate x and the rotation vector r of the DIGIT tactile sensor to transform the preliminary tactile point cloud P_t to the corresponding position in the object coordinate system, and splice it with the missing point cloud of the target object, as follows:

The missing point cloud of the target object lies in the object coordinate system, while the tactile point cloud P_t lies in the world coordinate system; the object coordinate system, the robot coordinate system, and the world coordinate system are aligned to ensure that the missing object point cloud and the tactile point cloud P_t are in the same coordinate system;

Assuming that the world coordinate system and the robot coordinate system coincide, only the object coordinate system and the robot coordinate system need to be aligned. Three non-coincident points w1, w2, w3 are taken in the robot coordinate system and their positions r1, r2, r3 in the target object coordinate system are recorded; the rotation matrix R_r and translation vector T_r required to align the robot coordinate system with the object coordinate system are obtained by solving the linear equation:

X_r = R_r * X_w + T_r

X_r = [r1, r2, r3]

X_w = [w1, w2, w3]

To simplify the calculation, the rotation matrix R_r between the target object coordinate system and the robot coordinate system is assumed to be the identity matrix, so only the translation vector T_r between the two coordinate systems needs to be determined. The target object is placed at a fixed position w1 = (m, 0, 0)^T on the x-axis of the robot coordinate system; a point p_i in the target object coordinate system then corresponds to a point p_j in the robot coordinate system with:

p_j = p_i + w1

Because an overly dense tactile point cloud would degrade the behavior of the local attention modules in the subsequent PoinTr network, and to prevent the model from attending too much to local information while ignoring long-range information, the tactile point cloud is downsampled to 100 points using farthest point sampling. Through the transformations between coordinate systems, the final tactile point cloud can then be spliced together with the missing point cloud of the target object.
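A minimal NumPy sketch of farthest point sampling is given below; the target of k = 100 points follows the description, while the seed choice is an implementation assumption.

```python
import numpy as np

def farthest_point_sampling(points, k=100):
    """Downsample an (N, 3) point cloud to k points by farthest point sampling."""
    n = points.shape[0]
    if n <= k:
        return points
    chosen = np.zeros(k, dtype=int)
    dist = np.full(n, np.inf)
    chosen[0] = 0  # arbitrary seed point
    for i in range(1, k):
        # Distance from every point to the most recently chosen point
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)          # distance to the chosen set
        chosen[i] = int(np.argmax(dist))    # pick the farthest remaining point
    return points[chosen]
```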

Step 3: train the PoinTr network on a dataset;

The PoinTr network is based on a Transformer encoder-decoder structure and comprises a feature extraction module, a geometry-aware encoder module, a geometry-aware decoder module, and an upsampling module.

The feature extraction module first extracts feature vectors from the spliced point cloud; the feature vectors are then fed into the geometry-aware encoder module, which establishes the geometric relations among the points. The geometry-aware decoder module queries these geometric relations to generate predicted point proxies and proxy features for the missing regions.

Finally, the predicted point proxies and proxy features are fed into the upsampling module, which recovers the detailed local shape centered on each predicted point proxy and outputs the completed point cloud.
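The data flow through these four modules can be summarized schematically as follows. The module internals are placeholders supplied by the caller, so this outlines the pipeline described above rather than reproducing the published PoinTr implementation.

```python
import torch.nn as nn

class CompletionPipeline(nn.Module):
    """Schematic four-stage pipeline: features -> encoder -> decoder -> upsampler."""
    def __init__(self, feat_extractor, encoder, decoder, upsampler):
        super().__init__()
        self.feat_extractor = feat_extractor  # spliced points -> feature vectors
        self.encoder = encoder                # geometry-aware Transformer encoder
        self.decoder = decoder                # predicts proxies for missing regions
        self.upsampler = upsampler            # proxies -> dense local patches

    def forward(self, spliced_points):
        feats = self.feat_extractor(spliced_points)
        geom = self.encoder(feats)                 # geometric relations among points
        proxies, proxy_feats = self.decoder(geom)  # predicted point proxies
        return self.upsampler(proxies, proxy_feats)
```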

The training dataset is ShapeNet-55, which contains 41,952 object models from 55 categories. For each object model, 8,192 points are randomly sampled from the object surface as the complete point cloud. To account for the uncertainty of the viewing angle of the missing point cloud at test time, a viewpoint is first selected at random and the 4,096 points farthest from that viewpoint are removed, yielding the partially missing point cloud used for training. The overall framework of the network is shown in Figure 3.
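This cropping of training inputs can be sketched as below; drawing the viewpoint as a random unit vector is an assumption, since the description does not specify how the viewpoint is sampled.

```python
import numpy as np

def make_partial(complete, n_remove=4096):
    """Drop the n_remove points farthest from a random viewpoint.

    complete: (8192, 3) array sampled from the object surface.
    Returns the partially missing point cloud used for training."""
    view = np.random.randn(3)
    view /= np.linalg.norm(view)                 # random viewpoint direction
    d = np.linalg.norm(complete - view, axis=1)  # distance of each point to it
    keep = np.argsort(d)[:complete.shape[0] - n_remove]
    return complete[keep]
```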

Step 4: input the spliced point cloud into the trained PoinTr network to obtain the completed point cloud. If additional touch points or additional touches are required, return to step 1;

To verify the effect of DIGIT tactile assistance on point cloud completion, several experiments were carried out, covering the effect of adding a single touch, the effect of different numbers of touches, and the effect of different touch positions on point cloud completion. The object models used in the experiments were selected from the ShapeNet-55 training dataset. The method uses the PoinTr network to reconstruct the missing point cloud. The PoinTr network is based on a Transformer encoder-decoder structure and introduces local attention modules in both the encoder and the decoder, which allows it to perform the completion task effectively. The experimental results below all compare the method of the present invention against the PoinTr network baseline.

Table 1 compares the experimental results on the effect of adding a single touch on point cloud completion. The evaluation metric in Table 1 is the chamfer distance (CD2) between the completed reconstructed point cloud and the ground-truth point cloud; smaller values indicate better completion and reconstruction.

Table 1: comparison of point cloud completion results with one added touch

[Table 1 is rendered as an image in the original document and is not reproduced here.]
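For reference, the L2 chamfer distance (CD2) is commonly computed as the sum of the mean squared nearest-neighbor distances in both directions; the sketch below uses that convention, which may differ from the exact normalization used in the experiments.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_l2(pred, gt):
    """Symmetric L2 chamfer distance between two (N, 3) point sets."""
    d_pred = cKDTree(gt).query(pred)[0]   # pred -> gt nearest-neighbor distances
    d_gt = cKDTree(pred).query(gt)[0]     # gt -> pred nearest-neighbor distances
    return np.mean(d_pred ** 2) + np.mean(d_gt ** 2)
```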

Figure 4 shows completion results for a sphere, a cube, a guitar, and a bucket. As the figure shows, all objects can be reconstructed, and the great majority are reconstructed better after tactile information is added. For more complex objects, such as the bucket and the guitar, the details of the shape cannot be reconstructed well before the tactile point cloud is added; only with tactile assistance does the result fit the actual object more closely. For simpler objects, such as the sphere and the cube, adding tactile information may worsen the reconstruction. One reason is that such models are so regular and simple that a good reconstruction can already be obtained from the partial point cloud alone; another is that when the model is small, the DIGIT tactile simulator exhibits lens distortion at the edge of its depth map imaging range. Since practical applications mostly involve the reconstruction of more complex objects, the method is feasible in most scenarios.

The experimental results on the effect of multiple touches on point cloud completion and reconstruction are compared in Figure 5 and Table 2; the great majority of objects are reconstructed better as more touches are added.

Table 2: comparison of point cloud completion results with different numbers of touches

[Table 2 is rendered as an image in the original document and is not reproduced here.]

Figure 5 shows the reconstruction results for the bucket, the guitar, and the basket. The panels for each object are arranged as follows: (a) the ground-truth point cloud; (b) the missing point cloud with three touches added; (c) the completion result without touch; (d) the completion result with one touch; (e) the completion result with two touches; (f) the completion result with three touches. Without touch, the far side of these three models cannot be recovered and lacks local detail: the bucket and the basket are missing the handle on the far side, and the missing shape of the guitar is not reconstructed well. After the tactile point clouds are added, the local details on the far side are recovered much better, and the originally divergent points become more concentrated. In short, for all three models, as the number of added tactile point clouds increases, the local details become more faithful to reality; in the completed point cloud output by the network without touch, the originally missing regions are recovered sparsely and divergently, whereas after multiple tactile point clouds are added they are recovered more densely and cohesively.

The effect of different touch positions on completion and reconstruction is shown in Figure 6, using the bucket, a chair, and the guitar as examples; choosing a suitable touch position reconstructs the details of the point cloud better. The columns in the figure are as follows: (a) the ground-truth point cloud; (b) one touch added at position 1; (c) the completion result corresponding to (b); (d) one touch added at position 2; (e) the completion result corresponding to (d). Position 1 and position 2 only distinguish the two touched regions and do not denote exact locations. The experimental results show that adding a touch in a region where the completion is poor greatly improves the result. For the bucket in the figure, when the touch position is on the upper-middle part of the bucket, the reconstruction cannot recover the corresponding local details, whereas when the touch position is on the handle of the bucket, the details of the point cloud are reconstructed well. For the chair and the guitar, the touched region was already recovered well before the touch was added, so the tactile point cloud cannot play its role during reconstruction and the final result falls short of the ideal; this is essentially consistent with the conclusion above for adding touches to simply shaped objects.

It should be emphasized that the simulation environment to which this method applies takes the DIGIT tactile sensor provided by Facebook as an example, but is not limited to it; the method is equally applicable to many other types of tactile sensors. The method is developed in the context of point cloud completion, but the simulation setup and experimental procedure presented here, including the construction of the simulation environment, the acquisition of simulated tactile point clouds, and the splicing of point cloud data of different modalities, are equally applicable to multimodal point cloud processing tasks, including tactile-visual point cloud fusion.

Claims (8)

1. A method of tactile-assisted point cloud completion, characterized by comprising the following steps:

Step 1: initialize the Pybullet simulation environment; use a robotic arm fitted with an electric gripper and a DIGIT tactile sensor to select and touch a missing region of the object point cloud, and acquire the tactile image and pose information of the touched region; wherein acquiring the tactile image of the missing region of the object point cloud specifically comprises: moving the robotic arm and controlling the closing of the gripper so that it touches the surface of the missing region, producing a tactile image of the touched region; and recording the object-surface depth information H ∈ R^(n×m) captured by the DIGIT tactile sensor, together with the spatial coordinate x ∈ R^3 and the rotation vector r ∈ R^3 of the DIGIT sensor center, to determine the pose of the touched region in the world coordinate system;

Step 2: convert the tactile image into a preliminary tactile point cloud, transform the preliminary tactile point cloud from the world coordinate system into the target feature coordinate system, and splice it with the missing object point cloud, specifically comprising the following steps:

Step 2.1: taking the center point of the DIGIT tactile sensor as the center, construct an initialized planar point cloud on the tangent plane at that center point;

Step 2.2: obtain the preliminary tactile point cloud P_c by superimposing the depth information H of the touched region on the initialized planar point cloud; then convert the rotation vector r of the DIGIT sensor recorded at the time of touch into a rotation matrix R and, combining it with the spatial coordinate x, obtain the tactile point cloud P_t in the world coordinate system:

P_t = P_c * R + x

Step 2.3: splice the tactile point cloud P_t in the world coordinate system with the missing object point cloud and input the result into the PoinTr network to obtain the completed point cloud;

Step 3: train the PoinTr network on a dataset;

Step 4: input the point cloud spliced in step 2 into the trained PoinTr network to obtain the completed point cloud.

2. The method of tactile-assisted point cloud completion according to claim 1, characterized in that different touch regions are selected on the missing point cloud for additional touches and steps 1-4 are repeated, which helps reconstruct the details of the point cloud.

3. The method of tactile-assisted point cloud completion according to claim 1, characterized in that the specific content of step 3 is as follows: the missing point cloud of the target object lies in the object coordinate system and the tactile point cloud P_t lies in the world coordinate system; the object coordinate system, the robot coordinate system, and the world coordinate system are aligned to ensure that the missing object point cloud and the tactile point cloud P_t are in the same coordinate system.

4. The method of tactile-assisted point cloud completion according to claim 3, characterized in that, assuming the world coordinate system and the robot coordinate system coincide, only the object coordinate system and the robot coordinate system need to be aligned: three non-coincident points w1, w2, w3 are taken in the robot coordinate system and their positions r1, r2, r3 in the target object coordinate system are recorded, and the rotation matrix R_r and translation vector T_r required to align the robot coordinate system with the object coordinate system are obtained by solving the linear equation:

X_r = R_r * X_w + T_r

X_r = [r1, r2, r3]

X_w = [w1, w2, w3].

5. The method of tactile-assisted point cloud completion according to claim 4, characterized in that, assuming the rotation matrix R_r between the target object coordinate system and the robot coordinate system is the identity matrix, only the translation vector T_r between the two coordinate systems needs to be determined; the target object is placed at a fixed position w1 = (m, 0, 0)^T on the x-axis of the robot coordinate system, and a point p_i in the target object coordinate system corresponds to a point p_j in the robot coordinate system with:

p_j = p_i + w1.

6. The method of tactile-assisted point cloud completion according to claim 1, characterized in that the tactile point cloud is downsampled to 100 points by farthest point sampling.

7. The method of tactile-assisted point cloud completion according to claim 1, characterized in that the PoinTr network is based on a Transformer encoder-decoder structure.

8. The method of tactile-assisted point cloud completion according to claim 1, characterized in that the PoinTr network comprises a feature extraction module, a geometry-aware encoder module, a geometry-aware decoder module, and an upsampling module; the feature extraction module first extracts feature vectors from the spliced point cloud; the feature vectors are then fed into the geometry-aware encoder module, which establishes the geometric relations among the points; the geometry-aware decoder module queries these geometric relations to generate predicted point proxies and proxy features for the missing regions; finally, the predicted point proxies and proxy features are fed into the upsampling module, which recovers the detailed local shape centered on each predicted point proxy and outputs the completed point cloud.
CN202310009699.7A 2023-01-04 2023-01-04 A method of touch-assisted point cloud completion Active CN115937043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310009699.7A CN115937043B (en) 2023-01-04 2023-01-04 A method of touch-assisted point cloud completion

Publications (2)

Publication Number Publication Date
CN115937043A (en) 2023-04-07
CN115937043B (en) 2023-07-04

Family

ID=86650847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310009699.7A Active CN115937043B (en) 2023-01-04 2023-01-04 A method of touch-assisted point cloud completion

Country Status (1)

Country Link
CN (1) CN115937043B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116843548B (en) * 2023-06-06 2024-12-20 南京邮电大学 Touch-assisted point cloud super-resolution method
CN117274764B (en) * 2023-11-22 2024-02-13 南京邮电大学 A three-dimensional point cloud completion method based on multi-modal feature fusion
CN118799508B (en) * 2024-09-14 2024-11-15 大连理工大学 A 3D reconstruction method for visual-tactile sensor objects based on 3D point cloud stitching
CN119863499B (en) * 2025-03-25 2025-05-30 浙江华是科技股份有限公司 Low-overlap point cloud registration method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205466A (en) * 2021-05-10 2021-08-03 南京航空航天大学 Incomplete point cloud completion method based on hidden space topological structure constraint

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8570291B2 (en) * 2009-05-21 2013-10-29 Panasonic Corporation Tactile processing device
EP3191264A4 (en) * 2014-09-12 2018-04-25 University of Washington Integration of auxiliary sensors with point cloud-based haptic rendering and virtual fixtures
CN112066874A (en) * 2020-08-14 2020-12-11 苏州环球科技股份有限公司 Multi-position 3D scanning online detection method
CN113256640B (en) * 2021-05-31 2022-05-24 北京理工大学 Method and device for partitioning network point cloud and generating virtual environment based on PointNet
CN113808261B (en) * 2021-09-30 2022-10-21 大连理工大学 Panorama-based self-supervised learning scene point cloud completion data set generation method
CN114187422B (en) * 2021-11-30 2024-08-20 华中科技大学 Three-dimensional measurement method and system based on visual and tactile fusion
CN115375842A (en) * 2022-08-19 2022-11-22 深圳大学 Plant three-dimensional reconstruction method, terminal and storage medium
CN115511962B (en) * 2022-09-20 2024-05-28 上海人工智能创新中心 Target active detection method and system based on photoelectric tactile sensor

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205466A (en) * 2021-05-10 2021-08-03 南京航空航天大学 Incomplete point cloud completion method based on hidden space topological structure constraint

Also Published As

Publication number Publication date
CN115937043A (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN115937043B (en) A method of touch-assisted point cloud completion
CN111179324B (en) Object pose estimation method based on fusion of color and depth information in six degrees of freedom
CN109086683B (en) Human hand posture regression method and system based on point cloud semantic enhancement
CN113012122B (en) A class-level 6D pose and size estimation method and device
Wang et al. Demograsp: Few-shot learning for robotic grasping with human demonstration
Xu et al. GraspCNN: Real-time grasp detection using a new oriented diameter circle representation
CN111914595B (en) A method and device for 3D pose estimation of human hands based on color images
CN108734773A (en) A kind of three-dimensional rebuilding method and system for mixing picture
CN110148086A (en) The depth polishing method, apparatus and three-dimensional rebuilding method of sparse depth figure, device
CN110634160B (en) 3D Keypoint Extraction Model Construction and Pose Recognition Method of Target in 2D Graphics
Li et al. Hand pose estimation for hand-object interaction cases using augmented autoencoder
Shi et al. Asgrasp: Generalizable transparent object reconstruction and grasping from rgb-d active stereo camera
Shi et al. Asgrasp: Generalizable transparent object reconstruction and 6-dof grasp detection from rgb-d active stereo camera
CN119359801A (en) A method for generating flexible object manipulation strategies based on multimodal fusion
Wu et al. Object pose estimation with point cloud data for robot grasping
Wang et al. A survey of deep learning-based hand pose estimation
CN117078518B (en) Three-dimensional point cloud superdivision method based on multi-mode iterative fusion
CN113240584A (en) Multitask gesture picture super-resolution method based on picture edge information
CN114820344B (en) Depth map enhancement method and device
CN113989373B (en) Device and method for establishing robot grasping data set based on teaching and deep learning
Bai et al. ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation
Liu et al. EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild
Peng et al. A single upper limb pose estimation method based on the improved stacked hourglass network
Yu et al. HandO: a hybrid 3D hand–object reconstruction model for unknown objects
Lu et al. Enhancing the performance of point cloud completion assisted with tactile information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant