CN111598946B - Object pose measuring method and device and storage medium - Google Patents
Object pose measuring method and device and storage medium
- Publication number
- CN111598946B CN111598946B CN202010182093.XA CN202010182093A CN111598946B CN 111598946 B CN111598946 B CN 111598946B CN 202010182093 A CN202010182093 A CN 202010182093A CN 111598946 B CN111598946 B CN 111598946B
- Authority
- CN
- China
- Prior art keywords
- point cloud
- point
- model
- scene
- coordinates
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 53
- 238000003860 storage Methods 0.000 title claims abstract description 12
- 238000005070 sampling Methods 0.000 claims abstract description 31
- 238000013139 quantization Methods 0.000 claims abstract description 16
- 238000005259 measurement Methods 0.000 claims abstract description 12
- 239000013598 vector Substances 0.000 claims description 63
- 238000012216 screening Methods 0.000 claims description 17
- 238000000691 measurement method Methods 0.000 claims description 8
- 238000003384 imaging method Methods 0.000 claims description 7
- 238000004422 calculation algorithm Methods 0.000 claims description 6
- 230000006870 function Effects 0.000 claims description 5
- 238000003708 edge detection Methods 0.000 claims description 4
- 238000012935 Averaging Methods 0.000 claims 1
- 230000003190 augmentative effect Effects 0.000 claims 1
- 238000004364 calculation method Methods 0.000 abstract description 16
- 230000008859 change Effects 0.000 abstract description 6
- 230000008569 process Effects 0.000 description 11
- 238000004590 computer program Methods 0.000 description 7
- 239000011159 matrix material Substances 0.000 description 7
- 239000000284 extract Substances 0.000 description 6
- 230000000717 retained effect Effects 0.000 description 4
- 230000000694 effects Effects 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000006073 displacement reaction Methods 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 239000003086 colorant Substances 0.000 description 1
- 238000005094 computer simulation Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 238000003909 pattern recognition Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/181—Segmentation; Edge detection involving edge growing; involving edge linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical Field
The present invention relates to the field of three-dimensional computer vision, and in particular to an object pose measurement method, device, and storage medium.
Background Art
In recent years, with the progress of industrial upgrading, manufacturing automation has become an important driver of economic development, and automatic sorting of objects by robots is an important manifestation of manufacturing automation. The pose of an object in three-dimensional space is a key reference for robots to recognize, locate, grasp, and manipulate objects. The process of obtaining an object's pose is called 6D object pose measurement, an important problem in the field of 3D computer vision. When an object is moved from position A to position B in a reference coordinate system by a rotation and a translation, this rigid transformation is denoted T_AB. T_AB consists of three translation components x, y, z and three rotation angles φ, χ, ψ about the coordinate axes, for a total of six degrees of freedom; T_AB is therefore called the 6-dimensional pose of the object, i.e. the object pose.
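As an illustrative sketch only (not part of the patent text), the following Python snippet assembles such a 6-DOF pose into a 4×4 homogeneous transform from three translation components and three rotation angles about the coordinate axes; the X-Y-Z rotation order is an assumption made for the example, since the text does not fix a convention.

```python
import numpy as np

def pose_to_matrix(x, y, z, phi, chi, psi):
    """Build a 4x4 homogeneous transform from 3 translations and 3 axis rotations.

    Rotation order X -> Y -> Z is an assumption for this sketch only.
    """
    cx, sx = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(chi), np.sin(chi)
    cz, sz = np.cos(psi), np.sin(psi)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx
    T[:3, 3] = [x, y, z]
    return T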
A method based on Point Pair Features (PPF) (Drost et al., "Model Globally, Match Locally: Efficient and Robust 3D Object Recognition," Conference on Computer Vision and Pattern Recognition, 2010) is widely used for object pose measurement. The method builds a global description of the entire 3D model and then extracts features from the scene for matching. The modeling stage uses all points of the model point cloud, which helps characterize the surface information of the whole 3D model. The method uses a four-dimensional feature to describe the relationship between two points on the model surface, consisting of the distance between the two points, the angle between each point's normal vector and the distance vector between the points, and the angle between the two normal vectors; this descriptor is called the Point Pair Feature (PPF). PPFs are quantized and stored in a hash table for fast subsequent lookup. Such features are constructed for both the model and the scene data and matched, and votes are cast to obtain a set of candidate 6D poses. The candidate poses are then clustered: similar poses are grouped together and averaged to obtain more accurate poses. Finally, the Iterative Closest Point (ICP) algorithm is used to refine the poses and improve their accuracy.
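For illustration only, a minimal numpy sketch of the four-dimensional PPF described above, F = (‖d‖, ∠(n1, d), ∠(n2, d), ∠(n1, n2)); the function name and inputs are placeholders, not part of the patent.

```python
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """4D PPF of two oriented points: (|d|, angle(n1,d), angle(n2,d), angle(n1,n2))."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    if dist < 1e-12:
        return np.zeros(4)

    def angle(a, b):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))

    return np.array([dist, angle(n1, d), angle(n2, d), angle(n1, n2)])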
Existing PPF-based methods suffer from the following drawbacks. (1) The sampling method is oversimplified: when the model is sampled on a grid of fixed size, the points within a grid cell are simply averaged; if the normal vectors of the points in a cell differ by large angles, this sampling loses much of the key surface-variation information and weakens the ability to represent the model's surface differences. (2) The computational cost is high: after sampling, features must be computed for all point pairs of the model, yet the visible part of an object in a real scene under an arbitrary viewpoint is usually smaller than the model diameter (the diagonal length of the model's bounding box), so there is computational redundancy. (3) The method lacks robustness to point cloud noise: point clouds captured by a camera are inherently noisy, which perturbs point positions and normals and thus biases the computed scene features, and the method cannot compensate for this bias. (4) It cannot make combined use of color-depth (RGB-D) images: the method operates only on the depth image, using only the 3D spatial information of the scene, and cannot obtain complementary information from the color image to assist pose measurement.
Summary of the Invention
In view of at least one of the above technical problems, an object of the present invention is to provide an object pose measurement method, device, and storage medium.
The technical solution adopted by the present invention is as follows. In one aspect, an embodiment of the present invention provides an object pose measurement method comprising an offline modeling stage and an online matching stage.
The offline modeling stage includes:
inputting a three-dimensional model of the object, the three-dimensional model containing model point cloud coordinates and model point cloud normal vectors;
sampling the model point cloud coordinates and model point cloud normal vectors;
constructing a feature set from the sampled model point cloud coordinates and normal vectors, and computing model point pair features;
storing the extracted model point pair features in a hash table.
The online matching stage includes:
inputting a depth image, computing from the camera intrinsics the scene point cloud coordinates corresponding to each pixel of the depth image, and computing scene point cloud normal vectors from those coordinates;
sampling the scene point cloud coordinates and scene point cloud normal vectors;
constructing a feature set from the sampled scene point cloud coordinates and normal vectors, and computing scene point pair features;
quantizing the extracted scene point pair features and matching them with the model point pair features stored in the hash table to obtain multiple candidate object poses;
inputting a color image and extracting a scene edge point cloud;
screening the candidate object poses according to the scene edge point cloud;
clustering the screened candidate object poses to obtain multiple preliminary object poses;
registering the preliminary object poses with the iterative closest point algorithm to obtain final object poses.
Further, the step of sampling the model point cloud coordinates and model point cloud normal vectors specifically includes:
computing, from the model point cloud coordinates, the bounding box enclosing the model point cloud to obtain the model point cloud space;
rasterizing the model point cloud space into multiple grid cells of equal size, each grid cell containing multiple points, and each point having corresponding model point cloud coordinates and a model point cloud normal vector;
clustering the points contained in each grid cell according to the angles between their model point cloud normal vectors;
averaging the model point cloud coordinates and model point cloud normal vectors within each cluster to obtain the sampled model point cloud coordinates and normal vectors of each grid cell.
Further, the step of constructing a feature set from the sampled model point cloud coordinates and model point cloud normal vectors and computing model point pair features specifically includes:
constructing a K-D tree over the sampled model point cloud coordinates;
selecting a reference point, the reference point being any point among the sampled model point cloud coordinates;
searching the K-D tree for target points, a target point being a point whose distance to the reference point is less than a first threshold;
computing, in turn, the model point pair features formed by the reference point and each target point.
Further, the step of storing the extracted model point pair features in a hash table specifically includes:
quantizing the extracted model point pair features;
computing a key from the quantization result via a hash function, the key serving as the index of the point pair feature in the hash table;
storing point pair features with the same index in the same bucket of the hash table, and point pair features with different indices in different buckets.
Further, the step of quantizing the extracted scene point pair features, matching them with the model point pair features stored in the hash table, and obtaining multiple candidate object poses specifically includes:
quantizing the extracted scene point pair features;
expanding the quantization result to compensate for feature shifts caused by noise;
using the multiple expanded result values as keys and searching the hash table for model point pair features with the same keys;
obtaining multiple candidate object poses from the matched model point pair features.
Further, the step of inputting a color image and extracting a scene edge point cloud specifically includes:
converting the color image to grayscale;
performing edge detection on the grayscale image with an edge detector;
establishing a one-to-one correspondence between the pixels located at image edges and the depth image, and computing the spatial coordinates of those pixels from the camera intrinsics;
extracting those spatial coordinates as the scene edge point cloud.
Further, the step of screening the candidate object poses according to the scene edge point cloud specifically includes:
projecting, according to the camera intrinsics, the object three-dimensional model corresponding to each candidate object pose onto the imaging plane to obtain the edge point cloud of the three-dimensional model;
selecting any point of the three-dimensional model edge point cloud as a reference point and finding the corresponding matching point in the scene edge point cloud, the matching point being the point closest to the reference point;
computing a first distance, the first distance being the distance from the matching point to the reference point; if the first distance is less than a second threshold, retaining the matching point, otherwise discarding it;
computing the ratio of the number of retained matching points to the total number of points in the three-dimensional model edge point cloud; if the ratio is greater than a third threshold, retaining the candidate object pose corresponding to the three-dimensional model, otherwise discarding it.
Further, the step of clustering the screened candidate object poses to obtain multiple preliminary object poses specifically includes:
selecting any one of the screened candidate object poses as a first candidate object pose;
computing the distances between the first candidate object pose and each of the other screened candidate object poses;
initializing each of the screened candidate object poses as its own cluster;
clustering the screened candidate object poses by hierarchical clustering;
extracting, from each cluster, the candidate object pose with the highest voting score to obtain multiple preliminary object poses.
In another aspect, an embodiment of the present invention further provides an object pose measurement device comprising a memory and a processor, the memory being configured to store at least one program and the processor being configured to load the at least one program to execute the object pose measurement method described above.
In another aspect, an embodiment of the present invention further provides a storage medium storing processor-executable instructions which, when executed by a processor, carry out the object pose measurement method described above.
The beneficial effects of the present invention are: (1) the present invention provides a more efficient model sampling strategy that reduces the number of points, and thus the subsequent computation, while retaining sufficient information about surface variations of the object; (2) it limits the distance range used when computing point pair features, reducing the point pair feature computation for both model and scene data and also reducing matching interference from excessive background points; (3) it proposes a quantization expansion method that reduces the feature shift that noise introduces into point pair feature computation; (4) it introduces color image information as additional input, extracts edge information from the color image, screens the candidate object poses with it, and performs ICP registration, improving measurement accuracy and thus giving more accurate scene recognition under occlusion, clutter, stacking, and similar conditions.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of the object pose measurement method according to the embodiment;
Fig. 2 is a flowchart of the steps of the offline modeling stage according to the embodiment;
Fig. 3 is a flowchart of the steps of the online matching stage according to the embodiment.
Detailed Description of the Embodiments
As shown in Fig. 1, this embodiment provides an object pose measurement method comprising an offline modeling stage and an online matching stage.
The offline modeling stage includes:
inputting a three-dimensional model of the object, the three-dimensional model containing model point cloud coordinates and model point cloud normal vectors;
sampling the model point cloud coordinates and model point cloud normal vectors;
constructing a feature set from the sampled model point cloud coordinates and normal vectors, and computing model point pair features;
storing the extracted model point pair features in a hash table.
The online matching stage includes:
inputting a depth image, computing from the camera intrinsics the scene point cloud coordinates corresponding to each pixel of the depth image, and computing scene point cloud normal vectors from those coordinates;
sampling the scene point cloud coordinates and scene point cloud normal vectors;
constructing a feature set from the sampled scene point cloud coordinates and normal vectors, and computing scene point pair features;
quantizing the extracted scene point pair features and matching them with the model point pair features stored in the hash table to obtain multiple candidate object poses;
inputting a color image and extracting a scene edge point cloud;
screening the candidate object poses according to the scene edge point cloud;
clustering the screened candidate object poses to obtain multiple preliminary object poses;
registering the preliminary object poses with the iterative closest point algorithm to obtain final object poses.
In this embodiment, the offline modeling stage mainly performs feature modeling of the object's three-dimensional model and stores the result for subsequent scene pose measurement, while the online matching stage performs object pose measurement on a given RGB-D image of the scene.
Further, referring to Fig. 2, the offline modeling stage includes the following steps:
S1. Input a three-dimensional model of the object, the three-dimensional model containing model point cloud coordinates and model point cloud normal vectors.
S2. Sample the model point cloud coordinates and model point cloud normal vectors.
S3. Construct a feature set from the sampled model point cloud coordinates and normal vectors, and compute model point pair features.
S4. Store the extracted model point pair features in a hash table.
In this embodiment, this stage is called the offline modeling stage because no scene image needs to be input. In step S1 of this stage, the input three-dimensional model of the object does not need to carry texture, color, or similar information; only the point cloud coordinates and point cloud normal vectors are kept, for example a CAD model obtained by computer modeling or a 3D model obtained by 3D reconstruction. This reduces the complexity of building the three-dimensional model.
Step S2, i.e. sampling the model point cloud coordinates and model point cloud normal vectors, specifically includes the following steps:
S201. Compute, from the model point cloud coordinates, the bounding box enclosing the model point cloud to obtain the model point cloud space.
S202. Rasterize the model point cloud space into multiple grid cells of equal size, each cell containing multiple points, each point having corresponding model point cloud coordinates and a model point cloud normal vector.
S203. Cluster the points contained in each grid cell according to the angles between their model point cloud normal vectors.
S204. Average the model point cloud coordinates and normal vectors within each cluster to obtain the sampled model point cloud coordinates and normal vectors of each grid cell.
In this embodiment, the bounding box enclosing the point cloud is computed from the maximum and minimum model point cloud coordinates along the X, Y, and Z axes, giving the model point cloud space; the diagonal length of the bounding box is denoted the model diameter d_m. The model point cloud space is rasterized so that each grid cell is a small cube and all cells have equal size. The cell size is set to τ × d_m, where τ is the sampling coefficient, set to 0.05. Each cell contains multiple points, each with its model point cloud coordinates and normal vector. Within each cell, the points are clustered according to the angles between their normal vectors: points whose normal angle does not exceed a threshold Δθ belong to the same cluster. The model point cloud coordinates and normal vectors within each cluster are then averaged to obtain the sampled coordinates and normal vectors of that cell. This sampling strategy is applied to all cells in the model space, with Δθ generally set to a fixed angular threshold. Such a strategy both reduces the subsequent computation and limits the loss of surface-variation information caused by sampling, so the information that distinguishes point pairs is also preserved.
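A minimal sketch of this normal-aware grid sampling, for illustration only; the function name and the Δθ value of 30° are assumptions, and the clustering here is a simple greedy grouping by normal angle rather than any specific clustering algorithm named in the patent.

```python
import numpy as np

def normal_aware_downsample(points, normals, d_m, tau=0.05, delta_theta=np.deg2rad(30)):
    """Grid sampling that averages points per cell *per normal cluster*.

    points, normals: (N, 3) arrays (normals assumed unit length);
    d_m: model diameter; tau: sampling coefficient; delta_theta: assumed angle threshold.
    """
    cell = tau * d_m
    keys = np.floor((points - points.min(0)) / cell).astype(np.int64)
    order = np.lexsort(keys.T)                       # sort so equal cells are adjacent
    keys, points, normals = keys[order], points[order], normals[order]
    splits = np.flatnonzero(np.any(np.diff(keys, axis=0), axis=1)) + 1
    out_pts, out_nrm = [], []
    for idx in np.split(np.arange(len(points)), splits):
        p, n = points[idx], normals[idx]
        used = np.zeros(len(idx), dtype=bool)
        for i in range(len(idx)):                    # greedy clustering by normal angle
            if used[i]:
                continue
            cos = np.clip(n @ n[i], -1.0, 1.0)
            member = (~used) & (np.arccos(cos) <= delta_theta)
            used |= member
            out_pts.append(p[member].mean(0))
            m = n[member].mean(0)
            out_nrm.append(m / np.linalg.norm(m))
    return np.asarray(out_pts), np.asarray(out_nrm)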
Step S3, i.e. constructing a feature set from the sampled model point cloud coordinates and normal vectors and computing model point pair features, specifically includes the following steps:
S301. Construct a K-D tree over the sampled model point cloud coordinates.
S302. Select a reference point, the reference point being any point among the sampled model point cloud coordinates.
S303. Search the K-D tree for target points, a target point being a point whose distance to the reference point is less than a first threshold.
S304. Compute, in turn, the model point pair features formed by the reference point and each target point.
In this embodiment, a K-D tree is constructed over the sampled model point cloud coordinates to allow fast distance search later. Each sampled point is taken in turn as a reference point, and the K-D tree is searched for target points, i.e. points whose distance to the reference point is less than d_range; this d_range is the first threshold. The model point pair features formed by the reference point and each target point are then computed in turn. The computation is as follows: let the reference point be m_r with normal vector n_r and the target point be m_s with normal vector n_s; the model point pair feature of this pair is F_{r,s} = (‖d_{r,s}‖, ∠(n_r, d_{r,s}), ∠(n_s, d_{r,s}), ∠(n_r, n_s)), where d_{r,s} is the distance vector from m_r to m_s, ‖d_{r,s}‖ is its length, ∠(n_r, d_{r,s}) is the angle between the normal n_r and the distance vector d_{r,s}, ∠(n_s, d_{r,s}) is the angle between the normal n_s and d_{r,s}, and ∠(n_r, n_s) is the angle between the two normals. The first threshold d_range is derived from d_min and d_med, the two shorter edges of the model bounding box. Under most viewpoints of a real scene, the visible part of the model is usually shorter than d_m but longer than d_min, so constructing features for widely separated point pairs in the scene would take many background points into account, increasing both the false recognition rate of the algorithm and the cost of feature construction. This embodiment therefore uses d_range as an upper bound on the pair distance to reduce these effects; that is, the distance range for computing point pair features is limited, which reduces the point pair feature computation for both model and scene data and also reduces matching interference from excessive background points.
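For illustration, a sketch of the radius-limited pair construction using SciPy's cKDTree; it reuses the point_pair_feature helper sketched earlier, and the caller must supply d_range, since the patent's exact formula for it is not reproduced above.

```python
import numpy as np
from scipy.spatial import cKDTree

def model_pair_features(points, normals, d_range):
    """Compute PPFs only for pairs closer than d_range (the first threshold)."""
    tree = cKDTree(points)
    features = []                                   # list of (r, s, 4D feature)
    for r, (p_r, n_r) in enumerate(zip(points, normals)):
        for s in tree.query_ball_point(p_r, d_range):
            if s == r:
                continue
            features.append((r, s, point_pair_feature(p_r, n_r, points[s], normals[s])))
    return features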
Step S4, i.e. storing the extracted model point pair features in a hash table, specifically includes the following steps:
S401. Quantize the extracted model point pair features.
S402. Compute a key from the quantization result via a hash function, the key serving as the index of the point pair feature in the hash table.
S403. Store point pair features with the same index in the same bucket of the hash table, and point pair features with different indices in different buckets.
In this embodiment, each extracted model point pair feature F_{r,s} = (‖d_{r,s}‖, ∠(n_r, d_{r,s}), ∠(n_s, d_{r,s}), ∠(n_r, n_s)) is quantized component-wise to obtain Q_{r,s}: the distance component is quantized with step Δd, generally set to Δd = 0.05 d_m, and the angle components are quantized with a fixed angular step. The quantized value Q_{r,s} is passed through a hash function to obtain a key, a non-negative integer, which serves as the index of the point pair feature in the hash table; point pair features with the same index are stored in the same bucket of the hash table, and point pair features with different indices are stored in different buckets.
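A sketch of the quantization and bucketed storage, assuming floor quantization and a Python dict (keyed by the quantized tuple) as the hash table; the angular step of 12° is an assumed value, since the patent's angular step is not reproduced above. It consumes the (r, s, feature) triples produced by the model_pair_features sketch.

```python
import numpy as np
from collections import defaultdict

def quantize_ppf(f, d_step, angle_step=np.deg2rad(12)):
    """Floor-quantize a 4D PPF: distance by d_step, the three angles by angle_step."""
    return (int(f[0] // d_step),
            int(f[1] // angle_step),
            int(f[2] // angle_step),
            int(f[3] // angle_step))

def build_hash_table(pair_features, d_m):
    """Bucket (reference, target) index pairs by their quantized PPF key."""
    table = defaultdict(list)            # quantized tuple -> bucket of point pairs
    d_step = 0.05 * d_m                  # Δd = 0.05 * model diameter, per the text
    for r, s, f in pair_features:
        table[quantize_ppf(f, d_step)].append((r, s))
    return table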
In the online matching stage, the color image and the depth image of the scene are input separately. Referring to Fig. 3, this stage includes the following steps:
D1. Input a depth image, compute from the camera intrinsics the scene point cloud coordinates corresponding to each pixel of the depth image, and compute scene point cloud normal vectors from those coordinates.
D2. Sample the scene point cloud coordinates and scene point cloud normal vectors.
D3. Construct a feature set from the sampled scene point cloud coordinates and normal vectors, and compute scene point pair features.
D4. Quantize the extracted scene point pair features and match them with the model point pair features stored in the hash table to obtain multiple candidate object poses.
D5. Input a color image and extract the scene edge point cloud.
D6. Screen the candidate object poses according to the scene edge point cloud.
D7. Cluster the screened candidate object poses to obtain multiple preliminary object poses.
D8. Register the preliminary object poses with the iterative closest point algorithm to obtain final object poses.
In step D1, the camera imaging model is u = f_x · X/Z + c_x, v = f_y · Y/Z + c_y, where u, v are the coordinates of a point on the imaging plane, X, Y, Z are its 3D coordinates in the camera coordinate system, and f_x, f_y, c_x, c_y are the camera intrinsics. From the camera intrinsics, the 3D spatial coordinates corresponding to each pixel of the input depth image, i.e. the scene point cloud coordinates, can therefore be computed, and the corresponding scene point cloud normal vectors can be estimated from those coordinates.
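An illustrative back-projection sketch under this pinhole model; the depth scale (depth stored in millimetres) is an assumption about the sensor, not something the patent states.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project a depth image (H, W) to an (M, 3) point cloud in camera coordinates.

    depth_scale converts raw depth units to metres (millimetre input assumed).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64) * depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]            # drop pixels without a valid depth reading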
Regarding step D2, i.e. sampling the scene point cloud coordinates and scene point cloud normal vectors, this embodiment applies the same sampling strategy as step S2 of offline modeling to the scene point cloud coordinates and normal vectors. This likewise reduces the number of points, and thus the subsequent computation, while retaining sufficient information about surface variations of the object.
Similarly, regarding step D3, this embodiment computes the scene point pair features with the same method as step S3 of the offline modeling stage. Specifically: (1) construct a K-D tree over the scene point cloud coordinates sampled in step D2; (2) assuming the number of sampled scene points from step D2 is N, take one point out of every n points as a reference point, giving roughly N/n reference points in total; (3) for each reference point, search the K-D tree for points whose distance to the reference point is less than d_range and construct the scene point pair features; the setting of d_range and the computation of the scene point pair features are the same as in step S3 of the offline modeling stage and are not repeated here.
Step D4, i.e. quantizing the extracted scene point pair features, matching them with the model point pair features stored in the hash table, and obtaining multiple candidate object poses, specifically includes the following steps:
D401. Quantize the extracted scene point pair features.
D402. Expand the quantization result to compensate for feature shifts caused by noise.
D403. Use the multiple expanded result values as keys and search the hash table for model point pair features with the same keys.
D404. Obtain multiple candidate object poses from the matched model point pair features.
In this embodiment: (1) each scene point pair feature F = (F_1, F_2, F_3, F_4) computed in step D3 is quantized in the same way as in offline modeling step S4, giving the quantized 4-dimensional feature Q = (Q_1, Q_2, Q_3, Q_4). (2) To reduce the effect of noise on the quantized matching, the quantized value Q is expanded as follows. For i = 1, 2, 3, 4, let e_i denote the quantization error of the i-th dimension, where F_i is the value of the i-th dimension of F and Δ is the quantization interval. Taking i = 1 as an example: when e_1 falls near the lower end of the quantization interval, the additional value Q_new = (Q_1 − 1, Q_2, Q_3, Q_4) is generated; when e_1 falls near the upper end, Q_new = (Q_1 + 1, Q_2, Q_3, Q_4) is generated; otherwise Q is not expanded in that dimension. In this way a single scene point pair feature F can be quantized into at most 16 quantized features, compensating for the feature shift caused by noise. (3) Let the i-th reference point be s_i, which builds n_i point pair features. For s_i, a voting matrix is created: each row of the matrix represents a point of the image scene point cloud and each column represents a quantized rotation angle, with a fixed angular interval. Each matrix coordinate (m, α) means that, after the normal vector of point m in the image scene has been rotated to be parallel to the X axis and the point has been translated to the origin, the line from point m to the other point still needs to be rotated about the X axis by the angle α; the matrix is initialized to all zeros. (4) The multiple expanded Q values are used as keys to look up model point pair features with the same key in the hash table. For each retrieved point pair feature, the corresponding point m′ in the image scene and the rotation angle α′ are computed, and the value of the voting matrix at (m′, α′) is incremented by 1. (5) After the voting for each reference point s_i is finished, the row and column coordinates (m, α) of the maximum value in the voting matrix are extracted and used to compute the object pose of the image scene, and the value at (m, α) is recorded as the score of that pose. (6) The object poses of the image scene computed from all reference points are retained as the candidate object poses.
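A sketch of the quantization expansion in (2), for illustration only; the boundary fraction of 0.2 deciding when a neighbouring bin is added is an assumed value, since the patent's threshold is not reproduced above.

```python
import itertools

def expand_quantized(f, q, steps, boundary=0.2):
    """Expand a quantized 4D feature to neighbouring bins when a component lies near
    a bin boundary (within `boundary` of either end of its interval).

    f: raw 4D feature, q: its floor-quantized bins, steps: per-dimension bin widths.
    Returns a list of up to 16 candidate quantized keys.
    """
    per_dim = []
    for fi, qi, step in zip(f, q, steps):
        e = fi / step - qi                 # position inside the bin, in [0, 1)
        alts = [qi]
        if e < boundary:
            alts.append(qi - 1)            # near the lower boundary: also try the previous bin
        elif e > 1.0 - boundary:
            alts.append(qi + 1)            # near the upper boundary: also try the next bin
        per_dim.append(alts)
    return list(itertools.product(*per_dim))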
Step D5, i.e. inputting a color image and extracting a scene edge point cloud, specifically includes the following steps:
D501. Convert the color image to grayscale.
D502. Perform edge detection on the grayscale image with an edge detector.
D503. Establish a one-to-one correspondence between the pixels located at image edges and the depth image, and compute the spatial coordinates of those pixels from the camera intrinsics.
D504. Extract those spatial coordinates as the scene edge point cloud.
In this embodiment, the input color image is converted to grayscale, and a Canny edge detector is used to extract the edge pixels of the image. The pixels located at edges are put into one-to-one correspondence with the depth image, and their spatial coordinates are computed from the camera intrinsics; these points are called the scene edge point cloud.
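A sketch of step D5 using OpenCV's Canny detector together with the back-projection idea above; the Canny thresholds and the millimetre depth scale are assumed values that would need tuning for a real sensor.

```python
import cv2
import numpy as np

def scene_edge_point_cloud(color, depth, fx, fy, cx, cy, depth_scale=0.001):
    """Extract the 3D coordinates of color-image edge pixels (the scene edge point cloud)."""
    gray = cv2.cvtColor(color, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)              # thresholds are placeholder values
    v, u = np.nonzero(edges)                      # pixel coordinates of edge points
    z = depth[v, u].astype(np.float64) * depth_scale
    keep = z > 0                                  # keep only edge pixels with valid depth
    u, v, z = u[keep], v[keep], z[keep]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)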
Step D6, i.e. screening the candidate object poses according to the scene edge point cloud, specifically includes the following steps:
D601. Project, according to the camera intrinsics, the object three-dimensional model corresponding to each candidate object pose onto the imaging plane, and obtain the edge point cloud of the three-dimensional model.
D602. Select any point of the three-dimensional model edge point cloud as a reference point and find the corresponding matching point in the scene edge point cloud, the matching point being the point closest to the reference point.
D603. Compute a first distance, the first distance being the distance from the matching point to the reference point; if the first distance is less than a second threshold, retain the matching point, otherwise discard it.
D604. Compute the ratio of the number of retained matching points to the total number of points in the three-dimensional model edge point cloud; if the ratio is greater than a third threshold, retain the candidate object pose corresponding to the three-dimensional model, otherwise discard it.
In this embodiment, for each candidate object pose obtained in step D4, the three-dimensional model under that pose is projected onto the imaging plane according to the camera intrinsics, giving the edge point cloud of the three-dimensional model under each candidate pose. For each point of the model edge point cloud, the closest point in the scene edge point cloud extracted in step D5 is found; if this shortest distance is less than d_ε, i.e. the second threshold, generally set to d_ε = 0.1 d_m, the point is said to be correctly matched and is called a matching point. The ratio of the number of retained matching points to the total number of points in the model edge point cloud is then computed; if this ratio is greater than the third threshold, the candidate object pose corresponding to the three-dimensional model is retained, otherwise it is discarded.
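A sketch of this screening ratio, assuming the projected model edge points are already available as a 3D point set for each candidate; min_ratio (the third threshold) is a placeholder value, since the patent does not state one here.

```python
import numpy as np
from scipy.spatial import cKDTree

def edge_match_ratio(model_edge_pts, scene_edge_pts, d_eps):
    """Fraction of model edge points whose nearest scene edge point is closer than d_eps."""
    dists, _ = cKDTree(scene_edge_pts).query(model_edge_pts)
    return float(np.mean(dists < d_eps))

def screen_poses(candidates, scene_edge_pts, d_m, min_ratio=0.5):
    """Keep candidate poses whose projected model edges overlap the scene edges well enough.

    candidates: list of (pose, model_edge_pts) pairs; min_ratio is an assumed threshold.
    """
    d_eps = 0.1 * d_m                    # second threshold, per the text
    return [(pose, pts) for pose, pts in candidates
            if edge_match_ratio(pts, scene_edge_pts, d_eps) > min_ratio]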
Step D7, i.e. clustering the screened candidate object poses to obtain multiple preliminary object poses, specifically includes the following steps:
D701. Select any one of the screened candidate object poses as a first candidate object pose.
D702. Compute the distances between the first candidate object pose and each of the other screened candidate object poses.
D703. Initialize each of the screened candidate object poses as its own cluster.
D704. Cluster the screened candidate object poses by hierarchical clustering.
D705. Extract, from each cluster, the candidate object pose with the highest voting score to obtain multiple preliminary object poses.
In this embodiment, the pairwise distances between the candidate object poses obtained after the screening of step D6 are computed. The distance D between two poses has two components: the translation difference Δdist and the rotation difference Δrot. Two poses are considered close when Δdist < d_cluster and Δrot < rot_cluster, i.e. the translation difference Δdist must be smaller than the inter-cluster spatial distance threshold d_cluster, generally set to one tenth of the object diameter, and the rotation difference Δrot must be smaller than the inter-cluster rotation angle threshold rot_cluster, generally set to 30 degrees. If N candidate object poses remain after the screening of step D6, the N poses are each initialized as one of N clusters, and neighboring clusters are merged by hierarchical clustering until the distance between every pair of clusters exceeds the thresholds, at which point clustering ends. The candidate object pose with the highest number of votes in each cluster is then extracted, giving multiple preliminary object poses.
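A sketch of this agglomerative pose clustering, representing each pose as a 4×4 transform with an associated vote score; the trace-based rotation distance and the greedy single-linkage merge loop are illustrative choices rather than the patent's exact procedure.

```python
import numpy as np

def pose_distance(T1, T2):
    """Translation difference and rotation angle (radians) between two 4x4 poses."""
    d_trans = np.linalg.norm(T1[:3, 3] - T2[:3, 3])
    R = T1[:3, :3].T @ T2[:3, :3]
    d_rot = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return d_trans, d_rot

def _clusters_close(ca, cb, poses, d_cluster, rot_cluster):
    # Two clusters are neighbours if any member pair is close in translation and rotation.
    for i in ca:
        for j in cb:
            dt, dr = pose_distance(poses[i], poses[j])
            if dt < d_cluster and dr < rot_cluster:
                return True
    return False

def cluster_poses(poses, scores, d_cluster, rot_cluster=np.deg2rad(30)):
    """Greedy agglomerative clustering; returns the best-scoring pose of each cluster."""
    clusters = [[i] for i in range(len(poses))]       # every pose starts as its own cluster
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if _clusters_close(clusters[a], clusters[b], poses, d_cluster, rot_cluster):
                    clusters[a].extend(clusters[b])
                    del clusters[b]
                    merged = True
                    break
            if merged:
                break
    return [poses[max(c, key=lambda i: scores[i])] for c in clusters]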
Finally, for the multiple preliminary object poses obtained in step D7, this embodiment further obtains the model contour point cloud under each preliminary pose and performs ICP registration between each contour point cloud and the scene edge point cloud extracted in step D5, obtaining the final, high-accuracy object poses.
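A compact point-to-point ICP sketch for this final refinement step, for illustration only; in practice an off-the-shelf ICP implementation from a point cloud library would typically be used instead.

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch/SVD)."""
    cs, cd = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                      # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp_refine(src, dst, iters=30):
    """Refine the alignment of src (model contour points) to dst (scene edge points)."""
    tree = cKDTree(dst)
    T = np.eye(4)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                  # closest scene point for each model point
        R, t = rigid_fit(cur, dst[idx])
        cur = cur @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
    return T                                      # incremental correction to apply to src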
In summary, the object pose measurement method according to the embodiment of the present invention has the following advantages:
(1) The embodiment provides a more efficient model sampling strategy that reduces the number of points, and thus the subsequent computation, while retaining sufficient information about surface variations of the object. (2) It limits the distance range used when computing point pair features, reducing the point pair feature computation for both model and scene data and also reducing matching interference from excessive background points. (3) It proposes a quantization expansion method that reduces the feature shift that noise introduces into point pair feature computation. (4) It introduces color image information as additional input, extracts edge information from the color image, screens the candidate object poses with it, and performs ICP registration, improving measurement accuracy and thus giving more accurate scene recognition under occlusion, clutter, stacking, and similar conditions.
In this embodiment, the object pose measurement device includes a memory and a processor; the memory is configured to store at least one program, and the processor is configured to load the at least one program to execute the object pose measurement method.
The memory may also be produced separately and used to store a computer program corresponding to the object pose measurement method. When this memory is connected to a processor, the computer program stored in it is read and executed by the processor, thereby implementing the object pose measurement method and achieving the technical effects described in the embodiment.
需要说明的是,如无特殊说明,当某一特征被称为“固定”、“连接”在另一个特征,它可以直接固定、连接在另一个特征上,也可以间接地固定、连接在另一个特征上。此外,本公开中所使用的上、下、左、右等描述仅仅是相对于附图中本公开各组成部分的相互位置关系来说的。在本公开中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。此外,除非另有定义,本实施例所使用的所有的技术和科学术语与本技术领域的技术人员通常理解的含义相同。本实施例说明书中所使用的术语只是为了描述具体的实施例,而不是为了限制本发明。本实施例所使用的术语“和/或”包括一个或多个相关的所列项目的任意的组合。It should be noted that, unless otherwise specified, when a feature is called "fixed" or "connected" to another feature, it can be directly fixed and connected to another feature, or indirectly fixed and connected to another feature. on a feature. In addition, descriptions such as up, down, left, and right used in the present disclosure are only relative to the mutual positional relationship of the components of the present disclosure in the drawings. As used in this disclosure, the singular forms "a", "the", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. In addition, unless otherwise defined, all technical and scientific terms used in this embodiment have the same meaning as commonly understood by those skilled in the art. The terms used in the description of this embodiment are only for describing specific embodiments, not for limiting the present invention. The term "and/or" used in this embodiment includes any combination of one or more related listed items.
应当理解,尽管在本公开可能采用术语第一、第二、第三等来描述各种元件,但这些元件不应限于这些术语。这些术语仅用来将同一类型的元件彼此区分开。例如,在不脱离本公开范围的情况下,第一元件也可以被称为第二元件,类似地,第二元件也可以被称为第一元件。本实施例所提供的任何以及所有实例或示例性语言(“例如”、“如”等)的使用仅意图更好地说明本发明的实施例,并且除非另外要求,否则不会对本发明的范围施加限制。It should be understood that although the terms first, second, third etc. may be used in the present disclosure to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish elements of the same type from one another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language ("such as", "such as", etc.) provided in the examples is intended merely to better illuminate the examples of the invention and will not cast a shadow on the scope of the invention unless otherwise claimed impose restrictions.
应当认识到,本发明的实施例可以由计算机硬件、硬件和软件的组合、或者通过存储在非暂时性计算机可读存储器中的计算机指令来实现或实施。所述方法可以使用标准编程技术-包括配置有计算机程序的非暂时性计算机可读存储介质在计算机程序中实现,其中如此配置的存储介质使得计算机以特定和预定义的方式操作——根据在具体实施例中描述的方法和附图。每个程序可以以高级过程或面向对象的编程语言来实现以与计算机系统通信。然而,若需要,该程序可以以汇编或机器语言实现。在任何情况下,该语言可以是编译或解释的语言。此外,为此目的该程序能够在编程的专用集成电路上运行。It should be appreciated that embodiments of the invention may be realized or implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods can be implemented in a computer program using standard programming techniques - including a non-transitory computer-readable storage medium configured with a computer program, where the storage medium so configured causes the computer to operate in a specific and predefined manner - according to the specific Methods and Figures described in the Examples. Each program can be implemented in a high-level procedural or object-oriented programming language to communicate with the computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on an application specific integrated circuit programmed for this purpose.
此外,可按任何合适的顺序来执行本实施例描述的过程的操作,除非本实施例另外指示或以其他方式明显地与上下文矛盾。本实施例描述的过程(或变型和/或其组合)可在配置有可执行指令的一个或多个计算机系统的控制下执行,并且可作为共同地在一个或多个处理器上执行的代码(例如,可执行指令、一个或多个计算机程序或一个或多个应用)、由硬件或其组合来实现。所述计算机程序包括可由一个或多个处理器执行的多个指令。Furthermore, operations of processes described in this embodiment may be performed in any suitable order unless otherwise indicated by this embodiment or otherwise clearly contradicted by context. The processes described in this embodiment (or variants and/or combinations thereof) can be executed under the control of one or more computer systems configured with executable instructions, and can be executed as code jointly executed on one or more processors (eg, executable instructions, one or more computer programs, or one or more applications), hardware or a combination thereof. The computer program comprises a plurality of instructions executable by one or more processors.
进一步,所述方法可以在可操作地连接至合适的任何类型的计算平台中实现,包括但不限于个人电脑、迷你计算机、主框架、工作站、网络或分布式计算环境、单独的或集成的计算机平台、或者与带电粒子工具或其它成像装置通信等等。本发明的各方面可以以存储在非暂时性存储介质或设备上的机器可读代码来实现,无论是可移动的还是集成至计算平台,如硬盘、光学读取和/或写入存储介质、RAM、ROM等,使得其可由可编程计算机读取,当存储介质或设备由计算机读取时可用于配置和操作计算机以执行在此所描述的过程。此外,机器可读代码,或其部分可以通过有线或无线网络传输。当此类媒体包括结合微处理器或其他数据处理器实现上文所述步骤的指令或程序时,本实施例所述的发明包括这些和其他不同类型的非暂时性计算机可读存储介质。当根据本发明所述的方法和技术编程时,本发明还包括计算机本身。Further, the method can be implemented in any type of computing platform operably connected to a suitable one, including but not limited to personal computer, minicomputer, main frame, workstation, network or distributed computing environment, stand-alone or integrated computer platform, or communicate with charged particle tools or other imaging devices, etc. Aspects of the invention can be implemented as machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or written storage medium, RAM, ROM, etc., such that they are readable by a programmable computer, when the storage medium or device is read by the computer, can be used to configure and operate the computer to perform the processes described herein. Additionally, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described in this embodiment includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs for implementing the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein.
计算机程序能够应用于输入数据以执行本实施例所述的功能,从而转换输入数据以生成存储至非易失性存储器的输出数据。输出信息还可以应用于一个或多个输出设备如显示器。在本发明优选的实施例中,转换的数据表示物理和有形的对象,包括显示器上产生的物理和有形对象的特定视觉描绘。Computer programs can be applied to input data to perform the functions described in this embodiment, thereby transforming the input data to generate output data stored to non-volatile memory. Output information may also be applied to one or more output devices such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including specific visual depictions of physical and tangible objects produced on a display.
The above are only preferred embodiments of the present invention, and the present invention is not limited to the above-described implementations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention, as long as the technical effect of the present invention is achieved by the same means, shall fall within the protection scope of the present invention. Within the protection scope of the present invention, its technical solutions and/or implementations may have various modifications and variations.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010182093.XA | 2020-03-16 | 2020-03-16 | Object pose measuring method and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010182093.XA | 2020-03-16 | 2020-03-16 | Object pose measuring method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111598946A CN111598946A (en) | 2020-08-28 |
CN111598946B (en) | 2023-03-21 |
Family
ID=72183322
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010182093.XA | Object pose measuring method and device and storage medium | 2020-03-16 | 2020-03-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111598946B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111932628A (en) * | 2020-09-22 | 2020-11-13 | 深圳市商汤科技有限公司 | Pose determination method and device, electronic equipment and storage medium |
CN112734932A (en) * | 2021-01-04 | 2021-04-30 | 深圳辰视智能科技有限公司 | Strip-shaped object unstacking method, unstacking device and computer-readable storage medium |
CN113288122B (en) * | 2021-05-21 | 2023-12-19 | 河南理工大学 | A wearable sitting posture monitoring device and sitting posture monitoring method |
CN113723468B (en) * | 2021-08-06 | 2023-08-04 | 西南科技大学 | Object detection method of three-dimensional point cloud |
CN113627548A (en) * | 2021-08-17 | 2021-11-09 | 熵智科技(深圳)有限公司 | Planar workpiece template matching method, device, medium and computer equipment |
CN114004899B (en) * | 2021-11-12 | 2024-05-14 | 广东嘉腾机器人自动化有限公司 | Pallet pose recognition method, storage medium and equipment |
CN114170312A (en) * | 2021-12-07 | 2022-03-11 | 南方电网电力科技股份有限公司 | Target object pose estimation method and device based on feature fusion |
CN114332410A (en) * | 2021-12-21 | 2022-04-12 | 青岛慧拓智能机器有限公司 | A point cloud background filtering method and system for intelligent roadside unit |
CN114821125B (en) * | 2022-04-08 | 2024-05-14 | 跨维(深圳)智能数字科技有限公司 | Object six-degree-of-freedom attitude estimation method, system, device and medium |
2020-03-16: CN application CN202010182093.XA filed; granted as patent CN111598946B (status: Active)
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107292965A (en) * | 2017-08-03 | 2017-10-24 | 北京航空航天大学青岛研究院 | A kind of mutual occlusion processing method based on depth image data stream |
CN109029322A (en) * | 2018-07-16 | 2018-12-18 | 北京芯合科技有限公司 | A kind of completely new numerical control robot multi-coordinate measuring system and measurement method |
CN110706332A (en) * | 2019-09-25 | 2020-01-17 | 北京计算机技术及应用研究所 | Scene reconstruction method based on noise point cloud |
Non-Patent Citations (1)
Title |
---|
Initial registration of point clouds using camera pose estimation; Guo Qingda et al.; Optics and Precision Engineering; 2017-06-30; Vol. 25 (No. 6); pp. 1-2 *
Also Published As
Publication number | Publication date |
---|---|
CN111598946A (en) | 2020-08-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111598946B (en) | Object pose measuring method and device and storage medium | |
US10679093B2 (en) | Image-based feature detection using edge vectors | |
Lipson et al. | Coupled iterative refinement for 6d multi-object pose estimation | |
US10776936B2 (en) | Point cloud matching method | |
JP6681729B2 (en) | Method for determining 3D pose of object and 3D location of landmark point of object, and system for determining 3D pose of object and 3D location of landmark of object | |
US8467596B2 (en) | Method and apparatus for object pose estimation | |
Azad et al. | Stereo-based 6d object localization for grasping with humanoid robot systems | |
Mohamad et al. | Generalized 4-points congruent sets for 3d registration | |
CN108648194B (en) | Method and device for 3D target recognition, segmentation and pose measurement based on CAD model | |
Azad et al. | 6-DoF model-based tracking of arbitrarily shaped 3D objects | |
JP5261501B2 (en) | Permanent visual scene and object recognition | |
US10803615B2 (en) | Object recognition processing apparatus, object recognition processing method, and program | |
JP6397354B2 (en) | Human area detection apparatus, method and program | |
US9846974B2 (en) | Absolute rotation estimation including outlier detection via low-rank and sparse matrix decomposition | |
US20180204090A1 (en) | Coarse-to-fine search method and image processing device | |
CN108573471B (en) | Image processing apparatus, image processing method, and recording medium | |
Konishi et al. | Real-time 6D object pose estimation on CPU | |
US20120033873A1 (en) | Method and device for determining a shape match in three dimensions | |
Li et al. | RGBD relocalisation using pairwise geometry and concise key point sets | |
Lin et al. | RETRACTED ARTICLE: Scale invariant point feature (SIPF) for 3D point clouds and 3D multi-scale object detection | |
Wang et al. | Joint head pose and facial landmark regression from depth images | |
Rosa et al. | Q-PSO: Fast quaternion-based pose estimation from RGB-D images | |
CN112634377B (en) | Camera calibration method, terminal and computer readable storage medium of sweeping robot | |
JP6544482B2 (en) | Object recognition apparatus, object recognition method and storage medium | |
Wang et al. | A Survey on Approaches of Monocular CAD Model-Based 3D Objects Pose Estimation and Tracking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2024-02-18. Patentee before: SOUTH CHINA UNIVERSITY OF TECHNOLOGY, No. 381, Wushan Road, Tianhe District, Guangzhou, Guangdong 510641, China. Patentee after: Guangzhou South China University of Technology Asset Management Co.,Ltd., Industrial Building, Wushan South China University of Technology, Tianhe District, Guangzhou City, Guangdong Province 510641, China. |
TR01 | Transfer of patent right | Effective date of registration: 2024-03-29. Patentee before: Guangzhou South China University of Technology Asset Management Co.,Ltd., Industrial Building, Wushan South China University of Technology, Tianhe District, Guangzhou City, Guangdong Province 510641, China. Patentee after: Cross dimension (Shenzhen) Intelligent Digital Technology Co.,Ltd., Building 4, 512, Software Industry Base, No. 19, 17, and 18 Haitian Road, Binhai Community, Yuehai Street, Nanshan District, Shenzhen City, Guangdong Province 518057, China. |