CN114972495A - Grasping method, apparatus and computing device for objects with a purely planar structure - Google Patents

Grasping method, apparatus and computing device for objects with a purely planar structure

Info

Publication number: CN114972495A
Authority: CN (China)
Prior art keywords: point, point cloud, edge, points, vector
Legal status: Pending
Application number: CN202110217389.5A
Other languages: Chinese (zh)
Inventors: 魏海永, 盛文波, 刘迪一, 丁有爽, 邵天兰
Current Assignee: Mech Mind Robotics Technologies Co Ltd
Original Assignee: Mech Mind Robotics Technologies Co Ltd
Application filed by: Mech Mind Robotics Technologies Co Ltd
Priority to: CN202110217389.5A
Publication of: CN114972495A

Classifications

    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T1/0014 Image feed-back for automatic industrial control, e.g. robot with camera
    • G06T7/13 Edge detection
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/30164 Workpiece; Machine component
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention discloses a grasping method, apparatus and computing device for objects with a purely planar structure. The method includes: acquiring the point clouds corresponding to multiple objects in the current scene; for each object, performing edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and computing the tangent vector of each 3D point in the edge point cloud; for any two 3D points in the edge point cloud, constructing a point pair containing the two 3D points, and generating a point pair feature vector for the pair from the tangent vectors of the two points; and matching the edge point cloud corresponding to the object against a preset template point cloud according to the point pair feature vectors of the object's point pairs, to obtain the pose information of the object. Because the scheme generates point pair feature vectors from the tangent vectors of the 3D points in the edge point cloud and matches the edge point cloud against the preset template point cloud using these feature vectors, it achieves fast and accurate identification of the pose information of an object.

Description

Grasping method, apparatus and computing device for objects with a purely planar structure

Technical Field

The present invention relates to the field of computer technology, and in particular to a grasping method, apparatus and computing device for objects with a purely planar structure.

Background

With the development of industrial intelligence, robots increasingly replace manual labor in handling objects such as industrial parts and boxes. During operation, a robot typically needs to grasp an object and move it from one position to another, for example picking an object from a conveyor belt and placing it on a pallet or in a cage trolley, or picking an object from a pallet and placing it on a conveyor belt or another pallet as required. However, in the prior art the pose information of the object to be grasped is identified neither accurately nor efficiently, making it difficult to meet the demands of high-speed industrial automation.

Summary of the Invention

In view of the above problems, the present invention is proposed to provide a grasping method, apparatus and computing device for objects with a purely planar structure that overcome, or at least partially solve, the above problems.

According to one aspect of the present invention, a grasping method for objects with a purely planar structure is provided, the method comprising:

acquiring point clouds corresponding to multiple objects in the current scene, where the multiple objects have a purely planar structure;

for each object, performing edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and computing the tangent vector of each 3D point in the edge point cloud;

for any two 3D points in the edge point cloud, constructing a point pair containing the two 3D points, and generating a point pair feature vector for the pair from the tangent vectors of the two points; and

matching the edge point cloud corresponding to the object against a preset template point cloud according to the point pair feature vectors of the object's point pairs, to obtain the pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

According to another aspect of the present invention, a grasping apparatus for objects with a purely planar structure is provided, the apparatus comprising:

a first acquisition module adapted to acquire point clouds corresponding to multiple objects in the current scene, where the multiple objects have a purely planar structure;

an edge extraction module adapted to, for each object, perform edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and compute the tangent vector of each 3D point in the edge point cloud;

a point pair construction module adapted to, for any two 3D points in the edge point cloud, construct a point pair containing the two 3D points and generate a point pair feature vector for the pair from the tangent vectors of the two points; and

a matching module adapted to match the edge point cloud corresponding to the object against a preset template point cloud according to the point pair feature vectors of the object's point pairs, to obtain the pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

According to yet another aspect of the present invention, a computing device is provided, comprising a processor, a memory, a communication interface and a communication bus, where the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the above grasping method for objects with a purely planar structure.

According to still another aspect of the present invention, a computer storage medium is provided, in which at least one executable instruction is stored, the executable instruction causing a processor to perform the operations corresponding to the above grasping method for objects with a purely planar structure.

In the technical solution provided by the present invention, the characteristics of purely planar structures are fully exploited: the edge point cloud is extracted from the point cloud corresponding to the object, and point pair feature vectors are generated from the tangent vectors of the 3D points in the edge point cloud, so that the feature vectors accurately capture the shape of the object. Matching the edge point cloud against the preset template point cloud using these feature vectors yields accurate identification of the object's pose information, which helps the robot perform grasping operations precisely and firmly and avoids failures such as the robot missing the object or the object dropping after being grasped. Moreover, since only the object's edge point cloud participates in the matching process while non-edge points are excluded, the matching workload is effectively reduced and the efficiency of identifying the object's pose information is improved.

The above description is only an overview of the technical solutions of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented according to the contents of this specification, and in order to make the above and other objects, features and advantages of the present invention more apparent, specific embodiments of the present invention are given below.

Brief Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are provided only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:

Fig. 1 shows a schematic flowchart of a grasping method for objects with a purely planar structure according to one embodiment of the present invention;

Fig. 2 shows a schematic flowchart of a grasping method for objects with a purely planar structure according to another embodiment of the present invention;

Fig. 3 shows a structural block diagram of a grasping apparatus for objects with a purely planar structure according to one embodiment of the present invention;

Fig. 4 shows a schematic structural diagram of a computing device according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.

Fig. 1 shows a schematic flowchart of a grasping method for objects with a purely planar structure according to one embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:

Step S101: acquire the point clouds corresponding to multiple objects in the current scene.

The current scene contains multiple objects, each of which has a purely planar structure; for example, an object may be a cuboid-shaped box. In step S101, the point clouds corresponding to the multiple objects in the current scene, obtained by prior processing, can be acquired. A point cloud contains the pose information of each of its 3D points, which may specifically include the coordinate values of the point along the X, Y and Z axes of space as well as the point's own XYZ axis directions. After the point clouds corresponding to the multiple objects have been acquired, each object in the current scene is processed in turn according to steps S102 to S104.

Step S102: for each object, perform edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and compute the tangent vector of each 3D point in the edge point cloud.

For each object, the edge point cloud corresponding to the object can be extracted from the object's point cloud based on 3D analysis or 2D projection, among other approaches. During the invention process, the inventors carefully analyzed the edge point clouds of objects with purely planar structures and found that when objects of different shapes lie flat, the normal vectors of all 3D points in their edge point clouds are identical, pointing vertically upward, whereas their tangent vectors differ; the tangent vectors can therefore be used to reflect the shape of an object. After the edge point cloud of the object has been extracted, the tangent vector of each 3D point in it can be computed with any existing vector-calculation algorithm for tangent vectors in three-dimensional space, so that point pair feature vectors can then be generated from the tangent vectors. Those skilled in the art may select a vector-calculation algorithm according to actual needs, which is not specifically limited here.
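
The patent leaves the tangent-vector algorithm open ("those skilled in the art may select a vector-calculation algorithm"). A minimal sketch of one common choice, assumed here purely for illustration: take each edge point's nearest edge-point neighbors and use the dominant principal component of that neighborhood as the local tangent direction.

```python
import numpy as np

def estimate_tangents(edge_pts, k=10):
    """Estimate a unit tangent vector per edge point via local PCA.

    edge_pts: (N, 3) array of edge-point coordinates. One common choice,
    not prescribed by the patent: the dominant direction of the k nearest
    edge points approximates the tangent of the edge curve at each point.
    """
    tangents = np.zeros_like(edge_pts)
    for i, p in enumerate(edge_pts):
        # k nearest edge points around p (brute force for clarity)
        dists = np.linalg.norm(edge_pts - p, axis=1)
        nbrs = edge_pts[np.argsort(dists)[:k]]
        centered = nbrs - nbrs.mean(axis=0)
        # First right-singular vector = direction of largest variance
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        tangents[i] = vt[0] / np.linalg.norm(vt[0])
    return tangents
```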

Step S103: for any two 3D points in the edge point cloud, construct a point pair containing the two 3D points, and generate a point pair feature vector for the pair from the tangent vectors of the two points.

The edge point cloud corresponding to the object contains the pose information of each 3D point. For any two 3D points in the edge point cloud, a point pair containing the two points is constructed, and its point pair feature vector is generated from the pose information and the tangent vectors of the two points, so that the object corresponds to multiple point pairs. Specifically, the point pair feature vector may include the Euclidean distance of the connecting vector between the two 3D points, the angle between the tangent vector of each of the two points and the connecting vector, and the angle between the tangent vectors of the two points. Those skilled in the art may also include other contents in the point pair feature vector according to actual needs, which is not limited here.

Step S104: match the edge point cloud corresponding to the object against preset template point clouds according to the point pair feature vectors of the object's point pairs, to obtain the pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

To identify the pose information of each object in a scene image conveniently and accurately, a template library containing multiple preset template point clouds is built in advance; a preset template point cloud is a predetermined point cloud of a known object that serves as a matching reference. For each object in the current scene, the edge point cloud corresponding to the object is matched against the preset template point clouds according to the point pair feature vectors of the object's point pairs, yielding the pose information of the object. The pose information of an object may specifically include the coordinate values of the object's center along the X, Y and Z axes of space and the XYZ axis directions of the object itself. Once obtained, the pose information can be transmitted to the robot so that it can perform a grasping operation on the object accordingly.

According to the grasping method for objects with a purely planar structure provided in this embodiment, the characteristics of purely planar structures are fully exploited: the edge point cloud is extracted from the point cloud corresponding to the object, and point pair feature vectors are generated from the tangent vectors of the 3D points in the edge point cloud, so that the feature vectors accurately capture the shape of the object. Matching the edge point cloud against the preset template point clouds using these feature vectors yields accurate identification of the object's pose information, which helps the robot perform grasping operations precisely and firmly and avoids failures such as the robot missing the object or the object dropping after being grasped. Moreover, since only the object's edge point cloud participates in the matching process while non-edge points are excluded, the matching workload is effectively reduced and the efficiency of identifying the object's pose information is improved.

Fig. 2 shows a schematic flowchart of a grasping method for objects with a purely planar structure according to another embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:

Step S201: obtain a scene image of the current scene and the point cloud corresponding to the scene image, input the scene image into a trained deep-learning segmentation model for instance segmentation, and obtain the segmentation result of each object in the scene image.

The current scene contains multiple objects with a purely planar structure. A scene image and a depth image of the current scene can be captured by a camera placed above the scene. The camera may specifically be a 3D camera, placed directly or obliquely overhead, that simultaneously captures the information of the current scene within its field of view to obtain the scene image and the depth image. The 3D camera may include components such as laser detectors, visible-light detectors such as LEDs, infrared detectors and/or radar detectors, which probe the current scene to produce the depth image. The scene image may specifically be an RGB image, and its pixels correspond one-to-one with those of the depth image. By processing the scene image together with the depth image, the point cloud corresponding to the scene image can be conveniently obtained. In step S201, the scene image of the current scene captured by the camera, and the point cloud obtained by processing the scene image and the depth image, can be acquired.
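
Since the RGB scene image and the depth image are pixel-aligned, the scene point cloud can be recovered by back-projecting each depth pixel through the camera intrinsics. A minimal sketch under a standard pinhole model; the intrinsics fx, fy, cx, cy are hypothetical parameters of the 3D camera, not values given in the patent:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (H x W, in meters) into an organized
    point cloud of shape (H, W, 3): one 3D point per image pixel, so the
    cloud stays pixel-aligned with the RGB scene image."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```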

To segment each object contained in the scene image conveniently and accurately, sample scene images can be collected in advance to build a training sample set, and a deep-learning algorithm is used to train on the sample scene images in the set, finally yielding the deep-learning segmentation model. After the scene image of the current scene has been obtained, it can be input into the trained model, which performs a series of model computations to carry out instance segmentation on every object contained in the scene image, thereby producing the segmentation result of each object.
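
The patent only requires some trained deep-learning segmentation model. As a stand-in for illustration, the sketch below uses torchvision's pretrained Mask R-CNN; the model choice and score threshold are assumptions, not the patent's prescribed method:

```python
import torch
import torchvision

# Stand-in for the trained segmentation model described above; any
# instance-segmentation network would do.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def segment_objects(rgb, score_thresh=0.7):
    """rgb: H x W x 3 uint8 scene image. Returns one binary mask (H x W)
    per detected object: the binarized segmentation images in which the
    object region is white (True) and the rest is black (False)."""
    x = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([x])[0]
    keep = out["scores"] > score_thresh
    return [(m[0] > 0.5).numpy() for m in out["masks"][keep]]
```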

Step S202: determine the point cloud corresponding to each object according to the point cloud corresponding to the scene image and the segmentation result of each object.

The segmentation result of each object may include a binarized segmentation image of the object. For each object, its binarized segmentation image may contain the object region where the object is located and the non-object region outside it; the object region may specifically be rendered as a white area and the non-object region as a black area.

Then, for each object, the point cloud corresponding to the scene image can be projected into the object's binarized segmentation image, and the 3D points that project into the object region of the image are taken as the 3D points corresponding to the object, yielding the point cloud corresponding to the object. Specifically, all 3D points in the point cloud corresponding to the scene image are projected; if a 3D point falls within the white object region after projection, it is considered to belong to the object, i.e., it is a 3D point corresponding to the object. All such points are collected to obtain the point cloud corresponding to the object. This procedure determines the object's point cloud accurately.
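
With an organized point cloud that is pixel-aligned with the scene image (one 3D point per pixel, as in the back-projection sketch above), the projection into the binarized segmentation image reduces to boolean indexing. A minimal sketch under that assumption:

```python
import numpy as np

def object_point_cloud(organized_cloud, mask):
    """organized_cloud: (H, W, 3) scene cloud aligned with the image.
    mask: (H, W) binarized segmentation image of one object.
    Keeps the 3D points whose pixels fall in the white object region
    and drops invalid zero-depth points."""
    pts = organized_cloud[mask.astype(bool)]
    return pts[pts[:, 2] > 0]
```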

Step S203: acquire the point clouds corresponding to the multiple objects in the current scene.

Step S204: for each object, perform edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and compute the tangent vector of each 3D point in the edge point cloud.

In an optional implementation, the edge point cloud can be extracted in a 3D manner. Specifically, for each 3D point in the point cloud corresponding to the object, the neighboring 3D points located within a preset neighborhood of that point are looked up in the point cloud, yielding a point set that contains the point and its neighbors. For example, a neighborhood radius (such as 1 cm or 0.3 cm) can be set, and the region centered on the 3D point with that radius is configured as its preset neighborhood; the 3D points found within this neighborhood are called neighboring 3D points, and the point together with all of its neighbors forms the point set. Once the point set is obtained, a normal vector can be computed from its points using existing normal-vector calculation methods, and the plane in which the normal vector lies is associated with it.

Next, the connecting lines between the 3D point and each neighboring point in the point set are constructed, and the first angle between the projection of each connecting line onto the plane of the normal vector and a specified reference axis is computed. That is, each connecting line is projected onto the plane of the normal vector, producing one projection line per connecting line, and the angle between each projection line and a specified reference axis in that plane (for example the X axis) is calculated. In this embodiment, to distinguish this angle from the angles introduced below, the angle between a projection line and the specified reference axis in the plane of the normal vector is called the first angle, the angle between a 3D point's tangent vector and the connecting vector is called the second angle, and the angle between the tangent vectors of two 3D points is called the third angle.

After the first angle of each projection line has been obtained, the projection lines can be sorted by their first angles, for example clockwise, counterclockwise, or in descending or ascending order of the first angle. Then the differences between the first angles of adjacent projection lines in the sorted order are computed, the maximum difference is selected, and it is checked whether this maximum exceeds a preset angle-difference threshold. If it does, at least one region around the 3D point contains no other 3D points, and the point is determined to be an edge point; if not, 3D points exist all around it, and it is determined to be a non-edge point. This procedure is applied to every 3D point in the object's point cloud, each point being classified by comparing its maximum angle difference with the preset threshold, and all edge points of the object are then collected to form the edge point cloud corresponding to the object. Those skilled in the art may set the preset angle-difference threshold according to actual needs, which is not specifically limited here.
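
A minimal sketch of this 3D edge test as just described: gather the neighbors within the preset radius, fit the local plane, measure the in-plane angles of the connecting lines against a reference axis, and declare an edge point when the largest angular gap exceeds the threshold. The radius and threshold defaults are placeholders:

```python
import numpy as np

def is_edge_point_3d(p, cloud, radius=0.01, gap_thresh_deg=90.0):
    """Edge test described above: if the in-plane directions from p to its
    neighbors leave an angular gap wider than the threshold, one region
    around p contains no other 3D points, so p is an edge point."""
    dists = np.linalg.norm(cloud - p, axis=1)
    nbrs = cloud[(dists < radius) & (dists > 1e-9)]  # exclude p itself
    if len(nbrs) < 3:
        return True  # nearly isolated points are treated as edge points
    # Least-variance direction of the neighborhood = normal vector;
    # the two dominant directions span the plane of the normal vector
    centered = nbrs - nbrs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    x_axis, y_axis = vt[0], vt[1]  # x_axis plays the reference-axis role
    # First angles: in-plane angles of the projected connecting lines
    vecs = nbrs - p
    angles = np.sort(np.arctan2(vecs @ y_axis, vecs @ x_axis))
    # Angle differences of adjacent sorted directions, incl. wrap-around
    gaps = np.diff(np.concatenate([angles, [angles[0] + 2 * np.pi]]))
    return np.degrees(gaps.max()) > gap_thresh_deg
```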

In another optional implementation, the edge point cloud can be extracted by 2D projection. Specifically, each 3D point in the point cloud corresponding to the object can be projected onto a preset plane (for example the plane of the camera) to obtain a projection image; in the projection image, regions where 3D points project may be marked white and regions without projections black. The contour pixels in the projection image can then be identified, the 3D points corresponding to the contour pixels are determined to be edge points, and all edge points of the object are collected to form the edge point cloud corresponding to the object. An image-recognition algorithm may be used to identify the contour boundary formed in the projection image; the pixels on the contour boundary are determined to be contour pixels, and the 3D points in the object's point cloud that project onto contour pixels are taken as the 3D points corresponding to those pixels. In other words, if a 3D point in the object's point cloud falls on the position of a contour pixel after projection, the point is considered to correspond to that contour pixel, and it is an edge point.
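
A sketch of this 2D-projection variant using OpenCV's contour finder. The top-down projection (simply dropping Z) and the pixel size are simplifying assumptions; the patent allows projecting onto any preset plane such as the camera plane:

```python
import cv2
import numpy as np

def edge_points_2d(obj_pts, pixel_size=0.002):
    """Rasterize the object's 3D points into a binary projection image
    (white where a point projects), find the contour pixels, and return
    the 3D points that project onto the contour, i.e. the edge points."""
    uv = np.floor((obj_pts[:, :2] - obj_pts[:, :2].min(axis=0))
                  / pixel_size).astype(int)
    img = np.zeros((uv[:, 1].max() + 1, uv[:, 0].max() + 1), dtype=np.uint8)
    img[uv[:, 1], uv[:, 0]] = 255
    contours, _ = cv2.findContours(img, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    contour_px = {(int(pt[0][0]), int(pt[0][1]))
                  for c in contours for pt in c}
    on_edge = np.array([(int(u), int(v)) in contour_px for u, v in uv])
    return obj_pts[on_edge]
```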

Either of the above 3D-based and 2D-projection-based approaches conveniently achieves accurate extraction of the edge point cloud corresponding to the object. After the edge point cloud has been extracted, the tangent vector of each of its 3D points can be computed with any existing vector-calculation algorithm for tangent vectors in three-dimensional space, so that point pair feature vectors can then be generated from the tangent vectors.

Step S205: for any two 3D points in the edge point cloud, construct a point pair containing the two 3D points, and generate a point pair feature vector for the pair from the tangent vectors of the two points.

The edge point cloud corresponding to the object contains the pose information of each 3D point, including the coordinate values of the point along the X, Y and Z axes of space. In step S205, the connecting vector between any two 3D points can be constructed from their XYZ coordinates, and the Euclidean distance of the connecting vector is computed; next, the second angle between the tangent vector of each of the two points and the connecting vector, and the third angle between the tangent vectors of the two points, are computed; the point pair feature vector of the pair is then generated from the Euclidean distance of the connecting vector, the second angles and the third angle.

Suppose any two 3D points in the edge point cloud are denoted m1 and m2, the tangent vector of m1 is t1, the tangent vector of m2 is t2, and the connecting vector between m1 and m2 is d = m2 - m1. The point pair feature vector of the pair containing m1 and m2 can then be expressed as (||d||₂, ∠(t1, d), ∠(t2, d), ∠(t1, t2)), where ||d||₂ denotes the Euclidean distance of the connecting vector d, ∠(t1, d) denotes the second angle between the tangent vector t1 of m1 and the connecting vector d, ∠(t2, d) denotes the second angle between the tangent vector t2 of m2 and the connecting vector d, and ∠(t1, t2) denotes the third angle between the tangent vectors t1 and t2.
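
The feature definition above translates directly into code; a minimal sketch:

```python
import numpy as np

def angle(a, b):
    """Unsigned angle between two vectors, in [0, pi]."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def point_pair_feature(m1, t1, m2, t2):
    """F(m1, m2) = (||d||2, ang(t1, d), ang(t2, d), ang(t1, t2)),
    with connecting vector d = m2 - m1, exactly as defined above."""
    d = m2 - m1
    return np.array([np.linalg.norm(d),
                     angle(t1, d), angle(t2, d), angle(t1, t2)])
```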

Step S206: perform a first matching between the point pair feature vectors of the point pairs in the edge point cloud corresponding to the object and the point pair feature vectors of the point pairs in the edge point cloud of the preset template point cloud, to obtain a first matching result.

A preset template point cloud is a predetermined point cloud of a known object that serves as a matching reference, and its edge point cloud is the edge point cloud corresponding to that known object, containing the pose information of each 3D point. For any two 3D points in the edge point cloud of a preset template point cloud, a point pair containing the two points is constructed, and its point pair feature vector is generated from the pose information and the tangent vectors of the two points, so that each preset template point cloud likewise corresponds to multiple point pairs.

After the point pair feature vectors of the point pairs in the object's edge point cloud have been obtained in step S205, they are matched for the first time against the point pair feature vectors of the point pairs in the edge clouds of the preset template point clouds, yielding the first matching result. The pose information of each 3D point is predefined in every preset template point cloud; during the first matching, each preset template point cloud needs to be transformed into the current scene so that the edge point cloud of the pose-transformed template coincides as closely as possible with the edge point cloud of the object in the current scene. The first matching result may thus include multiple pose transformations of the matched preset template point cloud.
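
The patent does not spell out the first-stage matcher. One hedged sketch, in the spirit of classic point-pair-feature voting: quantize the template point pair features into a hash table, then let each scene pair vote for the template pairs that share its quantized feature. The quantization steps d_step and a_step are hypothetical parameters:

```python
import numpy as np
from collections import defaultdict

def quantize(f, d_step=0.005, a_step=np.radians(10)):
    """Discretize a point pair feature vector so similar pairs collide."""
    return (int(f[0] / d_step), int(f[1] / a_step),
            int(f[2] / a_step), int(f[3] / a_step))

def build_template_table(template_pairs):
    """template_pairs: iterable of (feature, pair_id) tuples from the
    preset template point cloud. Builds a hash table from quantized
    feature to the template pairs producing it; this voting-style first
    stage is an assumption, not the patent's prescribed matcher."""
    table = defaultdict(list)
    for f, pair_id in template_pairs:
        table[quantize(f)].append(pair_id)
    return table

def match_scene_pairs(scene_pairs, table):
    """Each scene pair votes for every template pair sharing its
    quantized feature; the accumulated votes yield the candidate
    pose transformations of the first matching result."""
    votes = defaultdict(int)
    for f, pair_id in scene_pairs:
        for tmpl_id in table.get(quantize(f), []):
            votes[(pair_id, tmpl_id)] += 1
    return votes
```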

Step S207: perform a second matching between the pose information of the 3D points in the edge point cloud corresponding to the object and the multiple pose transformations of the matched preset template point cloud, and determine the pose information of the object from the pose transformation of the matched template that achieves the highest matching score in the second matching result.

Since the first matching result may include multiple pose transformations of the matched preset template point cloud, a second matching is performed in this embodiment between the pose information of the 3D points in the object's edge point cloud and those pose transformations, in order to determine the object's pose information more accurately. Specifically, a preset evaluation algorithm can be used to compute a matching score between the pose information of the 3D points in the object's edge point cloud and each pose transformation of the matched template, yielding the second matching result. Those skilled in the art may select the evaluation algorithm according to actual needs, which is not limited here; for example, it may be the ICP (Iterative Closest Point) algorithm, a GMM (Gaussian Mixture Model) based algorithm, or the like. The second matching further optimizes and corrects the first matching result: the pose transformation of the matched preset template point cloud with the highest matching score in the second matching result is taken as the final match, and the object's pose information is determined from it.
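
A sketch of this refinement-and-scoring step using Open3D's ICP as one concrete choice of the "preset evaluation algorithm" (ICP is named in the text as an example); the correspondence distance is a placeholder:

```python
import open3d as o3d

def refine_and_score(scene_edge_pts, template_pts, init_pose,
                     max_corr_dist=0.005):
    """Refine one candidate pose transformation with ICP and return
    (score, refined_pose). The ICP fitness (inlier ratio) serves as
    the matching score; the candidate with the highest score gives
    the object's final pose."""
    scene = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(scene_edge_pts))
    tmpl = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(template_pts))
    result = o3d.pipelines.registration.registration_icp(
        tmpl, scene, max_corr_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.fitness, result.transformation
```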

Every object in the current scene is processed according to steps S204 to S207, yielding the pose information of each object. Since an object's pose information is determined in the camera coordinate system, to allow the robot to locate the object it must be converted into the robot coordinate system using a preset transformation algorithm; the converted pose information is then transmitted to the robot, which performs the grasping operation on the object according to it.
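
With homogeneous 4x4 pose matrices, this coordinate conversion is a single matrix product. A minimal sketch, assuming T_base_cam comes from a hand-eye calibration (the patent only requires some preset transformation algorithm):

```python
import numpy as np

def to_robot_frame(T_cam_obj, T_base_cam):
    """Convert an object pose from the camera frame to the robot base
    frame. T_cam_obj: object pose in the camera frame; T_base_cam:
    camera pose in the robot base frame (hand-eye calibration result).
    Both are 4x4 homogeneous transformation matrices."""
    return T_base_cam @ T_cam_obj
```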

According to the grasping method for objects with a purely planar structure provided in this embodiment, instance segmentation of the scene image with a deep-learning segmentation model achieves accurate segmentation of every object in the image; extracting the edge point cloud from the object's point cloud by 3D analysis or 2D projection achieves accurate edge extraction; generating point pair feature vectors from the tangent vectors of the 3D points in the edge point cloud makes the feature vectors accurately capture the shape of the object; and matching each object's point cloud against the preset template point clouds in two stages according to the point pair feature vectors achieves accurate identification of the object's pose information, which helps the robot perform grasping operations precisely and firmly. Moreover, since only the object's edge point cloud participates in the matching process while non-edge points are excluded, the matching workload is effectively reduced and the efficiency of identifying the object's pose information is improved.

Fig. 3 shows a structural block diagram of a grasping apparatus for objects with a purely planar structure according to one embodiment of the present invention. As shown in Fig. 3, the apparatus includes a first acquisition module 301, an edge extraction module 302, a point pair construction module 303 and a matching module 304.

The first acquisition module 301 is adapted to acquire the point clouds corresponding to multiple objects in the current scene, where the multiple objects have a purely planar structure.

The edge extraction module 302 is adapted to, for each object, perform edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and compute the tangent vector of each 3D point in the edge point cloud.

The point pair construction module 303 is adapted to, for any two 3D points in the edge point cloud, construct a point pair containing the two 3D points and generate a point pair feature vector for the pair from the tangent vectors of the two points.

The matching module 304 is adapted to match the edge point cloud corresponding to the object against preset template point clouds according to the point pair feature vectors of the object's point pairs, to obtain the pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

Optionally, the apparatus may further include a second acquisition module 305, an instance segmentation module 306 and an object point cloud determination module 307. The second acquisition module 305 is adapted to obtain a scene image of the current scene and the point cloud corresponding to the scene image. The instance segmentation module 306 is adapted to input the scene image into a trained deep-learning segmentation model for instance segmentation, obtaining the segmentation result of each object in the scene image. The object point cloud determination module 307 is adapted to determine the point cloud corresponding to each object according to the point cloud corresponding to the scene image and the segmentation results of the objects.

Optionally, the edge extraction module 302 is further adapted to: for each 3D point in the point cloud, look up the neighboring 3D points located within the preset neighborhood of that point, obtain the point set containing the point and its neighbors, and compute a normal vector from the point set; construct the connecting lines between the point and each neighboring point in the set, and compute the first angle between the projection of each connecting line onto the plane of the normal vector and the specified reference axis; sort the projection lines by their first angles and compute the differences between the first angles of adjacent projection lines in the sorted order; determine whether the maximum angle difference exceeds the preset angle-difference threshold, classifying the point as an edge point if so and as a non-edge point otherwise; and collect all edge points to obtain the edge point cloud corresponding to the object.

Optionally, the edge extraction module 302 is further adapted to: project each connecting line onto the plane of the normal vector to obtain the corresponding projection line, and compute the first angle between each projection line and the specified reference axis in that plane.

Optionally, the edge extraction module 302 is further adapted to: project each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projection image; identify the contour pixels in the projection image and determine the 3D points corresponding to the contour pixels to be edge points; and collect all edge points to obtain the edge point cloud corresponding to the object.

Optionally, the edge extraction module 302 is further adapted to: identify the contour boundary formed in the projection image, determine the pixels on the contour boundary to be contour pixels, and take the 3D points in the object's point cloud that project onto contour pixels as the 3D points corresponding to those pixels.

Optionally, the point pair construction module 303 is further adapted to: construct the connecting vector between the two 3D points and compute its Euclidean distance; compute the second angle between the tangent vector of each of the two points and the connecting vector, and the third angle between the tangent vectors of the two points; and generate the point pair feature vector of the pair from the Euclidean distance of the connecting vector, the second angles and the third angle.

Optionally, the matching module 304 is further adapted to: perform a first matching between the point pair feature vectors of the point pairs in the object's edge point cloud and those of the point pairs in the edge point cloud of the preset template point cloud, obtaining a first matching result that includes multiple pose transformations of the matched preset template point cloud; and perform a second matching between the pose information of the 3D points in the object's edge point cloud and those pose transformations, determining the object's pose information from the pose transformation of the matched template with the highest matching score in the second matching result.

Optionally, the matching module 304 is further adapted to: use a preset evaluation algorithm to compute the matching scores between the pose information of the 3D points in the object's edge point cloud and the multiple pose transformations of the matched preset template point cloud.

According to the grasping apparatus for objects with a purely planar structure provided in this embodiment, instance segmentation of the scene image with a deep-learning segmentation model achieves accurate segmentation of every object in the image; extracting the edge point cloud from the object's point cloud by 3D analysis or 2D projection achieves accurate edge extraction; generating point pair feature vectors from the tangent vectors of the 3D points in the edge point cloud makes the feature vectors accurately capture the shape of the object; and matching each object's point cloud against the preset template point clouds in two stages according to the point pair feature vectors achieves accurate identification of the object's pose information, which helps the robot perform grasping operations precisely and firmly. Moreover, since only the object's edge point cloud participates in the matching process while non-edge points are excluded, the matching workload is effectively reduced and the efficiency of identifying the object's pose information is improved.

The present invention also provides a non-volatile computer storage medium storing at least one executable instruction, the executable instruction being able to execute the grasping method for objects with a purely planar structure in any of the above method embodiments.

Fig. 4 shows a schematic structural diagram of a computing device according to an embodiment of the present invention; the specific embodiments of the present invention do not limit the specific implementation of the computing device.

As shown in Fig. 4, the computing device may include a processor 402, a communications interface 404, a memory 406 and a communication bus 408, where:

the processor 402, the communications interface 404 and the memory 406 communicate with one another through the communication bus 408;

the communications interface 404 is configured to communicate with network elements of other devices, such as clients or other servers; and

the processor 402 is configured to execute a program 410, and may specifically perform the relevant steps in the above embodiments of the grasping method for objects with a purely planar structure.

Specifically, the program 410 may include program code comprising computer operation instructions.

The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be processors of the same type, such as one or more CPUs, or of different types, such as one or more CPUs together with one or more ASICs.

The memory 406 is used to store the program 410. The memory 406 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.

The program 410 may specifically be used to cause the processor 402 to execute the method for grasping objects with a purely planar structure in any of the above method embodiments. For the specific implementation of each step in the program 410, reference may be made to the corresponding descriptions of the steps and units in the above grasping embodiments. Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may refer to the corresponding process descriptions in the foregoing method embodiments, which are not repeated here.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein, and the structure required to construct such a system is apparent from the above description. Furthermore, the present invention is not directed to any particular programming language. It should be understood that the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is provided to disclose the best mode of carrying out the invention.

In the description provided herein, numerous specific details are set forth. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in the embodiments may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features included in other embodiments but not others, combinations of features of different embodiments are intended to be within the scope of the invention and to form different embodiments. For example, in the claims, any one of the claimed embodiments may be used in any combination.

Various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components according to the embodiments of the present invention. The present invention may also be implemented as an apparatus or device program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any order; these words may be interpreted as names.

Claims (20)

1. A method for grasping objects with a purely planar structure, the method comprising:
acquiring point clouds corresponding to a plurality of objects in a current scene, wherein the plurality of objects have a purely planar structure;
for each object, performing edge extraction on the point cloud corresponding to the object to obtain an edge point cloud corresponding to the object, and calculating a tangent vector of each 3D point in the edge point cloud;
for any two 3D points in the edge point cloud, constructing a point pair containing the two 3D points, and generating a point pair feature vector of the point pair according to the tangent vectors of the two 3D points;
matching the edge point cloud corresponding to the object against a preset template point cloud according to the point pair feature vectors of the point pairs of the object, to obtain pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

2. The method according to claim 1, wherein, before acquiring the point clouds corresponding to the plurality of objects in the current scene, the method further comprises:
acquiring a scene image of the current scene and a point cloud corresponding to the scene image, inputting the scene image into a trained deep learning segmentation model for instance segmentation, and obtaining a segmentation result for each object in the scene image;
determining the point cloud corresponding to each object according to the point cloud corresponding to the scene image and the segmentation result of each object.
3. The method according to claim 1, wherein, for each object, performing edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object further comprises:
for each 3D point in the point cloud, searching the point cloud for neighboring 3D points located within a preset neighborhood of the 3D point, obtaining a point set containing the 3D point and its neighboring 3D points, and calculating a normal vector from the point set;
constructing a connecting line between the 3D point and each neighboring 3D point in the point set, and calculating a first included angle between the projection line of each connecting line in the plane of the normal vector and a specified reference direction axis;
sorting the projection lines by their first included angles, and calculating the angle differences between the first included angles of adjacent projection lines after sorting;
determining whether the maximum of the angle differences is greater than a preset angle difference threshold; if so, determining the 3D point to be an edge point; if not, determining the 3D point to be a non-edge point;
aggregating all edge points to obtain the edge point cloud corresponding to the object.

4. The method according to claim 3, wherein calculating the first included angle between the projection line of each connecting line in the plane of the normal vector and the specified reference direction axis further comprises:
projecting each connecting line onto the plane of the normal vector to obtain the projection line corresponding to each connecting line;
calculating the first included angle between each projection line and the specified reference direction axis in the plane of the normal vector.

5. The method according to claim 1, wherein, for each object, performing edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object further comprises:
projecting each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projected image;
identifying contour pixels in the projected image, and determining the 3D points corresponding to the contour pixels to be edge points;
aggregating all edge points to obtain the edge point cloud corresponding to the object.

6. The method according to claim 5, wherein identifying the contour pixels in the projected image further comprises:
identifying the contour boundary line formed in the projected image, determining the pixels corresponding to the contour boundary line to be contour pixels, and determining the 3D points in the point cloud of the object that project onto the contour pixels to be the 3D points corresponding to the contour pixels.
7. The method according to any one of claims 1-6, wherein generating the point pair feature vector of the point pair according to the tangent vectors of the two 3D points further comprises:
constructing a connecting-line vector between the two 3D points, and calculating the Euclidean distance of the connecting-line vector;
calculating a second included angle between the tangent vector of each of the two 3D points and the connecting-line vector, and a third included angle between the tangent vectors of the two 3D points;
generating the point pair feature vector of the point pair using the Euclidean distance of the connecting-line vector, the second included angles, and the third included angle.

8. The method according to any one of claims 1-7, wherein matching the edge point cloud corresponding to the object against the preset template point cloud according to the point pair feature vectors of the point pairs of the object to obtain the pose information of the object further comprises:
performing a first matching between the point pair feature vectors of the point pairs in the edge point cloud corresponding to the object and the point pair feature vectors of the point pairs in the edge point cloud of the preset template point cloud to obtain a first matching result, wherein the first matching result includes a plurality of pose transformation relations of the matched preset template point cloud;
performing a second matching between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point cloud, and determining the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in the second matching result.

9. The method according to claim 8, wherein performing the second matching between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point cloud further comprises:
calculating, using a preset evaluation algorithm, matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point cloud.
10. A grasping apparatus for objects with a purely planar structure, the apparatus comprising:
a first acquisition module, adapted to acquire point clouds corresponding to a plurality of objects in a current scene, wherein the plurality of objects have a purely planar structure;
an edge extraction module, adapted to perform, for each object, edge extraction on the point cloud corresponding to the object to obtain an edge point cloud corresponding to the object, and to calculate a tangent vector of each 3D point in the edge point cloud;
a point pair construction module, adapted to construct, for any two 3D points in the edge point cloud, a point pair containing the two 3D points, and to generate a point pair feature vector of the point pair according to the tangent vectors of the two 3D points;
a matching module, adapted to match the edge point cloud corresponding to the object against a preset template point cloud according to the point pair feature vectors of the point pairs of the object, to obtain pose information of the object, so that a robot can perform a grasping operation according to the pose information of the object.

11. The apparatus according to claim 10, wherein the apparatus further comprises:
a second acquisition module, adapted to acquire a scene image of the current scene and a point cloud corresponding to the scene image;
an instance segmentation module, adapted to input the scene image into a trained deep learning segmentation model for instance segmentation, and to obtain a segmentation result for each object in the scene image;
an object point cloud determination module, adapted to determine the point cloud corresponding to each object according to the point cloud corresponding to the scene image and the segmentation result of each object.
12. The apparatus according to claim 10, wherein the edge extraction module is further adapted to:
for each 3D point in the point cloud, search the point cloud for neighboring 3D points located within a preset neighborhood of the 3D point, obtain a point set containing the 3D point and its neighboring 3D points, and calculate a normal vector from the point set;
construct a connecting line between the 3D point and each neighboring 3D point in the point set, and calculate a first included angle between the projection line of each connecting line in the plane of the normal vector and a specified reference direction axis;
sort the projection lines by their first included angles, and calculate the angle differences between the first included angles of adjacent projection lines after sorting;
determine whether the maximum of the angle differences is greater than a preset angle difference threshold; if so, determine the 3D point to be an edge point; if not, determine the 3D point to be a non-edge point;
aggregate all edge points to obtain the edge point cloud corresponding to the object.

13. The apparatus according to claim 12, wherein the edge extraction module is further adapted to:
project each connecting line onto the plane of the normal vector to obtain the projection line corresponding to each connecting line;
calculate the first included angle between each projection line and the specified reference direction axis in the plane of the normal vector.

14. The apparatus according to claim 10, wherein the edge extraction module is further adapted to:
project each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projected image;
identify contour pixels in the projected image, and determine the 3D points corresponding to the contour pixels to be edge points;
aggregate all edge points to obtain the edge point cloud corresponding to the object.

15. The apparatus according to claim 14, wherein the edge extraction module is further adapted to:
identify the contour boundary line formed in the projected image, determine the pixels corresponding to the contour boundary line to be contour pixels, and determine the 3D points in the point cloud of the object that project onto the contour pixels to be the 3D points corresponding to the contour pixels.
16. The apparatus according to any one of claims 10-15, wherein the point pair construction module is further adapted to:
construct a connecting-line vector between the two 3D points, and calculate the Euclidean distance of the connecting-line vector;
calculate a second included angle between the tangent vector of each of the two 3D points and the connecting-line vector, and a third included angle between the tangent vectors of the two 3D points;
generate the point pair feature vector of the point pair using the Euclidean distance of the connecting-line vector, the second included angles, and the third included angle.

17. The apparatus according to any one of claims 10-16, wherein the matching module is further adapted to:
perform a first matching between the point pair feature vectors of the point pairs in the edge point cloud corresponding to the object and the point pair feature vectors of the point pairs in the edge point cloud of the preset template point cloud to obtain a first matching result, wherein the first matching result includes a plurality of pose transformation relations of the matched preset template point cloud;
perform a second matching between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point cloud, and determine the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in the second matching result.

18. The apparatus according to claim 17, wherein the matching module is further adapted to:
calculate, using a preset evaluation algorithm, matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point cloud.

19. A computing device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the method for grasping objects with a purely planar structure according to any one of claims 1-9.

20. A computer storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform the operations corresponding to the method for grasping objects with a purely planar structure according to any one of claims 1-9.
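As a concrete reading of the neighborhood-based edge test recited in claims 3 and 12, the Python sketch below flags a 3D point as an edge point when the largest angular gap between the sorted in-plane directions to its neighbors exceeds a threshold. The function name, the use of numpy, the PCA-based normal estimate, the interpretation of the projection plane as the tangent plane orthogonal to the estimated normal, and the default threshold are all the editor's assumptions for illustration, not part of the claims.

    import numpy as np

    def is_edge_point(p, neighbors, gap_threshold_deg=90.0):
        # Estimate the local normal by PCA over {p} plus its neighbors:
        # the eigenvector with the smallest eigenvalue (an assumed choice).
        pts = np.vstack([p, neighbors])
        centered = pts - pts.mean(axis=0)
        normal = np.linalg.eigh(centered.T @ centered)[1][:, 0]

        # Connecting lines from p to each neighbor, projected onto the
        # plane orthogonal to the normal.
        d = neighbors - p
        proj = d - np.outer(d @ normal, normal)

        # First included angle of each projected line against a reference
        # direction axis in that plane (the first projection, assumed
        # non-degenerate, serves as the reference here).
        ref = proj[0] / np.linalg.norm(proj[0])
        ref_perp = np.cross(normal, ref)
        angles = np.sort(np.arctan2(proj @ ref_perp, proj @ ref))

        # Angle differences between adjacent sorted directions, including
        # the wrap-around gap between the last and first directions.
        gaps = np.diff(np.concatenate([angles, [angles[0] + 2.0 * np.pi]]))
        return np.degrees(gaps.max()) > gap_threshold_deg

Intuitively, an interior point of a planar surface sees neighbors in all in-plane directions, so no large gap opens between adjacent sorted directions, while a point on the object boundary sees neighbors on one side only, which opens a gap exceeding the threshold.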
CN202110217389.5A 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment Pending CN114972495A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217389.5A CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110217389.5A CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Publications (1)

Publication Number Publication Date
CN114972495A true CN114972495A (en) 2022-08-30

Family

ID=82972888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217389.5A Pending CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Country Status (1)

Country Link
CN (1) CN114972495A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115228A (en) * 2023-10-23 2023-11-24 广东工业大学 SOP chip pin coplanarity detection method and device
CN118752507A (en) * 2024-09-06 2024-10-11 无锡黎曼机器人科技有限公司 A high-precision hole-making method and device for a vision-guided robot

Similar Documents

Publication Publication Date Title
WO2022179261A1 (en) 3d matching-based object grabbing method and apparatus and computing device
US9483707B2 (en) Method and device for recognizing a known object in a field of view of a three-dimensional machine vision system
CN107748890A (en) A kind of visual grasping method, apparatus and its readable storage medium storing program for executing based on depth image
Ückermann et al. Real-time 3D segmentation of cluttered scenes for robot grasping
CN112802105A (en) Object grabbing method and device
Konishi et al. Real-time 6D object pose estimation on CPU
JP2016181183A (en) Information processing apparatus, information processing method, and program
CN111144426A (en) Sorting method, device, equipment and storage medium
CN114049318A (en) Multi-mode fusion feature-based grabbing pose detection method
Abbeloos et al. Point pair feature based object detection for random bin picking
CN114742789B (en) General part picking method and system based on surface structured light and electronic equipment
CN116309817A (en) Tray detection and positioning method based on RGB-D camera
CN114972495A (en) Grabbing method and device for object with pure plane structure and computing equipment
JP2018036770A (en) Position / orientation estimation apparatus, position / orientation estimation method, and position / orientation estimation program
CN114310892B (en) Object grabbing method, device and equipment based on point cloud data collision detection
CN113284129B (en) 3D bounding box-based press box detection method and device
CN115284279A (en) Mechanical arm grabbing method and device based on aliasing workpiece and readable medium
CN114638891A (en) Target detection positioning method and system based on image and point cloud fusion
WO2019015761A1 (en) Electronic device, system and method for determining the pose of an object
US20230305574A1 (en) Detecting empty workspaces for robotic material handling
Bhuyan et al. Structure‐aware multiple salient region detection and localization for autonomous robotic manipulation
Chai et al. Multi-pyramid-based hierarchical template matching for 6D pose estimation in industrial grasping task
Kundu et al. A novel geometry-based algorithm for robust grasping in extreme clutter environment
Asif et al. Model-free segmentation and grasp selection of unknown stacked objects
CN112837370A (en) Object stacking judgment method and device based on 3D bounding box and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 071708 Room 210, Unit 3, Chuangzhi Park North District, No. 164 Yining Street, Xiong'an New Area, Hebei Province (self declared)

Applicant after: MECH-MIND (BEIJING) ROBOTICS TECHNOLOGIES CO.,LTD.

Address before: Room 1100, 1st Floor, No. 6 Chuangye Road, Shangdi Information Industry Base, Haidian District, Beijing 100085

Applicant before: MECH-MIND (BEIJING) ROBOTICS TECHNOLOGIES CO.,LTD.

Country or region before: China

CB02 Change of applicant information