CN117197230A - Method and device for constructing grabbing data set - Google Patents

Info

Publication number
CN117197230A
Authority
CN
China
Prior art keywords
grabbing
model
point
points
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210600292.7A
Other languages
Chinese (zh)
Inventor
顾启鹏
金鑫
赵夕朦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to CN202210600292.7A
Publication of CN117197230A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a method and a device for constructing a grabbing data set. The method includes: acquiring a 3D model of an object and a model of a grabbing tool; determining grabbing quality labels of grabbing points of the 3D model of the object according to analysis criteria, where the analysis criteria include a shape closed analysis criterion used to evaluate the flatness of a first surface, on the 3D model of the object, that includes the grabbing points, and the grabbing points have a corresponding relation with the pose of the grabbing tool; and outputting the grabbing data set, where the grabbing data set includes information on the pose of the grabbing tool, the grabbing quality labels of the grabbing points, and point cloud information of the first surface. In this way, the cost of constructing the grabbing data set can be reduced, and the accuracy of the grabbing data set improved.

Description

Method and device for constructing grabbing data set
Technical Field
The embodiment of the application relates to the field of robots, in particular to a method and a device for constructing a grabbing data set.
Background
Robot grabbing technology is an important technology for robots to interact with the environment, and it has important applications in tasks such as robot assembly, handling, and sorting. Determining the grabbing pose is one of the key factors in completing a robot grabbing task, and learning-based determination of the grabbing pose has become a research hotspot in the field of robot grabbing technology. Learning-based methods for determining the grabbing pose need to rely on a large-scale grabbing data set. Therefore, constructing the grabbing data set is the foundation for a robot grabbing system to complete grabbing tasks.
At present, the workload of constructing and labeling a grabbing data set is relatively large, which reduces the autonomous learning efficiency of the robot. For example, a grabbing data set may be obtained by manually annotating two-dimensional images. For another example, a grabbing data set may be obtained through teaching and deep learning: grabbing point cloud data collected from taught grabbing poses, together with manually labeled tags, are used to train a grabbing pose evaluation neural network, and the trained network then evaluates and scores each sampled pose, thereby completing the construction of the grabbing data set. Moreover, the accuracy of grabbing data sets obtained in these ways is low, and the generalization ability of models trained on such data sets is poor.
Disclosure of Invention
The embodiment of the application provides a method and a device for constructing a grabbing data set, which can reduce the cost of constructing the grabbing data set and improve the accuracy of the grabbing data set.
In a first aspect, a method for constructing a captured data set is provided, which includes: acquiring a 3D model of an object and a model of a grabbing tool; determining a grabbing quality label of grabbing points of a 3D model of the object according to an analysis criterion, wherein the analysis criterion comprises a shape closed analysis criterion, and the shape closed analysis criterion is used for evaluating the flatness of a first surface comprising the grabbing points on the 3D model of the object, and the grabbing points have a corresponding relation with the pose of a grabbing tool; and outputting a grabbing data set, wherein the grabbing data set comprises information of the pose of the grabbing tool, grabbing quality labels of grabbing points and point cloud information of the first surface.
According to the scheme for constructing the grabbing data set provided by the embodiment of the application, the flatness of the object's surface is considered when the grabbing data set is constructed, which improves the accuracy of the grabbing data set and makes the constructed data set applicable to scenarios where objects with uneven or pitted surfaces are grabbed. In addition, the grabbing quality labels can be rapidly determined through the shape closed analysis criterion, so the efficiency of constructing the data set can be improved and the cost reduced.
With reference to the first aspect, in certain implementations of the first aspect, the gripping tool includes a suction cup.
With reference to the first aspect, in certain implementations of the first aspect, the flatness of the first surface is determined according to an angle between normal vectors of points on the first surface.
The scheme for constructing the grabbing data set provided by the embodiment of the application can be used for rapidly and accurately determining the flatness of the first surface by utilizing the included angle between the normal vectors of the points on the first surface.
With reference to the first aspect, in certain implementations of the first aspect, the first surface is a projection of the gripping tool on a 3D model of the object.
With reference to the first aspect, in certain implementations of the first aspect, an edge of the first surface is outside a projected contour of the gripping tool on the 3D model of the object.
According to the scheme for constructing the grabbing data set, when the grabbing tool contacts the 3D model of the object, if there are no points (point cloud) on the part of the object surface (the 3D model of the object) corresponding to the edge of the grabbing tool, the flatness of that part of the surface can be determined from points outside it. In this way, the flatness of the object surface covered by the grabbing tool can be determined more accurately, improving the accuracy and fidelity of the constructed grabbing data set.
With reference to the first aspect, in certain implementation manners of the first aspect, the analysis criteria further include a grabbing force analysis criterion, where the grabbing force analysis criterion is used to evaluate a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition.
The scheme for constructing the grabbing data set provided by the embodiment of the application also considers the influence of grabbing force on grabbing quality. In this way, the accuracy of constructing the captured data set may be improved.
With reference to the first aspect, in certain implementations of the first aspect, Q = min{ ||G·W - Φ|| + Ω }, where Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min denotes taking the minimum Euclidean distance, G is the force and moment of the grabbing point in multiple dimensions, W is the weight of the grabbing point in multiple dimensions, Φ is the force and moment of the grabbing point in multiple dimensions under the preset condition, and Ω is a regularization term.
According to the scheme for constructing the grabbing data set, regularization items are introduced when the difference between the predicted stress condition of the grabbing points and the stress condition of the grabbing points under the preset condition is obtained. In this way, the accuracy of the obtained stress situation difference can be improved, and the accuracy of constructing the grabbing data set is improved.
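As an illustration, the grabbing force analysis criterion above can be sketched in code. The function name, the fixed value of the regularization term, and the use of a finite candidate set for W are assumptions made for this sketch, not details from the application.

```python
import numpy as np

def grasp_force_quality(G, phi, weight_candidates, omega=0.01):
    """Sketch of the grabbing force analysis criterion
    Q = min{ ||G.W - Phi|| + Omega }.

    G:      (d, n) array: forces and moments the grabbing point can
            exert in multiple dimensions.
    phi:    (d,) target force/moment under the preset condition.
    weight_candidates: iterable of (n,) candidate weight vectors W
            (a finite candidate set is an assumption of this sketch).
    omega:  scalar regularization term (hypothetical fixed value).

    Returns the smallest Euclidean distance between the predicted
    stress G.W and the preset-condition stress phi, plus the
    regularizer; smaller values mean a better grabbing point.
    """
    G = np.asarray(G, dtype=float)
    phi = np.asarray(phi, dtype=float)
    return min(np.linalg.norm(G @ np.asarray(W, dtype=float) - phi) + omega
               for W in weight_candidates)
```

With a candidate W that reproduces the preset wrench exactly, Q reduces to the regularization term alone.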
With reference to the first aspect, in certain implementations of the first aspect, the 3D model of the object is obtained from the RGB image through three-dimensional reconstruction.
According to the scheme for constructing the grabbing data set provided by the embodiment of the application, three-dimensional reconstruction can be performed based on RGB images to obtain the 3D model of the object, so that three-dimensional reconstruction can be carried out rapidly while its cost is reduced.
With reference to the first aspect, in certain implementations of the first aspect, the 3D model of the object is obtained by manual monitoring to delete or correct the 3D model that failed the three-dimensional reconstruction.
According to the scheme for constructing the grabbing data set, provided by the embodiment of the application, manual monitoring can be performed during three-dimensional reconstruction, and the accuracy of three-dimensional reconstruction can be improved.
With reference to the first aspect, in certain implementations of the first aspect, the grabbing quality labels of the grabbing points are obtained after manual monitoring to delete or correct erroneous grabbing quality labels.
According to the scheme for constructing the grabbing data set, manual monitoring can be performed when grabbing quality labels are generated, and accuracy of grabbing quality labels can be improved.
In a second aspect, there is provided an apparatus for constructing a grabbing data set, including: an acquisition unit, configured to acquire a 3D model of an object and a model of a grabbing tool; a processing unit, configured to determine grabbing quality labels of grabbing points according to analysis criteria, where the analysis criteria include a shape closed analysis criterion used to evaluate the flatness of a first surface, on the 3D model of the object, that includes the grabbing points, and the grabbing points have a corresponding relation with the pose of the grabbing tool; and a transceiver unit, configured to output a grabbing data set, where the grabbing data set includes information on the pose of the grabbing tool, the grabbing quality labels of the grabbing points, and point cloud information of the first surface.
According to the device for constructing the grabbing data set provided by the embodiment of the application, the flatness of the object's surface is considered when the grabbing data set is constructed, which improves the accuracy of the grabbing data set and makes the constructed data set applicable to scenarios where objects with uneven or pitted surfaces are grabbed. In addition, the grabbing quality labels can be rapidly determined through the shape closed analysis criterion, so the efficiency of constructing the data set can be improved and the cost reduced.
With reference to the second aspect, in certain implementations of the second aspect, the gripping tool includes a suction cup.
With reference to the second aspect, in certain implementations of the second aspect, the flatness of the first surface is determined according to an included angle between normal vectors of points on the first surface.
The device for constructing the grabbing data set provided by the embodiment of the application can be used for rapidly and accurately determining the flatness of the first surface by utilizing the included angle between the normal vectors of the points on the first surface.
With reference to the second aspect, in certain implementations of the second aspect, the first surface is a projection of the gripping tool on a 3D model of the object.
With reference to the second aspect, in certain implementations of the second aspect, an edge of the first surface is outside a projected contour of the gripping tool on the 3D model of the object.
According to the device for constructing the grabbing data set, when the grabbing tool contacts the 3D model of the object, if there are no points (point cloud) on the part of the object surface (the 3D model of the object) corresponding to the edge of the grabbing tool, the flatness of that part of the surface can be determined from points outside it. In this way, the flatness of the object surface covered by the grabbing tool can be determined more accurately, improving the accuracy and fidelity of the constructed grabbing data set.
With reference to the second aspect, in some implementations of the second aspect, the analysis criteria further include a grabbing force analysis criterion, where the grabbing force analysis criterion is used to evaluate a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition.
The device for constructing the grabbing data set provided by the embodiment of the application also considers the influence of grabbing force on grabbing quality. In this way, the accuracy of constructing the captured data set may be improved.
With reference to the second aspect, in certain implementations of the second aspect, Q = min{ ||G·W - Φ|| + Ω }, where Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min denotes taking the minimum Euclidean distance, G is the force and moment of the grabbing point in multiple dimensions, W is the weight of the grabbing point in multiple dimensions, Φ is the force and moment of the grabbing point in multiple dimensions under the preset condition, and Ω is a regularization term.
The device for constructing the grabbing data set provided by the embodiment of the application introduces a regularization term when the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition is obtained. In this way, the accuracy of the obtained stress situation difference can be improved, and the accuracy of constructing the grabbing data set is improved.
With reference to the second aspect, in certain implementations of the second aspect, the 3D model of the object is obtained from the RGB image through three-dimensional reconstruction.
According to the scheme for constructing the grabbing data set provided by the embodiment of the application, three-dimensional reconstruction can be performed based on RGB images to obtain the 3D model of the object, so that three-dimensional reconstruction can be carried out rapidly while its cost is reduced.
With reference to the second aspect, in certain implementations of the second aspect, the 3D model of the object is obtained by manual monitoring to delete or correct the 3D model that failed the three-dimensional reconstruction.
According to the scheme for constructing the grabbing data set, which is provided by the embodiment of the application, manual monitoring is performed during three-dimensional reconstruction, so that the accuracy of three-dimensional reconstruction can be improved.
With reference to the second aspect, in certain implementations of the second aspect, the grabbing quality labels of the grabbing points are obtained after manual monitoring to delete or correct erroneous grabbing quality labels.
According to the scheme for constructing the grabbing data set, manual monitoring can be performed when grabbing quality labels are generated, and accuracy of grabbing quality labels can be improved.
In a third aspect, there is provided a computer readable medium having stored thereon a program code which, when run on a computer, causes the computer to perform the method of any of the first aspects above.
In a fourth aspect, there is provided a chip system, including a processor and a data interface, where the processor reads, through the data interface, instructions stored in a memory to perform the method of any of the first aspects above.
In a fifth aspect, there is provided a computing device comprising: at least one processor and a memory, the at least one processor coupled with the memory, for reading and executing instructions in the memory to perform the method of any of the first aspects above.
Drawings
Fig. 1 is a schematic diagram of a robotic grasping system 100 according to an embodiment of the application.
Fig. 2 is a schematic diagram of an application scenario of a capture detection module provided in an embodiment of the present application.
Fig. 3 is a schematic architecture diagram of a system for constructing a captured data set according to an embodiment of the present application.
Fig. 4 is a flowchart of a method for constructing a captured data set according to an embodiment of the present application.
Fig. 5 is a flowchart of a method for constructing a captured data set according to an embodiment of the present application.
Fig. 6 is a schematic diagram of manual monitoring according to an embodiment of the present application.
Fig. 7 is an abstract structure diagram of a grab point stress analysis according to an embodiment of the present application.
Fig. 8 is a schematic block diagram of an apparatus for constructing a captured data set according to an embodiment of the present application.
FIG. 9 is a schematic block diagram of a computing device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of a robotic grasping system 100.
The robot gripping system 100 may include a visual imaging unit 110, a gripping detection unit 120, a gripping planning unit 130, and a robot control unit 140.
The visual imaging unit 110 may acquire an ordinary red, green, and blue (RGB) image, or a red, green, blue-depth (RGB-D) image. It may convert the RGB and/or RGB-D images into a point cloud and, through point cloud processing and scene segmentation, generate a point cloud containing only the object. The gripping detection unit 120 may determine the grabbing pose from the point cloud associated with the object. The gripping planning unit 130 may perform path planning or grabbing strategy planning according to the grabbing pose determined by the gripping detection unit 120. The robot control unit 140 controls the robot to move along the planned path or according to the grabbing strategy to complete the grabbing task.
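The data flow through the four units of Fig. 1 can be sketched as follows. All class and function names here are illustrative stand-ins, and the grabbing detection step is reduced to a trivial heuristic rather than the learned model described later in the application.

```python
# Minimal sketch of the Fig. 1 pipeline; names are hypothetical.
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]

@dataclass
class GraspPose:
    position: Point      # grabbing point on the object
    normal: Point        # approach axis of the grabbing tool
    quality: float       # grabbing quality score

def visual_imaging(rgbd_frame) -> List[Point]:
    """Stand-in for the visual imaging unit: turn an RGB-D frame into
    an object-only point cloud (segmentation omitted in this sketch)."""
    return [tuple(p[:3]) for p in rgbd_frame]

def grasp_detection(cloud: List[Point]) -> GraspPose:
    """Stand-in for the gripping detection unit: here we simply pick
    the highest point as the grabbing point, approached from above."""
    top = max(cloud, key=lambda p: p[2])
    return GraspPose(position=top, normal=(0.0, 0.0, 1.0), quality=1.0)

def run_pipeline(rgbd_frame) -> GraspPose:
    cloud = visual_imaging(rgbd_frame)
    pose = grasp_detection(cloud)
    # gripping planning / robot control would consume `pose` here
    return pose
```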
Alternatively, the above components are only an example, and in practical applications, the above components may be added or deleted according to actual needs, and fig. 1 should not be construed as limiting the embodiments of the present application.
Determining the grabbing pose is one of the key factors in completing a robot grabbing task. Currently, learning-based determination of the grabbing pose has become a research hotspot in the field of robot grabbing technology, and most learning-based methods rely on a large-scale grabbing data set to determine the grabbing pose. Thus, the construction of the grabbing data set is a fundamental task of the robot grabbing system.
The workload of constructing and labeling a grabbing data set is large, which reduces the autonomous learning efficiency of the robot. For example, a grabbing data set may be obtained by manually annotating two-dimensional images. In addition, the labels of such grabbing data sets are basically rectangular boxes, and a model trained on such a data set is not suitable for application scenarios such as multi-object stacking and occlusion.
Alternatively, straight lines and the corresponding parallel lines in an object's contour can be detected by applying a Hough transform to the object contour information in the detected image information; several rectangles are then fitted to the object contour, grabbing rectangles suitable for a two-finger gripper are generated from the fitted rectangles by equidistant sampling, and finally grabbing data set images are produced. However, the Hough transform can only determine the direction of straight lines during detection, information is easily lost, and the accuracy of the constructed data set is reduced.
A grabbing pose can also be taught on a target object by demonstration, so that the grabbed portion of the point cloud is obtained from the scene point cloud, while the label data are annotated by manual scoring. A six-degree-of-freedom grabbing pose evaluation network is trained with these data. Grabbing poses are then sampled from the scene point clouds acquired over consecutive frames, and each sampled pose is evaluated and scored by the six-degree-of-freedom grabbing pose evaluation network to complete the label generation task for the scene point cloud. With this method, manual intervention is needed when obtaining the grabbing data set for training the six-degree-of-freedom grabbing pose evaluation network, which increases labor cost. In addition, this way of constructing the data set does not consider objects with uneven or wrinkled surfaces, so the constructed data set has low precision.
To solve the above problems, the application provides a method for constructing a grabbing data set, which can reduce the cost of constructing the grabbing data set and improve the precision of the grabbing data set.
Fig. 2 is a schematic diagram of an application scenario of a capture detection module according to an embodiment of the present application. The grip detection module in fig. 2 may be applied to the grip detection unit 120 of the robotic grip system 100 of fig. 1.
The application scenario of the capture detection module 260 may be divided into two parts: an offline phase and an online phase. The capture detection model trained during the offline phase may be stored and used for processing during the online phase.
The capture detection module 260 in an offline stage may acquire a trained capture detection model. The capture pose generation module 220 may generate candidate capture poses from data in a three-dimensional (3D) model database 210 of the object. For example, the capture pose generation module 220 may determine a capture pose corresponding to a capture point cloud from the capture point cloud in the 3D model of the object. The grab simulation module 230 may calculate a grab quality score for the grab pose of the candidate according to the parsing criteria. The grab ordering module 240 orders and/or labels the grab pose and the corresponding grab quality score to construct the grab data set 250. The grab detection module 260 may train the neural network model based on the grab data set 250 to obtain a trained grab detection model.
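The offline stage above (candidate pose generation, simulation scoring, and ordering) can be sketched as follows. The sampling strategy and all function names are assumptions made for illustration, not the application's actual modules.

```python
# Hypothetical sketch of the Fig. 2 offline stage: sample candidate
# grabbing points from the object's 3D model, score each candidate
# with an analysis criterion, and order the results into a data set.
import numpy as np

def sample_grasp_points(model_points, n=8, seed=0):
    """Candidate grabbing points sampled from the object's 3D model."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(model_points), size=min(n, len(model_points)),
                     replace=False)
    return [model_points[i] for i in idx]

def score_grasp(point, score_fn):
    """Grab simulation: score one candidate with an analysis criterion
    (score_fn stands in for the shape closed / force criteria)."""
    return score_fn(point)

def build_grasp_dataset(model_points, score_fn, n=8):
    """Grab ordering: label candidates with quality scores and sort
    them from best to worst."""
    candidates = sample_grasp_points(model_points, n)
    labeled = [(p, score_grasp(p, score_fn)) for p in candidates]
    labeled.sort(key=lambda item: item[1], reverse=True)
    return labeled
```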
The capture detection module 260 in the online stage may accept user input and perform a corresponding operation according to the user input. For example, an RGB image and/or RGB-D map of a scene may be acquired using camera 270 and then point clouds associated with an object may be obtained by scene segmentation/point cloud processing module 280. The point cloud associated with the object is input to the grab detection module 260 to output an optimal grab gesture.
Alternatively, the above components are only an example, and in practical applications, the above components may be added or deleted according to actual needs, and fig. 2 should not be construed as limiting the embodiments of the present application.
Fig. 3 is a schematic architecture diagram of a system for constructing a captured data set according to an embodiment of the present application. The crawling dataset constructed by the system 300 of constructing crawling datasets of fig. 3 may be used for training of the crawling detection module 260 of fig. 2 in the offline phase.
The construction system 300 of the crawling dataset may include a three-dimensional modeling module 310, an analytical criteria module 320, and a manual monitoring module 330.
The three-dimensional modeling module 310 may construct a 3D model of the object.
In some implementations, the three-dimensional modeling module 310 may obtain a 3D model of the object through three-dimensional reconstruction. Wherein the 3D model representation of the three-dimensionally reconstructed object may comprise: depth map (depth), point cloud (point cloud), voxel (voxel) or mesh (mesh).
Illustratively, three-dimensional reconstruction may be performed from RGB images to obtain the 3D model of the object. For example, deep learning may be used to build a mapping from a two-dimensional RGB image to its corresponding three-dimensional mesh model: the mapping from an object image to its underlying three-dimensional shape is learned from a large amount of synthetic data, an RGB image from an arbitrary viewpoint is received, and the 3D model of the object is output in the form of a three-dimensional mesh. For another example, RGB images from multiple overlapping viewpoints may be acquired, and features extracted through a neural network model. The matching cost of a reference image can be constructed with a plane sweep algorithm, and the matching costs are then aggregated into a cost volume. The cost volume is processed with a trained 3D convolutional neural network to obtain predicted depth information; reconstruction constraints among the multiple images are used to select the correct depth predictions, and the 3D model of the object is reconstructed.
In some implementations, the three-dimensional modeling module 310 may model a 3D model of the constructed object through parametric modeling. For example, a three-dimensional CAD model of an object may be built in CAD software based on the material, dimensions, physical parameters, etc. of the object.
The parsing criteria module 320 may obtain a 3D model of the object and a model of the grasping tool, and evaluate grasping quality tags of grasping points of the 3D model of the object according to the parsing criteria. For example, the parsing criterion module 320 determines whether the grabbing tool can grab the object under the current pose according to the parsing criterion, and records the current grabbing point and the grabbing quality label corresponding to the current grabbing point. The current grabbing point and the current pose of the grabbing tool have a corresponding relation.
The correspondence between the current grabbing point and the current pose of the grabbing tool may be understood as follows: when the grabbing tool contacts the 3D model of the object, the normal vector of the current grabbing point is the same as the normal vector of the grabbing tool in the current pose, and the normal vector of the grabbing tool lies on the central axis of the grabbing tool.
In some embodiments, the parsing criteria includes shape closure parsing criteria for evaluating flatness of a first surface including grabbing points on a 3D model of the object. The resolution criteria module 320 may determine the quality of grabbing labels for grabbing points based on the form closed resolution criteria.
In some embodiments, the analysis criteria further include a grabbing force analysis criterion for evaluating a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition. The resolution criteria module 320 may also determine a grab quality tag for a grab point based on grab force resolution criteria.
The manual monitoring module 330 may monitor two processes: the three-dimensional modeling process and the process of generating labels using the analysis criteria. The manual monitoring module 330 may present both processes to a user in a visual window or interface for manual monitoring. On the one hand, when three-dimensional modeling of an object fails, the failed model can be deleted, or the modeling redone, through manual monitoring. On the other hand, when the grabbing quality labels of grabbing points are generated by the analysis criteria module, erroneous labels may be produced; training a model directly on such a data set may bias the model and degrade its accuracy after training. Therefore, erroneous grabbing quality labels can be deleted or corrected through manual monitoring.
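The label-review side of manual monitoring can be sketched as a simple filter loop; the function names and the "verdict" interface standing in for the human reviewer are assumptions for this sketch.

```python
# Hypothetical sketch of the manual monitoring step: generated grabbing
# quality labels are shown to a reviewer, who deletes or corrects the
# erroneous ones before the data set is finalized.
def review_labels(samples, verdict_fn):
    """samples: list of (grasp_point, label) pairs.
    verdict_fn: stand-in for the human reviewer; returns 'keep',
    'delete', or a corrected label value for each sample."""
    reviewed = []
    for point, label in samples:
        verdict = verdict_fn(point, label)
        if verdict == 'delete':
            continue  # erroneous label removed from the data set
        reviewed.append((point, label if verdict == 'keep' else verdict))
    return reviewed
```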
Alternatively, the above components are only an example, and in practical applications, the above components may be added or deleted according to actual needs, and fig. 3 should not be construed as limiting the embodiments of the present application.
Fig. 4 is a flowchart of a method for constructing a captured data set according to an embodiment of the present application.
S410, acquiring a 3D model of the object and a model of the grabbing tool.
In some embodiments, a 3D model of an object may be obtained by three-dimensional reconstruction. Wherein the 3D model representation of the reconstructed object may comprise: depth maps, point clouds, voxels, or grids.
In some embodiments, a three-dimensional reconstruction may be performed from the RGB images to obtain a 3D model of the object.
For example, deep learning may be used to build a mapping from a two-dimensional RGB image to its corresponding three-dimensional mesh model: the mapping from an object image to its underlying three-dimensional shape is learned from a large amount of synthetic data, an RGB image from an arbitrary viewpoint is received, and the 3D model of the object is output in the form of a three-dimensional mesh.
For another example, RGB images from multiple overlapping viewpoints may be acquired, and features extracted through a neural network model. The matching cost of a reference image can be constructed with a plane sweep algorithm, and the matching costs are then aggregated into a cost volume. The cost volume is processed with a trained 3D convolutional neural network to obtain predicted depth information; reconstruction constraints among the multiple images are used to select the correct depth predictions, and the 3D model of the object is reconstructed.
In some embodiments, a 3D model of an object may be constructed by parametric modeling. For example, a three-dimensional CAD model of an object may be built in CAD software based on the material, dimensions, physical parameters, etc. of the object.
The model of the gripping tool may be provided by the manufacturer of the gripping tool.
In some embodiments, the gripping tool may be a suction cup.
S420, determining grabbing quality labels of grabbing points of the 3D model of the object according to analysis criteria.
A correspondence exists between the grabbing points and the pose of the grabbing tool: when the grabbing tool contacts the 3D model of the object, the normal vector of the grabbing point is the same as the normal vector of the grabbing tool, and the normal vector of the grabbing tool lies on the central axis of the grabbing tool.
In some embodiments, the parsing criteria includes shape closure parsing criteria for evaluating flatness of a first surface including grabbing points on a 3D model of the object.
In some embodiments, the edge of the first surface may be outside of a projected contour of the inner diameter of the gripping tool onto the 3D model of the object. For example, when the outer diameter and the inner diameter of the bottom of the suction cup differ significantly, the edge of the first surface is outside the projected contour of the suction cup on the 3D model of the object.
In some embodiments, the first surface may be a projection of the gripping tool onto a 3D model of the object.
In some embodiments, the edge of the first surface may be outside of the projected contour of the gripping tool on the 3D model of the object.
In some embodiments, the flatness of the first surface may be determined from the angles between the normal vectors of points on the first surface. For example, it may be determined by the average of the angles between the normal vector of the grabbing point on the first surface and the normal vectors of the other points. Alternatively, it may be determined by the sum of those angles, or by their maximum value.
Pits or holes may be present on the first surface; no points exist at the pits or holes, so the normal vectors there are not available. In this case, the angle between the normal vector at a pit or hole and the normal vector at the grabbing point can be taken as a relatively large value, for example 90°.
In some embodiments, a flatness threshold may be set, and a gripping quality label for the gripping point is determined based on the flatness threshold and the flatness of the first surface. For example, the flatness threshold is set to 2 °, and if the flatness of the first surface corresponding to the grabbing point is greater than 2 °, the grabbing point may be marked as a negative sample; if the flatness of the first surface corresponding to the grabbing point is less than or equal to 2 °, the grabbing point may be marked as a positive sample.
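The flatness computation and thresholding described above can be sketched as follows (a minimal illustration assuming the first surface is given as an array of point normals; the function names and the 2° default threshold are illustrative, not part of the embodiment):

```python
import numpy as np

def flatness(normals: np.ndarray, grasp_normal: np.ndarray, mode: str = "mean") -> float:
    """Flatness of the first surface: angles (degrees) between the grasp-point
    normal and the normals of the other points, aggregated by mean/sum/max."""
    n = grasp_normal / np.linalg.norm(grasp_normal)
    m = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos = np.clip(m @ n, -1.0, 1.0)          # cosine of each angle
    angles = np.degrees(np.arccos(cos))
    return {"mean": angles.mean(), "sum": angles.sum(), "max": angles.max()}[mode]

def label_by_flatness(normals, grasp_normal, threshold_deg=2.0):
    """Positive sample (1) if flatness <= threshold, else negative sample (0)."""
    return 1 if flatness(normals, grasp_normal) <= threshold_deg else 0
```

A perfectly flat patch (all normals parallel) yields flatness 0 and a positive label; a patch tilted beyond the threshold is marked negative.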
In some embodiments, the analysis criteria may include a grabbing force analysis criterion for evaluating a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition. For example, the difference may be obtained by the minimum euclidean distance as shown in equation (1).
Q=min||γ-Φ|| formula (1)

Wherein γ is the predicted stress condition of the grabbing point, Φ is the stress condition of the grabbing point under the preset condition, and min is the minimum Euclidean distance formula.
In some embodiments, a difference threshold may be set, and then a grasp quality label is determined based on a difference between the predicted force condition of the grasp point and the force condition of the grasp point under the preset condition, and the difference threshold. For example, if the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition is greater than the difference threshold, the grabbing point is marked as a negative sample. If the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition is smaller than or equal to a difference threshold value, marking the grabbing point as a positive sample.
In some embodiments, the grabbing quality label determined according to the shape closure analysis criterion and the grabbing quality label determined according to the grabbing force analysis criterion are used as pre-grabbing quality labels, and the grabbing quality label of the grabbing point is determined from these two pre-grabbing quality labels. For example, when the pre-grabbing quality labels determined by both analysis criteria are positive samples, the grabbing quality label of the grabbing point is marked as a positive sample; in all other cases, the grabbing quality label of the grabbing point is a negative sample.
In some embodiments, the grabbing quality label may be determined according to the flatness of the first surface determined by the shape closure analysis criterion and the difference between the predicted stress condition and the preset stress condition of the grabbing point determined by the grabbing force analysis criterion. For example, the grabbing quality label may be determined by multiplying the flatness value of the first surface by a first weight, multiplying the difference between the stress conditions by a second weight, and comparing the sum to a preset threshold.
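This weighted combination of the two criteria can be sketched as follows (the weight values and threshold are placeholders; positive samples are encoded as 1 and negative samples as 0):

```python
def combined_label(flatness_value: float, force_difference: float,
                   w1: float, w2: float, threshold: float) -> int:
    """Weighted sum of the flatness score and the stress-condition difference;
    positive sample (1) if the combined score stays within the threshold."""
    score = w1 * flatness_value + w2 * force_difference
    return 1 if score <= threshold else 0
```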
S430, outputting the grabbing data set.
The grabbing data set comprises information of the pose of the grabbing tool, grabbing quality labels of grabbing points and point cloud information of the first surface.
The information of the pose of the grabbing tool can be determined from the grabbing point and the corresponding relation between the grabbing point and the pose of the grabbing tool. For example, the pose of the grabbing tool in the camera coordinate system can be determined from the position and normal vector of the grabbing point, and then converted into the pose of the grabbing tool in the mechanical arm base coordinate system through a hand-eye transformation.
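A minimal sketch of this pose construction (assuming the hand-eye calibration is available as a 4x4 matrix mapping camera coordinates to mechanical-arm base coordinates; the choice of the tool x/y axes around the normal is arbitrary here, since a suction cup is symmetric about its central axis):

```python
import numpy as np

def pose_from_grasp_point(p: np.ndarray, n: np.ndarray,
                          T_base_cam: np.ndarray) -> np.ndarray:
    """Build a 4x4 tool pose in the camera frame whose z-axis is the
    grasp-point normal (the suction-cup central axis), then map it into
    the robot-base frame with the hand-eye transform T_base_cam."""
    z = n / np.linalg.norm(n)
    # pick any vector not parallel to z to complete an orthonormal frame
    a = np.array([1.0, 0.0, 0.0]) if abs(z[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(a, z); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    T_cam_tool = np.eye(4)
    T_cam_tool[:3, :3] = np.column_stack([x, y, z])  # rotation: tool axes
    T_cam_tool[:3, 3] = p                            # translation: grasp point
    return T_base_cam @ T_cam_tool
```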
Fig. 5 is a method for constructing a grabbing data set according to an embodiment of the present application.
S510, acquiring RGB images of the object.
For example, successive multi-frame multi-view RGB images of objects in a scene may be collected.
S520, performing three-dimensional reconstruction according to the RGB image to obtain a 3D model of the object.
The three-dimensional reconstruction from the RGB image may refer to the related description in S410; details are not repeated herein.
Fig. 6 is a schematic diagram of manual monitoring according to an embodiment of the present application. In the process of performing three-dimensional reconstruction of an object, the 3D model of the object with failed three-dimensional reconstruction can be interfered by manual monitoring. For example, manual monitoring may be performed through a visual interface to delete or correct the 3D model of the object whose reconstruction failed.
In some embodiments, the 3D model of the three-dimensionally reconstructed object may be incomplete; the incomplete model can be removed by manual monitoring and the three-dimensional reconstruction performed again, so as to obtain a complete 3D model of the object, as shown in (a) of fig. 6.
In some embodiments, the 3D model of the three-dimensional reconstructed object may have impurities, and the impurities in the 3D model of the object may be deleted by manual monitoring to obtain a 3D model of the corrected object, as shown in (b) of fig. 6.
S530, acquiring a model of the grabbing tool.
The gripping tool may comprise a suction cup, the model of which may be provided by the suction cup manufacturer.
S540, determining the grabbing quality label of the current grabbing point on the 3D model of the object according to the analysis criteria.
A correspondence exists between the current grabbing point and the current pose of the grabbing tool: when the grabbing tool contacts the 3D model of the object in the current pose, the normal vector of the current grabbing point is the same as the normal vector of the grabbing tool in the current pose, and the normal vector of the grabbing tool lies on the central axis of the grabbing tool.
In some embodiments, the parsing criteria includes shape closure parsing criteria for evaluating flatness of a first surface including grabbing points on a 3D model of the object.
In some embodiments, the edge of the first surface may be outside of a projected contour of the inner diameter of the gripping tool onto the 3D model of the object. For example, when the outer diameter and the inner diameter of the bottom of the suction cup differ significantly, the edge of the first surface is outside the projected contour of the suction cup on the 3D model of the object.
In some embodiments, the first surface may be a projection of the gripping tool onto a 3D model of the object.
In some embodiments, the edge of the first surface may be outside of the projected contour of the gripping tool on the 3D model of the object.
In some embodiments, the flatness of the first surface may be determined from the angles between the normal vectors of points on the first surface. Assuming the grabbing tool is circular, the first surface may be the area on the 3D model of the object enclosed by a circle centered at the grabbing point with radius R, and the flatness of the first surface may be determined from the angles between the normal vector of the grabbing point and the normal vectors of the other points on the first surface. For example, the first surface may include the grabbing point and n other points, and the flatness of the first surface may be determined from the average of the angles between the normal vectors of these n points and the normal vector of the grabbing point. Alternatively, the flatness of the first surface may be determined from the sum of these angles, or from their maximum value.
When a pit or hole is present on the first surface, there may be no points at the pit or hole, so the normal vector there is not available. In this case, the angle between the normal vector at the pit or hole and the normal vector at the grabbing point can be taken as a relatively large value, for example 90°.
In some cases, a flatness threshold may be set, and a gripping quality label for the gripping point is determined based on the flatness threshold and the flatness of the first surface. For example, the flatness threshold is set to 2 °, and if the flatness of the first surface corresponding to the grabbing point is greater than 2 °, the grabbing point may be marked as a negative sample; if the flatness of the first surface corresponding to the grabbing point is less than or equal to 2 °, the grabbing point may be marked as a positive sample.
In some embodiments, the analysis criteria include a grabbing force analysis criterion for evaluating a difference between a predicted stress condition of the current grabbing point and a stress condition of the current grabbing point under a preset condition. For example, the difference can be obtained by the minimum euclidean distance shown in equation (1).
Fig. 7 is an abstract structure diagram of a grabbing point stress analysis according to an embodiment of the present application. The force applied to the grabbing point when the grabbing tool grabs an object is exemplarily described below with reference to fig. 7. First, the grabbing tool can be abstracted as an octagonal cone spring system having eight force-bearing fulcra, each with six degrees of freedom. When the grabbing tool grabs an object, the stress condition at the contact surface between the octagonal cone spring system and the object (abstractable as a grabbing point) may be represented as a vector in the grasp wrench space: G·W = [Fx, Fy, Fz, Tx, Ty, Tz]. Wherein Fx, Fy, Fz respectively represent the forces on the grabbing point in the x, y and z directions, Tx, Ty, Tz respectively represent the torques of the grabbing point about the x, y and z axes, G is the forces and moments of the grabbing point in multiple dimensions (such as the x, y and z directions), and W is the weights of the grabbing point in those dimensions. The force or moment of the grabbing point in each dimension can be obtained by superposing the forces or moments of the eight force-bearing fulcra of the octagonal cone spring system in that dimension. For example, the force on the grabbing point in the z direction is Fz = Σ w_i·f_i^z, where i = 1, 2, ..., 8, w_i is the weight of the force of the i-th force-bearing fulcrum in the z direction, and f_i^z is the force of the i-th force-bearing fulcrum in the z direction. The forces at the eight force-bearing fulcra of the octagonal cone spring system are not necessarily the same, nor are the weights. The weights are mainly affected by the smoothness of the contact surface, the friction coefficient, the physical parameters of the grabbing tool, and the like.
Thus, G·W = [Σ w_i^Fx·f_i^x, Σ w_i^Fy·f_i^y, Σ w_i^Fz·f_i^z, Σ w_i^Tx·t_i^x, Σ w_i^Ty·t_i^y, Σ w_i^Tz·t_i^z], wherein w_i^Fx is the weight of the force of the i-th force-bearing fulcrum in the x direction, f_i^x is the force of the i-th force-bearing fulcrum in the x direction, w_i^Fy is the weight of the force of the i-th force-bearing fulcrum in the y direction, f_i^y is the force of the i-th force-bearing fulcrum in the y direction, w_i^Fz is the weight of the force of the i-th force-bearing fulcrum in the z direction, f_i^z is the force of the i-th force-bearing fulcrum in the z direction, w_i^Tx is the weight of the moment of the i-th force-bearing fulcrum about the x axis, t_i^x is the moment of the i-th force-bearing fulcrum about the x axis, w_i^Ty is the weight of the moment of the i-th force-bearing fulcrum about the y axis, t_i^y is the moment of the i-th force-bearing fulcrum about the y axis, w_i^Tz is the weight of the moment of the i-th force-bearing fulcrum about the z axis, and t_i^z is the moment of the i-th force-bearing fulcrum about the z axis.
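The weighted superposition over the eight force-bearing fulcra can be sketched as follows (a toy illustration assuming the per-fulcrum forces/torques and their weights are given as 8x6 arrays; in practice they would come from the spring-system model):

```python
import numpy as np

def grasp_wrench(f: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Superpose the 6-D forces/torques of the eight fulcra of the abstracted
    octagonal cone spring system: each wrench component of the grasp point is
    the weighted sum of that component over the eight fulcra.

    f, w: arrays of shape (8, 6) - per-fulcrum wrenches and weights."""
    assert f.shape == (8, 6) and w.shape == (8, 6)
    return (w * f).sum(axis=0)   # -> [Fx, Fy, Fz, Tx, Ty, Tz]
```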
The force of the grabbing point in the z direction is F_N = f_V + f_G, wherein f_V is the suction applied by the grabbing tool to the grabbing point in the normal direction, and f_G is the pulling force applied by the object to the grabbing point in the normal direction. The pulling force of the object may include different situations. For example, when the grabbing tool grabs the object in a horizontal direction, the pulling force of the object may include the weight of the object. For another example, when the grabbing tool grabs the object in a vertical direction, the pulling force of the object may include the friction force between the object and the grabbing tool. As another example, the pulling force of the object may also include disturbance forces such as air resistance. Ideally, the grabbing point is only subjected to an upward force F_N in the z direction, i.e. the grasp wrench space vector of the grabbing point is Φ = [0, 0, F_N, 0, 0, 0], and the grabbing tool can grab the object.
The stress condition of the grabbing point obtained through the octagonal cone spring system can be used as a predicted stress condition (namely gamma=g·w), the stress condition of the grabbing point under ideal conditions is used as a stress condition under preset conditions, and the grabbing quality label can be determined according to the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset conditions.
To obtain the predicted stress condition of the grabbing point, or the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, the weights of the eight force-bearing fulcra in the x, y and z directions need to be obtained first. The weights of the forces and moments of the grabbing point in multiple dimensions, or the weights of the forces and moments of the eight force-bearing fulcra corresponding to the grabbing point in multiple dimensions, can be obtained by minimum Euclidean distance fitting, as shown in formula (2) or formula (3).
Q=min||G·W-Φ|| formula (2)
Alternatively, to prevent overfitting, a regularization term may be introduced to constrain W when solving for W, as shown in formula (4).

Q=min{||G·W-Φ||+Ω} formula (4)
Wherein Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min is the minimum Euclidean distance formula, G is the force and moment of the grabbing point in multiple dimensions, W is the weight of the grabbing point in the multiple dimensions, phi is the force and moment of the grabbing point in the multiple dimensions under the preset condition, and omega is a regularization term.
Therefore, the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition can be calculated according to the fitted formula (2), formula (3) or formula (4).
The regularization term may be obtained by a regularization method, which may include an L0 regularization method, an L1 regularization method, an L2 regularization method, and the like. For example, the regularization term Ω = λ·||W||_1 or Ω = λ·||W||_2, where λ is the regularization term coefficient, ||W||_1 is the sum of the absolute values of the weight parameters, and ||W||_2 is the square root of the sum of the squares of the weight parameters.
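The fitting of W in formulas (2)-(4) can be sketched as a regularized least-squares problem. The closed-form ridge (L2) solution below is one possible realization under the assumption that G is a matrix mapping the weight vector W to a 6-D wrench; it is not necessarily the solver used in the embodiment:

```python
import numpy as np

def fit_weights(G: np.ndarray, phi: np.ndarray, lam: float = 0.0) -> np.ndarray:
    """Fit W minimizing ||G @ W - phi||^2 + lam * ||W||^2
    (L2-regularized least squares, solved in closed form)."""
    n = G.shape[1]
    return np.linalg.solve(G.T @ G + lam * np.eye(n), G.T @ phi)

def difference(G: np.ndarray, W: np.ndarray, phi: np.ndarray) -> float:
    """Q: Euclidean distance between the predicted stress G @ W and the
    preset stress condition phi."""
    return float(np.linalg.norm(G @ W - phi))
```

With lam > 0 the regularizer shrinks the fitted weights, which is the over-fitting protection formula (4) describes.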
In some embodiments, a difference threshold may be set, and the quality of gripping labels may be determined based on the difference between the predicted force conditions at the gripping points and the force conditions at the gripping points under the preset conditions, and the difference threshold. For example, if the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition is greater than the difference threshold, the grabbing point is marked as a negative sample. If the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition is smaller than or equal to a difference threshold value, marking the grabbing point as a positive sample.
In some embodiments, the grabbing quality label determined according to the shape closure analysis criterion and the grabbing quality label determined according to the grabbing force analysis criterion are used as pre-grabbing quality labels, and the grabbing quality label of the grabbing point is determined according to the two pre-grabbing quality labels. For example, when the pre-grabbing quality labels determined by the two analysis criteria are positive samples, the grabbing quality labels of the grabbing points are marked with the positive samples. And under the other conditions, the grabbing quality labels of the grabbing points are negative samples.
In some embodiments, the gripping quality label may be determined according to a flatness of the first surface determined by the shape closure resolution criterion, and a difference between a predicted stress condition and a preset stress condition of the gripping point determined by the gripping force resolution criterion. For example, the quality of capture labels may be determined by multiplying the flatness value of the first surface by a weight of 1, multiplying the difference between the force conditions by a weight of 2, and comparing the sum to a preset threshold.
S550, moving the grabbing tool to the pose corresponding to the next grabbing point.
The gripping tool can move on the surface of the object according to a certain rule so as to move to the pose corresponding to the next gripping point.
S560, determining the grabbing quality label of the next grabbing point according to the analysis criteria.
The process of determining the grabbing quality label of the next grabbing point may refer to S540, and the present application is not described herein.
S570, repeating steps S550 and S560 until the construction of the captured data set is completed.
And continuously moving the grabbing tool to the pose corresponding to the next grabbing point, and determining the grabbing quality label of the next grabbing point. And repeating the steps until the grabbing quality labels of all grabbing points are determined.
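Steps S550 to S570 amount to a labeling loop over all candidate grabbing points, which can be sketched as follows (the record fields and the `label_fn` interface are hypothetical placeholders for the analysis criteria above):

```python
def build_grasp_dataset(grasp_points, label_fn):
    """Iterate over candidate grasp points, take the tool pose associated
    with each point, label it with the analysis criteria, and collect
    (pose, label, first-surface patch) records."""
    dataset = []
    for point in grasp_points:
        record = {
            "pose": point["pose"],    # tool pose for this grasp point
            "label": label_fn(point), # quality label from the criteria
            "patch": point["patch"],  # point cloud of the first surface
        }
        dataset.append(record)
    return dataset
```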
When the grabbing quality label is determined according to the analysis criteria, an erroneous label may be generated. Therefore, when labels are generated, manual monitoring through the visual interface can be used to delete or correct erroneous grabbing quality labels, so as to improve the accuracy of the grabbing quality labels. For example, the first surface corresponding to a certain grabbing point on the 3D model of the object may be uneven or contain a hole, so that the sucker and the first surface cannot form a closed space, and the grabbing tool cannot grab the object in the grabbing pose corresponding to that grabbing point. If the grabbing quality label generated in this case is a positive sample, it can be corrected to a negative sample through manual monitoring.
S580, outputting the grabbing data set.
The grabbing data set comprises information of the pose of the grabbing tool, grabbing quality labels of grabbing points and point cloud information of the first surface.
The information of the pose of the grabbing tool can be determined through grabbing points and the corresponding relation between grabbing points and the pose of the grabbing tool. For example, the pose of the gripping tool in the camera coordinate system can be determined by the position and normal vector of the gripping point, and then the pose of the gripping tool in the camera coordinate system is converted into the pose of the gripping tool in the mechanical arm base coordinate system by the hand-eye conversion function.
Embodiments of the present application also provide an apparatus for implementing any of the above methods, e.g., an apparatus comprising means to implement the steps performed in any of the above methods.
Fig. 8 is a schematic block diagram of an apparatus for constructing a grabbing data set according to an embodiment of the present application. The apparatus 4000 shown in fig. 8 includes an acquisition unit 4010, a processing unit 4020, and a transceiver unit 4030.

The acquisition unit 4010 and the processing unit 4020 may be configured to execute the method of constructing a grabbing data set according to the embodiments of the present application.
An acquisition unit 4010 is used for acquiring a 3D model of the object and a model of the grasping tool.
The processing unit 4020 is configured to determine a grabbing quality label of a grabbing point of the 3D model of the object according to an analysis criterion, where the analysis criterion includes a shape closed analysis criterion, and the shape closed analysis criterion is used to evaluate a flatness of a first surface including the grabbing point on the 3D model of the object, where the grabbing point has a corresponding relationship with a pose of a grabbing tool.
The transceiver unit 4030 is configured to output a capture data set, where the capture data set includes information of a pose of a capture tool, a capture quality tag of a capture point, and point cloud information of the first surface.
Optionally, as an embodiment, the gripping tool comprises a suction cup.
Alternatively, as one embodiment, the flatness of the first surface is determined from the angle between the normal vectors of points on the first surface.
Optionally, as an embodiment, the first surface is a projection of the gripping tool on a 3D model of the object.
Optionally, as an embodiment, the edge of the first surface is outside the projected contour of the gripping tool on the 3D model of the object.
Optionally, as an embodiment, the analysis criteria further includes a grabbing force analysis criterion, where the grabbing force analysis criterion is used to evaluate a difference between a predicted stress situation of the grabbing point and a stress situation of the grabbing point under a preset condition.
Optionally, as an embodiment, the difference is: Q=min{||G·W-Φ||+Ω}, where Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min is the minimum Euclidean distance formula, G is the forces and moments of the grabbing point in multiple dimensions, W is the weights of the grabbing point in those dimensions, Φ is the forces and moments of the grabbing point in those dimensions under the preset condition, and Ω is a regularization term.
Alternatively, as an embodiment, the 3D model of the object is obtained from the RGB image via a three-dimensional reconstruction.
Alternatively, as an embodiment, the 3D model of the object is obtained by manual monitoring to delete or correct the 3D model of the three-dimensional reconstruction failure.
Alternatively, as an embodiment, the grabbing quality label of the grabbing point is obtained after manually monitoring to delete or correct the wrong grabbing quality label.
It should be understood that the division of the units in the above apparatus is only a division of a logic function, and may be fully or partially integrated into one physical entity or may be physically separated. Furthermore, units in the apparatus may be implemented in the form of processor-invoked software; the device comprises, for example, a processor, which is connected to a memory, in which instructions are stored, the processor calling the instructions stored in the memory to implement any of the above methods or to implement the functions of the units of the device, wherein the processor is, for example, a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU) or microprocessor, and the memory is a memory within the device or a memory outside the device. Alternatively, the units in the apparatus may be implemented in the form of hardware circuits, and the functions of some or all of the units may be implemented by the design of hardware circuits, which may be understood as one or more processors; for example, in one implementation, the hardware circuit is an application-specific integrated circuit (ASIC), and the functions of some or all of the above units are implemented by the design of the logic relationships of the elements within the circuit; for another example, in another implementation, the hardware circuit may be implemented by a programmable logic device (programmable logic device, PLD), for example, a field programmable gate array (Field Programmable Gate Array, FPGA), which may include a large number of logic gates, and the connection relationship between the logic gates is configured by a configuration file, so as to implement the functions of some or all of the above units. All units of the above device may be realized in the form of processor calling software, or in the form of hardware circuits, or in part in the form of processor calling software, and in the rest in the form of hardware circuits.
In an embodiment of the present application, the processor is a circuit with signal processing capability. In one implementation, the processor may be a circuit with instruction reading and running capability, such as a central processing unit (Central Processing Unit, CPU), a microprocessor, a graphics processor (graphics processing unit, GPU) (which may be understood as a microprocessor), or a digital signal processor (digital signal processor, DSP), etc.; in another implementation, the processor may implement a function through the logical relationship of a hardware circuit that is fixed or reconfigurable, e.g., a hardware circuit implemented as an application-specific integrated circuit (ASIC) or a programmable logic device (programmable logic device, PLD), such as an FPGA. In a reconfigurable hardware circuit, the process in which the processor loads a configuration document to implement the configuration of the hardware circuit may be understood as a process in which the processor loads instructions to implement the functions of some or all of the above units. Furthermore, a hardware circuit designed for artificial intelligence may be used, which may be understood as an ASIC, such as a neural network processing unit (Neural Network Processing Unit, NPU), a tensor processing unit (Tensor Processing Unit, TPU), a deep learning processing unit (Deep learning Processing Unit, DPU), etc.
It will be seen that each of the units in the above apparatus may be one or more processors (or processing circuits) configured to implement the above method, for example: CPU, GPU, NPU, TPU, DPU, microprocessor, DSP, ASIC, FPGA, or a combination of at least two of these processor forms.
Furthermore, the units in the above apparatus may be integrated together in whole or in part, or may be implemented independently. In one implementation, these units are integrated together and implemented in the form of a system-on-a-chip (SOC). The SOC may include at least one processor for implementing any of the methods above or for implementing the functions of the units of the apparatus, where the at least one processor may be of different types, including, for example, a CPU and an FPGA, a CPU and an artificial intelligence processor, a CPU and a GPU, and the like.
Fig. 9 is a schematic hardware structure of a computing device according to an embodiment of the present application. The computing device 5000 as shown in fig. 9 includes a memory 5001, a processor 5002, a communication interface 5003, and a bus 5004. The memory 5001, the processor 5002, and the communication interface 5003 are communicatively connected to each other via a bus 5004.
The memory 5001 may be a read-only memory (read only memory, ROM), a static storage device, a dynamic storage device, or a random access memory (random access memory, RAM). The memory 5001 may store a program; when the program stored in the memory 5001 is executed by the processor 5002, the processor 5002 performs the steps of the method of constructing a grabbing data set of the embodiments of the present application.

The processor 5002 may employ a general-purpose CPU, microprocessor, ASIC, GPU, or one or more integrated circuits for executing associated routines to implement the method of constructing a grabbing data set according to the embodiments of the present application.

The processor 5002 may also be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the method of constructing a grabbing data set of the present application may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 5002.

The processor 5002 may also be a general purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 5001, and the processor 5002 reads the information in the memory 5001 and, in combination with its hardware, performs the functions required to be performed by the units included in the computing device shown in fig. 9, or performs the method of constructing a grabbing data set according to the method embodiments of the present application.
The communication interface 5003 enables communication between the computing device 5000 and other devices or communication networks using a transceiving means such as, but not limited to, a transceiver. For example, the 3D model of the object and the model of the grabbing tool may be acquired through the communication interface 5003; likewise, the grabbing data set may be output through the communication interface 5003.
Bus 5004 may include a path for transferring information between various components of computing device 5000 (e.g., memory 5001, processor 5002, communications interface 5003).
The embodiments of the present application also provide a computer-readable medium storing program code for execution by a device, the program code comprising instructions for performing the method of constructing a grabbing data set in the embodiments of the present application.
The embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of constructing a grabbing data set in the embodiments of the present application.
The embodiments of the present application also provide a chip comprising a processor and a data interface, wherein the processor reads, through the data interface, instructions stored in a memory to perform the method of constructing a grabbing data set in the embodiments of the present application.
Optionally, as an implementation manner, the chip may further include a memory storing instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor performs the method of constructing a grabbing data set in the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, and B alone, where A and B may be singular or plural. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, as understood from the context.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of" the following items or the like means any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be singular or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
The foregoing is merely a specific implementation of the present application, and the protection scope of the present application is not limited thereto; any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
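Purely as an illustrative sketch (not the patented implementation), the shape-closure criterion described above, which evaluates the flatness of the first surface from the angles between normal vectors of points on that surface, could be approximated as follows. The threshold value, function names, and the use of the mean normal as reference are assumptions for illustration only:

```python
import numpy as np

def flatness_angle(normals: np.ndarray) -> float:
    """Largest angle (radians) between any point normal on the candidate
    surface and the mean surface normal; a small angle indicates a flat
    surface suitable for suction."""
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    mean = unit.mean(axis=0)
    mean /= np.linalg.norm(mean)
    cosines = np.clip(unit @ mean, -1.0, 1.0)
    return float(np.arccos(cosines).max())

def shape_closure_label(normals: np.ndarray, max_angle: float = 0.2) -> bool:
    # Label the grabbing point as acceptable when every normal deviates
    # from the mean normal by less than the (assumed) angle threshold.
    return flatness_angle(normals) < max_angle
```

In this reading, a planar patch (all normals parallel) yields an angle of zero and a positive label, while a patch spanning an edge or a strongly curved region yields a large angle and a negative label.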

Claims (23)

1. A method of constructing a grabbing data set, comprising:
acquiring a 3D model of an object and a model of a grabbing tool;
determining a grabbing quality label of grabbing points of the 3D model of the object according to an analysis criterion, wherein the analysis criterion comprises a shape closure analysis criterion, the shape closure analysis criterion is used for evaluating the flatness of a first surface, comprising the grabbing points, on the 3D model of the object, and the grabbing points have a correspondence with the pose of the grabbing tool;
and outputting the grabbing data set, wherein the grabbing data set comprises information of the pose of the grabbing tool, grabbing quality labels of grabbing points and point cloud information of the first surface.
2. The method according to claim 1, wherein the grabbing tool comprises a suction cup.
3. The method according to claim 1 or 2, wherein the flatness of the first surface is determined from the angles between the normal vectors of points on the first surface.
4. The method according to any one of claims 1 to 3, wherein the first surface is a projection of the grabbing tool onto the 3D model of the object.
5. The method according to any one of claims 1 to 3, wherein the edge of the first surface lies outside the projected contour of the grabbing tool on the 3D model of the object.
6. The method according to any one of claims 1 to 5, wherein the analysis criterion further comprises a grabbing force analysis criterion, the grabbing force analysis criterion being used for evaluating a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition.
7. The method of claim 6, wherein the difference is:
Q=min{||G·W-Φ||+Ω},
wherein Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min{·} denotes taking the minimum, ||·|| denotes the Euclidean distance, G is the forces and moments of the grabbing point in multiple dimensions, W is the weights of the grabbing point in the multiple dimensions, Φ is the forces and moments of the grabbing point in the multiple dimensions under the preset condition, and Ω is a regularization term.
8. The method according to any one of claims 1 to 7, wherein the 3D model of the object is obtained from RGB images via three-dimensional reconstruction.
9. The method according to any one of claims 1 to 8, wherein the 3D model of the object is obtained after manual monitoring is performed to delete or correct 3D models for which the three-dimensional reconstruction failed.
10. The method according to any one of claims 1 to 9, wherein the grabbing quality labels of the grabbing points are obtained after manual monitoring to delete or correct erroneous grabbing quality labels.
11. An apparatus for constructing a grabbing data set, comprising:
an acquisition unit, configured to acquire a 3D model of an object and a model of a grabbing tool;
a processing unit, configured to determine grabbing quality labels of grabbing points of the 3D model of the object according to an analysis criterion, wherein the analysis criterion comprises a shape closure analysis criterion, the shape closure analysis criterion is used for evaluating the flatness of a first surface, comprising the grabbing points, on the 3D model of the object, and the grabbing points have a correspondence with the pose of the grabbing tool;
and a transceiving unit, configured to output the grabbing data set, wherein the grabbing data set comprises information of the pose of the grabbing tool, the grabbing quality labels of the grabbing points, and point cloud information of the first surface.
12. The apparatus according to claim 11, wherein the grabbing tool comprises a suction cup.
13. The apparatus according to claim 11 or 12, wherein the flatness of the first surface is determined from the angles between the normal vectors of points on the first surface.
14. The apparatus according to any one of claims 11 to 13, wherein the first surface is a projection of the grabbing tool onto the 3D model of the object.
15. The apparatus according to any one of claims 11 to 13, wherein an edge of the first surface lies outside the projected contour of the grabbing tool on the 3D model of the object.
16. The apparatus according to any one of claims 11 to 15, wherein the analysis criterion further comprises a grabbing force analysis criterion, the grabbing force analysis criterion being used for evaluating a difference between a predicted stress condition of the grabbing point and a stress condition of the grabbing point under a preset condition.
17. The apparatus of claim 16, wherein the difference is:
Q=min{||G·W-Φ||+Ω},
wherein Q is the difference between the predicted stress condition of the grabbing point and the stress condition of the grabbing point under the preset condition, min{·} denotes taking the minimum, ||·|| denotes the Euclidean distance, G is the forces and moments of the grabbing point in multiple dimensions, W is the weights of the grabbing point in the multiple dimensions, Φ is the forces and moments of the grabbing point in the multiple dimensions under the preset condition, and Ω is a regularization term.
18. The apparatus according to any one of claims 11 to 17, wherein the 3D model of the object is obtained from RGB images via three-dimensional reconstruction.
19. The apparatus according to any one of claims 11 to 18, wherein the 3D model of the object is obtained after manual monitoring is performed to delete or correct 3D models for which the three-dimensional reconstruction failed.
20. The apparatus according to any one of claims 11 to 19, wherein the grabbing quality labels of the grabbing points are obtained after manual monitoring to delete or correct erroneous grabbing quality labels.
21. A computer readable medium, characterized in that the computer readable medium stores a program code which, when run on a computer, causes the computer to perform the method of any of claims 1 to 10.
22. A chip system, comprising: at least one processor and a memory, the at least one processor coupled with the memory for reading and executing instructions in the memory to perform the method of any of claims 1-10.
23. A computing device, comprising: at least one processor and a memory, the at least one processor coupled with the memory for reading and executing instructions in the memory to perform the method of any of claims 1-10.
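Purely as an illustrative sketch (not the patented implementation), the quality metric Q = min{||G·W − Φ|| + Ω} of claims 7 and 17 can be read as minimizing the Euclidean distance between the predicted wrench G·W and the preset wrench Φ over the weights W. The least-squares solver below is an assumption about how that minimum might be computed, with Ω treated as a fixed constant:

```python
import numpy as np

def grasp_quality(G: np.ndarray, Phi: np.ndarray, omega: float = 0.0) -> float:
    """Q = min_W { ||G @ W - Phi|| + omega }.

    G     : (d, k) forces and moments of the grabbing point in d dimensions
    Phi   : (d,)   forces and moments under the preset condition
    omega : regularization term added to the minimized residual (assumed constant)
    """
    # Least squares yields the weights W that minimize the Euclidean residual.
    W, *_ = np.linalg.lstsq(G, Phi, rcond=None)
    return float(np.linalg.norm(G @ W - Phi) + omega)
```

Under this reading, a smaller Q means the predicted stress condition is closer to the preset one, so the corresponding grabbing point would receive a better grabbing quality label.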
CN202210600292.7A 2022-05-30 2022-05-30 Method and device for constructing grabbing data set Pending CN117197230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210600292.7A CN117197230A (en) 2022-05-30 2022-05-30 Method and device for constructing grabbing data set


Publications (1)

Publication Number Publication Date
CN117197230A 2023-12-08

Family

ID=88987407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210600292.7A Pending CN117197230A (en) 2022-05-30 2022-05-30 Method and device for constructing grabbing data set

Country Status (1)

Country Link
CN (1) CN117197230A (en)


Legal Events

Date Code Title Description
PB01 Publication