CN114800511A - Dual-stage mechanical arm grabbing planning method and system based on multiplexing structure - Google Patents


Info

Publication number
CN114800511A
Authority
CN
China
Prior art keywords
grabbing
posture
attitude
point cloud
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210489365.XA
Other languages
Chinese (zh)
Other versions
CN114800511B (en)
Inventor
彭刚
王浩
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202210489365.XA
Publication of CN114800511A
Application granted
Publication of CN114800511B
Legal status: Active
Anticipated expiration

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1602 Programme controls characterised by the control system, structure, architecture
    • B25J9/1605 Simulation of manipulator lay-out, design, modelling of manipulator
    • B25J9/1612 Programme controls characterised by the hand, wrist, grip control
    • B25J9/1656 Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664 Programme controls characterised by motion, path, trajectory planning
    • B25J9/1694 Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697 Vision controlled systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Orthopedic Medicine & Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dual-stage mechanical arm grabbing planning method and system based on a multiplexing structure. The method comprises the following steps: changing the grabbing scene to acquire multi-view data, generating grabbing postures to form a grabbing posture prediction data set, and training a grabbing posture prediction network with a multiplexing structure until convergence to obtain a grabbing posture prediction model; taking point clouds from the grabbing posture prediction data set as a grabbing posture evaluation data set, and training a grabbing posture evaluation network with a multiplexing structure until convergence to obtain a grabbing posture evaluation model; inputting the single-view point cloud of a scene to be grabbed into the grabbing posture prediction model, inputting the predicted grabbing postures into the grabbing posture evaluation model to obtain quality scores, sorting by quality score, and selecting the top K grabbing postures to guide the mechanical arm to grab. Through two-stage deep-learning grabbing planning based on the multiplexing structure, the method achieves robust grabbing of unknown objects in multi-target stacking scenes.

Description

Dual-stage mechanical arm grabbing planning method and system based on multiplexing structure
Technical Field
The invention belongs to the technical field of robot application, and particularly relates to a dual-stage mechanical arm grabbing planning method and system based on a multiplexing structure.
Background
In recent years, industrial robots have been used in welding, painting, palletizing, assembling, machining and other manufacturing processes. However, most current industrial applications of robots rely on a structured production environment: fixed trajectory points are collected by manual teaching, and the robot completes specific actions according to these preset points. When the production environment of a factory changes, a large amount of time must be spent re-collecting trajectory points, which greatly increases production cost. Meanwhile, demand from unstructured environments such as warehouse logistics and home settings keeps growing, and making robots intelligent has become the mainstream direction. The robot grabbing task is especially common and important, and is a research hotspot in the current robotics field, so in-depth research on robot grabbing technology is significant for advancing robot intelligence and improving production efficiency.
Existing research shows that grabbing methods based on depth vision have better universality, flexibility and performance, but the robustness of a robot grabbing task is affected by factors such as task environment, algorithm performance, data processing method and data quality. In-depth research on depth-vision-based robot grabbing planning and the design of robust algorithm models are therefore important for advancing intelligent robot grabbing operation.
Therefore, the prior art suffers from unreasonable grabbing strategies, poor algorithm robustness and a low grabbing success rate.
Disclosure of Invention
Aiming at the above defects or improvement requirements of the prior art, the invention provides a dual-stage mechanical arm grabbing planning method and system based on a multiplexing structure, so as to solve the technical problems of unreasonable grabbing strategies, poor algorithm robustness and low grabbing success rate in the prior art.
In order to achieve the above object, according to an aspect of the present invention, there is provided a dual-stage manipulator grabbing planning method based on a multiplexing structure, including:
inputting the single-view point cloud of a scene to be grabbed into a grabbing posture prediction model to obtain a grabbing posture for each point in the point cloud, forming a grabbing posture set; inputting the grabbing postures in the set into a grabbing posture evaluation model; sorting the grabbing postures by the resulting quality scores; and selecting the top K grabbing postures to guide the mechanical arm in the grabbing operation;
the grabbing posture prediction model is obtained by training in the following mode:
acquiring multi-view data by changing a grabbing scene, performing three-dimensional reconstruction on the multi-view data to acquire scene complete point cloud, generating grabbing postures by using the scene complete point cloud to form a grabbing posture prediction data set, and training a grabbing posture prediction network to be convergent by using the grabbing posture prediction data set to obtain a grabbing posture prediction model;
the grabbing posture evaluation model is obtained by training in the following mode:
taking the point cloud which does not collide with the gripper at the tail end of the robot and is positioned in the gripping range of the gripper at the tail end of the robot in the gripping posture prediction data set as a gripping posture evaluation data set, and training a gripping posture evaluation network to be convergent by using the gripping posture evaluation data set to obtain a gripping posture evaluation model;
the grabbing posture prediction network and the grabbing posture evaluation network both adopt a multiplexing structure, in which, for every layer after the first, the input of the layer is connected with either the input data of the first layer or the output of the previous layer.
Furthermore, in the grabbing posture prediction network, the input of each layer after the first is connected with the input data of the first layer.
Further, in the grabbing posture evaluation network, the input of each layer after the first is connected with the output of the previous layer.
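As a concrete illustration of the two multiplexing variants just described, the following toy numpy sketch (random weights and illustrative layer sizes, not the patent's PointNet++-based networks) shows how each layer after the first concatenates either the first layer's input data or a shallower layer's output onto its own input:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward_multiplex(x0, layer_dims, reuse="input", seed=0):
    """Toy fully connected forward pass illustrating the multiplexing
    structure: every layer after the first receives, concatenated to its
    normal input, either the first layer's input data (reuse='input', as
    stated for the prediction network) or the output of the layer before
    the previous one (reuse='prev', one reading of 'connected with the
    output of the previous layer'). Random weights only; a sketch, not
    the patent's network."""
    rng = np.random.default_rng(seed)
    h, h_before = x0, x0
    for i, out_dim in enumerate(layer_dims):
        if i == 0:
            z = h
        else:
            extra = x0 if reuse == "input" else h_before
            z = np.concatenate([h, extra], axis=1)  # reuse shallow features
        W = rng.standard_normal((z.shape[1], out_dim)) * 0.1
        h_before, h = h, relu(z @ W)
    return h
```

The extra concatenation grows each layer's input width slightly, which matches the text's claim that the forward-propagation time increase is small.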
Further, the multi-view data comprises a camera pose and a depth image, and is acquired in the following manner:
for a selected grabbing posture, executing the grabbing task, reducing the number of objects in the scene and thereby changing the grabbing scene;
for a selected grabbing posture, pushing the object horizontally a certain distance from the periphery toward the center, changing the object's position and thereby the grabbing scene;
for a selected grabbing posture, performing the grabbing action and then releasing the object above the center of the scene so that it falls freely, thereby changing the grabbing scene;
one of the above scene-changing modes is selected at random; if a grabbing posture is executed successfully, multi-view data acquisition continues; if no grabbing posture can be planned successfully, a mode is selected at random from the remaining scene-changing modes; and if no grabbing posture can be executed at all, the current round of data acquisition ends.
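The mode-selection logic of the three scene-changing strategies above can be sketched as follows; `try_execute` and `make_stub` are hypothetical stand-ins, since the real routine involves planning and executing grasps on the robot:

```python
import random

def run_collection_round(modes, try_execute, seed=0):
    """Control flow of one self-supervised data-collection round: pick a
    scene-changing mode at random; each successful grasp yields another
    multi-view acquisition; a mode under which no grasp can be planned
    is dropped and a remaining mode is picked; the round ends when no
    mode yields an executable grasp."""
    rng = random.Random(seed)
    remaining = list(modes)
    views = 0
    while remaining:
        mode = rng.choice(remaining)
        if try_execute(mode):
            views += 1                # acquire another multi-view sample
        else:
            remaining.remove(mode)    # this mode is exhausted
    return views

def make_stub(budget):
    """Toy executor: each mode succeeds a fixed number of times."""
    counts = dict(budget)
    def try_execute(mode):
        if counts[mode] > 0:
            counts[mode] -= 1
            return True
        return False
    return try_execute
```

With the stub, a round collects exactly as many views as the modes' combined success budget, regardless of selection order.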
Further, the specific way of generating grabbing postures from the scene complete point cloud includes:
randomly sampling n points from a single-view point cloud C' of the scene complete point cloud C to obtain a point set P; for each point in P, obtaining through a radius query the set of normals of all points within a certain radius in the scene complete point cloud C, and using it to establish a local coordinate system;
for each point in P, obtaining through a radius query the set P' of all points within a certain radius in the scene complete point cloud C; rotating the local coordinate system around the y axis to obtain a grabbing coordinate system; transforming each point in P' from the world coordinate system into the grabbing coordinate system; and selecting the points of P' that lie within the height range of the closed region of the robot end gripper to form a local-region point set;
retreating the grabbing coordinate system a certain distance along the x axis away from the grabbed object to obtain the initial grabbing posture position, and arranging several groups of gripper finger positions along the y axis; when the point cloud in the local-region point set does not collide with the gripper finger model and the closed region between the two parallel fingers of the gripper contains points of the cloud, that group of finger positions is taken as a y-direction grabbing position of the grabbing posture of the point;
selecting the central position from all y-direction grabbing positions of the point's grabbing posture, then advancing along the x axis of the grabbing coordinate system in fixed steps until the point cloud collides with the two-finger gripper model, and taking the advance position reached at that moment as the x-direction grabbing position of the grabbing posture of the point.
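A minimal numpy sketch of the finger-position test described above: a y offset is valid when both fingers are collision-free and the closed region between them contains points. All gripper dimensions are illustrative parameters, not the patent's values, and points are assumed to already be in the grabbing coordinate system:

```python
import numpy as np

def valid_y_slots(points, y_offsets, finger_w, opening, depth, height):
    """For each candidate y offset of the two-finger gripper, check the
    two conditions from the text: (1) no point falls inside either
    finger volume, (2) the closed region between the fingers contains
    points. Returns the valid offsets."""
    valid = []
    for dy in y_offsets:
        y = points[:, 1] - dy
        in_depth = (points[:, 0] >= 0) & (points[:, 0] <= depth)
        in_height = np.abs(points[:, 2]) <= height / 2
        # closed region between the two fingers
        closed = in_depth & in_height & (np.abs(y) <= opening / 2)
        # finger volumes on either side of the closed region
        finger = (in_depth & in_height
                  & (np.abs(y) > opening / 2)
                  & (np.abs(y) <= opening / 2 + finger_w))
        if closed.any() and not finger.any():
            valid.append(dy)
    return valid
```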
Further, the training of the grabbing posture prediction model further comprises:
performing quality scoring on the grabbing posture of each point in the local-region point set, the quality score being calculated as follows:
finding the maximum value max and the minimum value min of the local-region point cloud along the y axis of the grabbing posture; counting the points satisfying y > max − thr and the points satisfying y < min + thr as two groups of contact points, where thr is a distance threshold; and taking the mean position of each group as its proxy contact point;
for the vector v formed by the two proxy contact points, counting within each contact group the number of points whose normal makes an angle θ with v smaller than a preset angle value; if this number is larger than a set value, the contact group is considered to satisfy the force-closure condition for friction-cone angle θ; obtaining the minimum qualifying angle θ_left for the left contact point and θ_right for the right contact point;
calculating the final grabbing posture quality evaluation score from score_left, score_right and score_y (the combining formula appears in the source only as an image and is not reproduced), wherein score_left is the score of the left contact point, score_right is the score of the right contact point, and score_y is the score of the line connecting the left and right contact points.
The final grabbing posture quality evaluation score calculated here serves as the real score in subsequent training.
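The contact-point extraction and normal-angle check above can be sketched as follows; the thresholds are illustrative, and the final score-combining formula (an image in the source) is not reproduced:

```python
import numpy as np

def proxy_contacts(points, normals, thr=0.005,
                   max_angle=np.deg2rad(30), min_count=3):
    """Split the local-region cloud (in the grasp frame) into left and
    right contact groups near the y extremes, average each group into a
    proxy contact, and check how many group normals align with the
    vector v between the proxies (a cheap force-closure surrogate)."""
    y = points[:, 1]
    right_mask = y > y.max() - thr
    left_mask = y < y.min() + thr
    c_left = points[left_mask].mean(axis=0)
    c_right = points[right_mask].mean(axis=0)
    v = c_right - c_left
    v = v / np.linalg.norm(v)

    def aligned(ns):
        cos = np.abs(ns @ v) / np.linalg.norm(ns, axis=1)
        return int((np.arccos(np.clip(cos, -1, 1)) < max_angle).sum())

    ok = (aligned(normals[left_mask]) >= min_count
          and aligned(normals[right_mask]) >= min_count)
    return c_left, c_right, ok
```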
Further, the training of the grabbing posture prediction model further comprises:
training the grabbing posture prediction network with the grabbing posture prediction data set, taking: the error between the predicted grabbing posture position offset and the real offset as the offset loss function; the error between the predicted x-direction unit vector of the grabbing posture and the real x-direction vector as the x-direction loss function; the error between the predicted y-direction unit vector and the real y-direction vector as the y-direction loss function; the error between the predicted and real grabbing posture widths as the mean-square-error loss function; and the error between the predicted and real grabbing posture quality evaluation scores as the quality evaluation loss function; the grabbing posture prediction network is trained until convergence with the objective of minimizing the offset, x-direction, y-direction, mean-square-error and quality evaluation loss functions.
The predicted position offset, x-direction unit vector, y-direction unit vector, width and quality evaluation score are all outputs of the grabbing posture prediction model, while the real offset, real x-direction vector, real y-direction vector, real width and real score are obtained by the corresponding calculations above.
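Assuming a plain unweighted sum of mean-squared errors (the patent does not state the exact term weighting at this point), the five-term training objective can be sketched as:

```python
import numpy as np

def grasp_prediction_loss(pred, target):
    """Illustrative combination of the five loss terms named above
    (offset, x-direction, y-direction, width, quality score), each as a
    mean squared error between prediction and ground truth. Returns the
    total loss and the per-term values."""
    terms = {}
    for key in ("offset", "x_dir", "y_dir", "width", "score"):
        diff = np.asarray(pred[key], float) - np.asarray(target[key], float)
        terms[key] = float(np.mean(diff ** 2))
    return sum(terms.values()), terms
```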
Further, the quality evaluation loss function weights the score error by sample group (its formula appears in the source only as an image and is not reproduced). In it, i, j and k index the point sets n_1, n_2 and n_3 containing, respectively, the points whose predicted grabbing posture quality evaluation score is greater than a set large score threshold, between the set small and large score thresholds, and smaller than the set small score threshold; n is the total number of points in the local-region point set; s_i, s_j and s_k are the predicted quality evaluation scores for i, j and k; and ŝ_i, ŝ_j and ŝ_k are the corresponding real scores, i.e. the final grabbing posture quality evaluation scores calculated by the scoring formula. The predicted scores are produced by the grabbing posture prediction model.
Further, the point cloud coordinates (x, y, z), expressed in the grabbing coordinate system, lie within the clamping range of the robot end gripper when:
−outer_diameter/2 ≤ y ≤ outer_diameter/2,  0 ≤ x ≤ hand_depth,  −hand_height/2 ≤ z ≤ hand_height/2
wherein outer_diameter is the outer width of the gripper, hand_depth is the width of the closed area of the gripper, and hand_height is the height of the closed area of the gripper.
According to another aspect of the present invention, there is provided a dual-stage mechanical arm grabbing planning system based on a multiplexing structure, including:
the grabbing attitude prediction model training module is used for changing a grabbing scene to acquire multi-view data, performing three-dimensional reconstruction on the multi-view data to acquire scene complete point cloud, generating grabbing attitudes by using the scene complete point cloud to form a grabbing attitude prediction data set, and training a grabbing attitude prediction network to converge by using the grabbing attitude prediction data set to obtain a grabbing attitude prediction model;
the grabbing posture evaluation model training module is used for taking point clouds which do not collide with the robot tail end clamp holder in the grabbing posture prediction data set and are located in the clamping range of the robot tail end clamp holder as a grabbing posture evaluation data set, and training a grabbing posture evaluation network to be convergent by using the grabbing posture evaluation data set to obtain a grabbing posture evaluation model;
the system comprises a grabbing planning module, a grabbing attitude estimation module and a control module, wherein the grabbing planning module is used for inputting single-view-point cloud of a scene to be grabbed into a grabbing attitude prediction model to obtain grabbing attitudes corresponding to each point in the point cloud to form a grabbing attitude set, inputting the grabbing attitudes in the grabbing attitude set into a grabbing attitude estimation model, sequencing the grabbing attitudes in the obtained grabbing attitude set according to quality scores and selecting K grabbing attitudes which are ranked at the top for guiding a mechanical arm to grab operation;
the grabbing posture prediction network and the grabbing posture evaluation network both adopt a multiplexing structure, and the multiplexing structure indicates that except the first layer network, the input of each subsequent layer network is connected with the input data of the first layer network or the output of the previous layer network.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) The invention acquires real data in a self-supervised manner and constructs real grabbing posture prediction and evaluation data sets, avoiding the dependence of existing mechanical arm grabbing planning methods on pre-built object data sets, so that the grabbing strategy given by the grabbing posture prediction and evaluation models is more reasonable. The invention constructs grabbing posture prediction and evaluation networks based on a multiplexing structure, improving the utilization of shallow-network data by connecting the input of the current layer with the output of a shallower layer. High-quality grabbing postures are predicted by the grabbing posture prediction model and then given a second quality evaluation by the grabbing posture evaluation model, improving score prediction accuracy. The invention can guide a mechanical arm to robustly grab unknown objects in unstructured environments and improves the grabbing success rate.
(2) In the grabbing posture prediction network, the input of each layer after the first is connected with the input data of the first layer; compared with the PointNet++ base network, this improves performance in several respects, reducing the approach-angle and width errors by 6.5% and 6% respectively, while the increase in forward-propagation time of the multiplexed network over the PointNet++ base network is very small.
(3) In the grabbing posture evaluation network, the input of each layer after the first is connected with the output of the previous layer; this multiplexing structure improves recall and precision by 2.4% and 1.8% respectively, improving network performance while preserving real-time operation.
(4) The invention changes the grabbing scene in several ways and thus acquires diverse, realistic data, which transfers better to real grabbing environments and improves the grabbing success rate. The core idea of the grabbing posture generation method is that the x direction of the grabbing posture at a point is parallel to the surface normal of the point cloud at that point, which matches human intuition and yields high-quality grabbing postures.
(5) The invention simplifies the complex force-closure condition into the angle between each contact point's normal and the contact vector, plus the angle between the line connecting the left and right contact points and the y axis, so that grabbing posture quality can be evaluated efficiently. Collision detection is performed on the predicted grabbing postures to filter out low-quality ones, after which the point cloud within the closed region of the grabbing posture is cropped.
(6) The invention improves training accuracy through the constraints of multiple loss functions, making predictions more accurate. High- and low-quality grabbing posture samples in a scene point cloud are imbalanced, with high-quality samples in the minority; the loss function raises the loss weight of high-quality grabbing postures in proportion, improving their prediction accuracy.
Drawings
Fig. 1 is a flowchart of a two-stage mechanical arm grabbing planning method based on a multiplexing structure according to an embodiment of the present invention;
FIG. 2 is a point cloud obtaining method for a closed area of a gripper according to an embodiment of the present invention;
fig. 3 (a) is a schematic diagram of a first multiplexing structure provided in the embodiment of the present invention;
fig. 3 (b) is a schematic diagram of a second multiplexing structure provided in the embodiment of the present invention;
fig. 4 is a diagram of a grab attitude prediction network structure according to an embodiment of the present invention;
fig. 5 is a diagram of a grab pose estimation network structure according to an embodiment of the present invention;
FIG. 6 is a schematic view of a gripper two-finger model provided by an embodiment of the present invention;
FIG. 7 (a1) is a color diagram of a single-object scene according to an embodiment of the present invention;
fig. 7 (b1) is a point cloud diagram of a single-object scene provided by an embodiment of the present invention;
fig. 7 (c1) is a diagram illustrating the result of the single-object scene grabbing gesture according to the embodiment of the present invention;
FIG. 7 (a2) is a color diagram of a multi-object scene provided by an embodiment of the present invention;
fig. 7 (b2) is a point cloud diagram of a multi-object scene provided by an embodiment of the present invention;
fig. 7 (c2) is a diagram illustrating the multi-object scene capture pose result provided by the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in fig. 1, a dual-stage mechanical arm grabbing planning method based on a multiplexing structure includes:
inputting the single-view point cloud of a scene to be grabbed into the grabbing posture prediction model to obtain a grabbing posture for each point in the point cloud, forming a grabbing posture set; inputting the grabbing postures in the set into the grabbing posture evaluation model; sorting the grabbing postures by the resulting quality scores; and selecting the top K grabbing postures to guide the mechanical arm to grab;
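The two-stage planning step above reduces to predict, score, rank and select; in this sketch `predict_model` and `evaluate_model` are hypothetical callables standing in for the two trained networks:

```python
import numpy as np

def plan_grasps(points, predict_model, evaluate_model, k=5):
    """Stage one predicts a grasp pose per point; stage two re-scores
    each pose; the top-K poses by evaluated quality are returned to
    guide the arm."""
    poses = [predict_model(p) for p in points]             # stage 1
    scores = np.array([evaluate_model(g) for g in poses])  # stage 2
    top = np.argsort(scores)[::-1][:k]                     # best K by score
    return [poses[i] for i in top], scores[top]
```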
the grabbing posture prediction model is obtained by training in the following way:
acquiring multi-view data by changing a grabbing scene, performing three-dimensional reconstruction on the multi-view data to acquire scene complete point cloud, generating grabbing postures by using the scene complete point cloud to form a grabbing posture prediction data set, and training a grabbing posture prediction network to be convergent by using the grabbing posture prediction data set to obtain a grabbing posture prediction model;
the grabbing posture evaluation model is obtained by training in the following mode:
taking the point cloud which does not collide with the gripper at the tail end of the robot and is positioned in the gripping range of the gripper at the tail end of the robot in the gripping posture prediction data set as a gripping posture evaluation data set, and training a gripping posture evaluation network to be convergent by using the gripping posture evaluation data set to obtain a gripping posture evaluation model;
the grabbing posture prediction network and the grabbing posture evaluation network both adopt a multiplexing structure, and the multiplexing structure indicates that except the first layer network, the input of each subsequent layer network is connected with the input data of the first layer network or the output of the previous layer network.
Specifically, the push strategy includes:
executing the grabbing task for a selected grabbing posture, reducing the number of objects in the scene and thereby changing the grabbing scene; pushing the object horizontally a certain distance from the periphery toward the center for a selected grabbing posture, changing its position and thereby the grabbing scene; performing the grabbing action for a selected grabbing posture and then releasing the object above the center of the scene so that it falls freely, thereby changing the grabbing scene;
at execution time one push strategy is selected at random; if a grabbing posture is executed successfully, multi-view data acquisition continues; if planning does not succeed, a strategy is selected at random from the remaining ones; and if no grabbing posture can be executed, the current round of data acquisition ends;
the multi-perspective data includes a camera pose and a depth image.
The three-dimensional reconstruction method comprises the following steps:
(1) Building a cuboid bounding box: in the region where three-dimensional reconstruction is required, establish a bounding box with length, width and height L, W, H.
(2) Voxelization: divide the cuboid bounding box into small cubes (voxel units) of side length υ, giving (L/υ) × (W/υ) × (H/υ) voxels.
(3) Obtain one frame of depth image depth_i and its corresponding camera pose T_i.
(4) Take a voxel g from the bounding box, convert it to a point p in the world coordinate system, compute the position v of p in the camera coordinate system using the camera pose, and finally back-project v to the corresponding pixel x in the depth image according to the camera intrinsics K.
(5) If the depth value at pixel x is val(x) and the distance from point v to the camera-coordinate origin is dist(v), the signed distance of voxel g is sdf(g) = val(x) − dist(v), and its truncated value is tsdf(g) = max(−1, min(1, sdf(g)/u)), where u is the truncation distance.
(6) The weight of voxel g is calculated as w(g) = cos(θ)/dist(v), where θ is the angle between the projection ray and the surface normal at p. Repeat steps (4) to (6) until all voxels have been traversed.
(7) Fuse the current-frame tsdf and weight w into the global TSDF and weight W:
TSDF(g) ← (W(g)·TSDF(g) + w(g)·tsdf(g)) / (W(g) + w(g)),  W(g) ← W(g) + w(g).
(8) Repeat steps (3) to (7) until all depth images have been traversed, then output the complete scene point cloud from the final TSDF model using ray casting.
By exploiting the high positioning accuracy of the mechanical arm, the obtained camera poses are highly accurate, which overcomes the problem of poor camera pose estimation in three-dimensional reconstruction; the multi-view depth maps and camera poses are then fused by the TSDF method described above.
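The truncation of step (5) and the fusion of step (7) can be sketched as follows, assuming the standard running weighted-average TSDF update that the text describes:

```python
import numpy as np

def truncate_sdf(sdf, u):
    """Step (5): divide the signed distance by the truncation distance u
    and clamp the result to [-1, 1]."""
    return np.clip(sdf / u, -1.0, 1.0)

def fuse_tsdf(tsdf_global, w_global, tsdf_frame, w_frame):
    """Step (7): weighted running-average fusion of the current frame's
    truncated signed distance into the global volume. Assumes the total
    weight w_global + w_frame is nonzero."""
    fused = (w_global * tsdf_global + w_frame * tsdf_frame) / (w_global + w_frame)
    return fused, w_global + w_frame
```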
The method for generating grabbing postures from the scene complete point cloud comprises the following steps:
(1) Preprocessing the input point cloud: for the complete point cloud C, the specific process is: 1) random sampling; 2) removing invalid points from the point cloud; 3) removing points outside the set space; 4) voxelizing the point cloud; 5) computing the point cloud surface normals; 6) redefining the normal directions. For the single-view point cloud C', n points are randomly sampled within the same process to obtain the point set P.
(2) Local coordinate system calculation: for each point in the point set P, obtain by radius query the set N = {n_1, …, n_n} of normals of all points within a certain radius in the complete point cloud C, where n is the number of points. Taking N as input, the local coordinate system of the point is computed by eigendecomposition (the exact formula appears in the source only as an image): a square matrix M is formed from the normals, its eigenvalues va are computed with an eigenvalue function and its eigenvectors xe with an eigenvector function, the max, min and index functions select the eigenvectors of the largest and smallest eigenvalues as coordinate axes, and the abs function ensures that each axis direction agrees with the normal direction.
(3) Grabbing posture region point cloud interception: for each point in the point set P, obtain the set P' of all points within a certain radius in the complete point cloud C by the radius query method. Taking P' and the coordinate system obtained in step (2) as input, the specific process is: 1) rotate the coordinate system 180 degrees around the y-axis to obtain the grabbing coordinate system; 2) transform P' from the world coordinate system to the grabbing coordinate system; 3) intercept the points in P' that satisfy the closed-region condition [condition formula not reproduced in the source] to obtain the local region point set.
(4) Gripper two-finger placement evaluation: 1) retreat the coordinate system a certain distance along the x-axis to obtain the initial grabbing posture position; 2) arrange several groups of gripper two-finger positions along the y-axis; 3) evaluate whether each two-finger position is valid according to two conditions: the local region point cloud must not collide with the two-finger model, and the closed region must contain point cloud. If no two-finger position is valid, grabbing at this point is infeasible.

(5) Grabbing posture position calculation: the aim is to obtain a suitable three-dimensional coordinate for the grabbing posture: 1) adjust in the y direction by selecting the central position among all two-finger positions satisfying the conditions of step (4); 2) adjust in the x direction by advancing along the x-axis in fixed steps until the local region point cloud collides with the two-finger model.

(6) Grabbing posture width calculation: obtain the closed-region point cloud by the gripper closed-region point cloud acquisition method, then find the maximum and minimum values of all points along the y-axis; the absolute value of their difference is the width.

(7) Grabbing posture quality evaluation: for each grabbing posture, obtain the closed-region point cloud by the gripper closed-region point cloud acquisition method, then evaluate its quality by the force-closure-based grabbing posture quality evaluation method.
The grasping posture quality evaluation method based on force closure comprises the following steps:
(1) Counting left and right contact points: first find the maximum value max and the minimum value min of the local point cloud set on the y-axis, count the points satisfying y > max - thr and the points satisfying y < min + thr respectively, and then take the mean position of each of the two contact-point groups as the proxy contact point. thr denotes a distance threshold.
(2) Contact point angle calculation: according to the vector v formed by the two proxy contact points, count in each group the number of contact points whose angle between v and the point's normal is smaller than a fixed value θ. If this number is larger than a set value min_num, the contact points are considered to satisfy the force-closure condition when the friction cone angle is θ; finally, solve for the minimum value θ_min satisfying this condition, which is associated with the quality evaluation score.
(3) Score calculation: calculate the final grabbing posture quality evaluation score according to a formula [formula image not reproduced in the source] of θ_left, θ_right and θ_y, which are respectively the minimum left contact point angle, the minimum right contact point angle, and the angle between v and the y-axis of the coordinate system.
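The contact-point statistics of steps (1) and (2) can be sketched as follows, assuming the closed-region points are an N×3 NumPy array with matching per-point normals (the names, array layout and default thresholds are assumptions of this sketch):

```python
import numpy as np

def proxy_contacts(points, thr=0.003):
    """Split closed-region points into left/right contact groups along y
    and return the mean position of each group as the proxy contact point."""
    y = points[:, 1]
    y_max, y_min = y.max(), y.min()
    right = points[y > y_max - thr]   # contacts near the +y finger
    left = points[y < y_min + thr]    # contacts near the -y finger
    return left.mean(axis=0), right.mean(axis=0)

def satisfies_force_closure(contact_normals, v, theta, min_num=2):
    """Check whether enough contact normals lie inside the friction cone
    of half-angle theta around the line v between the proxy contacts."""
    v = v / np.linalg.norm(v)
    n = contact_normals / np.linalg.norm(contact_normals, axis=1, keepdims=True)
    angles = np.arccos(np.clip(np.abs(n @ v), -1.0, 1.0))
    return int(np.sum(angles < theta)) > min_num
```

Scanning θ upward and recording the first value for which `satisfies_force_closure` holds yields the θ_min used in the score.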
As shown in fig. 2, the method for intercepting the region point cloud of the grabbing posture prediction data set is:

(1) Coordinate system transformation: because the point cloud and the grabbing postures are both in the world coordinate system and the grabbing postures are diverse, the closed-region point cloud cannot be conveniently intercepted there, so the point cloud is first transformed into each grabbing posture coordinate system;

(2) Collision detection: perform collision detection with the gripper two-finger model;

(3) Point cloud interception: intercept the collision-free point clouds according to the closed-region condition [condition formula not reproduced in the source] to obtain the original local point clouds, which form the original local point cloud set;

(4) Resampling: randomly sample each original local point cloud to a fixed number of points to form the grabbing posture evaluation data set.
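Steps (1), (3) and (4) of the interception method can be sketched as below; the 4×4 world-to-grasp transform, the box limits standing in for the unreproduced interception condition, and the omission of the collision step (2) are all illustrative assumptions:

```python
import numpy as np

def crop_closed_region(points, T_grasp, limits, n_sample=256, rng=None):
    """Transform scene points into the grasp coordinate frame, keep those
    inside the gripper's closed region, and resample to a fixed size.

    T_grasp : 4x4 homogeneous transform from world to grasp frame
    limits  : (xmin, xmax, ymin, ymax, zmin, zmax) of the closed region
    """
    rng = rng or np.random.default_rng(0)
    # Step (1): coordinate transform into the grasp frame.
    homo = np.c_[points, np.ones(len(points))]
    local = (T_grasp @ homo.T).T[:, :3]
    # Step (3): keep points inside the closed-region box.
    xmin, xmax, ymin, ymax, zmin, zmax = limits
    mask = ((local[:, 0] >= xmin) & (local[:, 0] <= xmax) &
            (local[:, 1] >= ymin) & (local[:, 1] <= ymax) &
            (local[:, 2] >= zmin) & (local[:, 2] <= zmax))
    region = local[mask]
    # Step (4): resample to a fixed number of points (with replacement).
    idx = rng.choice(len(region), size=n_sample, replace=True)
    return region[idx]
```

A real pipeline would first reject candidates whose cropped cloud intersects the two-finger model, per step (2).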
As shown in fig. 3 (a), the first multiplexing structure processing manner is: except for the first layer of network, the input of each subsequent layer of network is connected with the input data, so that the input data can be repeatedly utilized by each layer of network.
As shown in fig. 3 (b), the second multiplexing structure processing manner is: except for the first layer of network, the input of each subsequent layer of network is connected with the output of the previous layer, so that each layer of network can repeatedly utilize all previous data.
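One plausible reading of the two multiplexing variants is feature concatenation in a toy fully connected network; the layer sizes, the ReLU choice, and the exact concatenation rule for structure 2 are assumptions of this sketch, not the patent's architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward_multiplex(x, weights, mode=1):
    """Forward pass through layers that reuse earlier data.

    mode 1: every layer after the first sees [its input, original input x],
            so each layer reuses the raw input data (fig. 3(a)).
    mode 2: every layer after the first sees [its input, previous layer's
            full input], so layer k indirectly reuses all earlier features
            (fig. 3(b)).
    """
    h = relu(x @ weights[0])
    prev_in = x
    for w in weights[1:]:
        if mode == 1:
            inp = np.concatenate([h, x], axis=-1)
        else:
            inp = np.concatenate([h, prev_in], axis=-1)
        prev_in = inp
        h = relu(inp @ w)
    return h
```

Note the weight shapes differ between the modes: in mode 1 every hidden layer input has fixed size, while in mode 2 the input width grows with depth, which is why structure 2 introduces more parameters.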
As shown in FIG. 4, the grabbing posture prediction network outputs include the Offset Block, Approach Block, Close Block, Width Block, and Score Block, whose meanings and loss functions are as follows:
The Offset Block aims to predict the spatial offset of the grabbing posture; its loss function [formula image not reproduced in the source] compares Δp_pred_i, the predicted offset of the grabbing posture position for a point pc_i in the point cloud, with Δp_i, the true offset of pc_i. The offset is the offset between the grabbing posture position and the grabbed object position.
The purpose of the Approach Block is to predict the unit vector of the grabbing posture in the x direction and make it as close as possible to the real vector, so the loss function is designed as the angle difference between the predicted vector and the real vector — the smaller the angle difference, the better:

L_approach = (1/n) Σ_i arccos( norm(a_pred_i) · a_i )

where a_pred_i is the predicted x-direction vector of the grabbing posture at a point, a_i is the real Approach unit vector, and norm is the vector normalization function.
The purpose of the Close Block is to predict the unit vector of the grabbing posture in the y direction, with the same angle-difference loss:

L_close = (1/n) Σ_i arccos( norm(c_pred_i) · c_i )

where c_pred_i is the predicted y-direction vector of the grabbing posture at a point and c_i is the real y-direction vector.
The loss function of the Width Block is the mean square error:

L_width = (1/n) Σ_i (w_pred_i - w_i)²

where w_pred_i is the predicted grabbing posture width at a point and w_i is the real grabbing posture width. Similar to the Offset Block, the real width is normalized so that w_i has a value range of [0, 1].
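Under the verbal descriptions above, the Approach/Close angle losses and the Width mean-square loss can be sketched as follows (a hedged sketch only; the patent's exact formulas appear as images in the source):

```python
import numpy as np

def angle_loss(pred, true):
    """Mean angle between predicted direction vectors and true unit vectors,
    as described for the Approach and Close Blocks."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)  # norm()
    cos = np.clip(np.sum(pred * true, axis=1), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def width_loss(pred_w, true_w):
    """Mean square error between predicted and true (normalized) widths."""
    return float(np.mean((pred_w - true_w) ** 2))
```

A perfectly aligned prediction gives an angle loss of zero regardless of the predicted vector's magnitude, which is why the normalization step precedes the dot product.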
The Score Block predicts the grabbing posture quality evaluation score of each point in the point cloud; its loss function [formula image not reproduced in the source] weights the points by score band, where i, j and k respectively index the points with predicted scores s > 0.7, 0.01 < s < 0.7 and s < 0.01, and n denotes the total number of points in the point cloud.
As shown in fig. 5, the grabbing posture evaluation network outputs a Score Block, meaning the predicted grabbing posture quality evaluation score of each point in the point cloud, with the loss function [formula image not reproduced in the source] comparing s_i, the score predicted by the grabbing posture evaluation network for the intercepted point cloud i, with the real score corresponding to i, which is obtained from the force-closure-based grabbing posture quality evaluation formula.
Table 1 Training results of the grabbing posture prediction network with multiplexing structures introduced

[table content appears only as an image in the source and is not reproduced]
The point cloud network models in figs. 4 and 5 use the multiplexing structures. To evaluate the effectiveness of the proposed multiplexing structure, multiplexing structure 1 and multiplexing structure 2 were each introduced into the PointNet++-based grabbing posture prediction network; the experimental results are shown in Table 1. The data show: 1) compared with the PointNet++ baseline, the networks with multiplexing structures improve in every respect, with multiplexing structure 1 giving the best overall performance — the Approach angle error and the width error are reduced by 6.5% and 6% respectively, while the other improvements are more modest; 2) because the multiplexing structures introduce additional model parameters, their forward propagation time increases relative to the PointNet++ baseline, and multiplexing structure 2, which introduces more parameters, is slightly slower than multiplexing structure 1, but the difference is only about 10 ms.
In summary, the invention selects the PointNet++ network with multiplexing structure 1 as the grabbing posture prediction network.
Table 2 Training results of the grabbing posture evaluation network with multiplexing structures introduced

Experimental method       | Recall | Precision | Score error | Time/ms
PointNet                  | 0.822  | 0.835     | 0.093       | 2.2
Multiplexing structure 1  | 0.839  | 0.841     | 0.094       | 2.3
Multiplexing structure 2  | 0.846  | 0.853     | 0.091       | 2.4
To evaluate the effectiveness of the proposed multiplexing structure, multiplexing structure 1 and multiplexing structure 2 were each introduced into the PointNet-based grabbing posture evaluation network; the experimental results are shown in Table 2. The data show: 1) in network performance, both multiplexing structures improve on the baseline, with multiplexing structure 2 giving the largest gains — recall and precision improve by 2.4% and 1.8% respectively; 2) in real-time performance, as with the grabbing posture prediction network, PointNet, multiplexing structure 1 and multiplexing structure 2 rank from fastest to slowest, but the largest difference is only 0.2 ms.
In summary, the invention selects the PointNet network with multiplexing structure 2 as the grabbing posture evaluation network.
As shown in FIG. 6, the unshaded area is the gripper two-finger model body, and the shaded area is the closed region between the gripper's two parallel fingers. The two-finger gripper model is described by four parameters (hand_depth, hand_height, hand_width, finger_width), where hand_depth is the width of the gripper closed region, hand_height is the height of the closed region, hand_width is the length of the closed region, and finger_width is the thickness of each of the two fingers.

hand_outer_diameter, the outer width of the gripper, equals hand_width + 2 × finger_width.

The retreat distance mentioned above is hand_depth − init_bit, where init_bit is the initial bite depth of the gripper.

The advance distance is the distance the grabbing posture moves toward the scene center.

In the embodiment of the present invention, finger_width = 0.01, hand_outer_diameter = 0.11, hand_depth = 0.06, hand_height = 0.02, and init_bit = 0.01.
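The embodiment's parameter relations can be checked numerically; note that hand_width is not stated directly in the text and is derived here from the outer-diameter relation:

```python
# Two-finger gripper parameters from the embodiment (meters).
finger_width = 0.01         # thickness of each finger
hand_outer_diameter = 0.11  # outer width of the gripper
hand_depth = 0.06           # width (depth) of the closed region
hand_height = 0.02          # height of the closed region
init_bit = 0.01             # initial bite depth of the gripper

# hand_outer_diameter = hand_width + 2 * finger_width  =>  solve for hand_width.
hand_width = hand_outer_diameter - 2 * finger_width

# Retreat distance used when seeding the grasp pose: hand_depth - init_bit.
retreat = hand_depth - init_bit
```

The derived retreat distance of 0.05 m matches the 0.05 m retreat used in step (4) of Example 1's grabbing posture generation.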
Example 1
A dual-stage mechanical arm grabbing planning method based on a multiplexing structure comprises the following steps:
controlling the mechanical arm to collect multi-view data with a pushing strategy, and performing three-dimensional reconstruction to obtain the complete scene point cloud;

generating grabbing postures from the complete scene point cloud to obtain a grabbing posture prediction data set, and intercepting the region point clouds of the grabbing posture prediction data set to obtain a grabbing posture evaluation data set;

training the multiplexing-structure-based grabbing posture prediction and evaluation networks on the grabbing posture prediction and evaluation data sets to obtain the grabbing posture prediction and evaluation models;

taking the single-view point cloud of the scene to be grabbed as input to the grabbing posture prediction model, inputting the prediction result into the grabbing posture evaluation model, and obtaining a group of high-quality grabbing postures to guide the mechanical arm's grabbing.
The three-dimensional reconstruction method comprises the following steps:
(1) Building a cuboid bounding box: in the region requiring three-dimensional reconstruction, establish a bounding box with length 40 cm, width 40 cm and height 20 cm.

(2) Voxelization: divide the cuboid bounding box into small grids (voxel units) with a side length of 2 mm, giving 4,000,000 voxels in total.
(3) Obtain one frame of depth image depth_i and the corresponding camera pose T_i.

(4) Take a voxel g from the bounding box, convert it to a point p in the world coordinate system, calculate its position v in the camera coordinate system, and finally back-project it to the corresponding pixel x in the depth image according to the camera intrinsic matrix K.
(5) If the depth value at pixel x is val(x) and the distance from point v to the origin of the camera coordinate system is div(v), the TSDF value of voxel g can be calculated according to the following formulas:

sdf(g) = div(v) - val(x)

tsdf(g) = max(-1, min(1, sdf(g)/u))

where sdf(g) is the signed distance function value and the truncation distance u = 0.01 m.
(6) The weight of voxel g is calculated according to the formula w(g) = cos(θ)/div(v), where θ is the angle between the projection ray and the surface normal vector at point p. Repeat steps (4) to (6) until all voxels are traversed.
(7) Fuse the current frame tsdf value and weight w with the global TSDF and weight W according to the following formulas:

TSDF(g) = (W(g)·TSDF(g) + w(g)·tsdf(g)) / (W(g) + w(g))

W(g) = W(g) + w(g)
(8) Repeat steps (3) to (7) until all depth images are traversed, then output the complete scene point cloud from the final TSDF model using the ray casting method.
The method for generating the grabbing gesture according to the scene complete point cloud comprises the following steps:
(1) Preprocessing the input point cloud: for the complete point cloud C, the specific process is: 1) random sampling; 2) removing invalid points from the point cloud; 3) removing points outside the set space; 4) voxelizing the point cloud; 5) calculating the point cloud surface normals; 6) re-orienting the normals. For the single-view point cloud C', 2048 points are randomly sampled during the complete point cloud processing to obtain a point set P.
(2) Local coordinate system calculation: for each point in the point set P, obtain the normal set N = {n_1, ..., n_n} of all points within a certain radius in the complete point cloud C by the radius query method, where n is the number of points. Taking N as input, the local coordinate system of the point is computed by an eigen-decomposition formula [formula image not reproduced in the source], in which eigenvalue is the eigenvalue calculation function, eigenvector is the eigenvector calculation function, max and min are the maximum and minimum functions respectively, index is the index function, and the abs function ensures that the computed vector has the same direction as the normal.
(3) Grabbing posture region point cloud interception: for each point in the point set P, obtain the set P' of all points within a certain radius in the complete point cloud C by the radius query method. Taking the coordinate system obtained in step (2) and P' as input, the specific process is: 1) rotate the coordinate system 180 degrees around the y-axis to obtain the grabbing coordinate system; 2) transform P' from the world coordinate system to the grabbing coordinate system; 3) intercept the points in P' that satisfy the closed-region condition [condition formula not reproduced in the source] to obtain the local region point set.
(4) Gripper two-finger placement evaluation: 1) retreat the coordinate system 0.05 m along the x-axis to obtain the initial grabbing posture position; 2) arrange 5 groups of gripper two-finger positions along the y-axis; 3) evaluate whether each two-finger position is valid according to two conditions: the point cloud must not collide with the two-finger model, and the closed region must contain point cloud. If none of the 5 groups is valid, grabbing at this point is infeasible.

(5) Grabbing posture position calculation: the aim is to obtain a suitable three-dimensional coordinate for the grabbing posture: 1) adjust in the y direction by selecting the central position among all two-finger positions satisfying the conditions of step (4); 2) adjust in the x direction by advancing along the x-axis in steps of 0.005 m until the point cloud collides with the two-finger model.

(6) Grabbing posture width calculation: obtain the closed-region point cloud by the gripper closed-region point cloud acquisition method, then find the maximum and minimum values of all points along the y-axis; the absolute value of their difference is the width.

(7) Grabbing posture quality evaluation: for each grabbing posture, obtain the closed-region point cloud by the gripper closed-region point cloud acquisition method, then evaluate its quality by the force-closure-based grabbing posture quality evaluation method.
Further, the grabbing posture quality evaluation method based on force closure is as follows:
(1) Counting left and right contact points: first find the maximum value max and the minimum value min on the y-axis, count the points satisfying y > max - 0.003 and the points satisfying y < min + 0.003 respectively, and then take the mean position of each contact-point group as the proxy contact point.

(2) Contact point angle calculation: according to the vector v formed by the two proxy contact points, count in each group the number of contact points whose angle between v and the point's normal is smaller than a fixed value θ. If this number is larger than 2, the contact points are considered to satisfy the force-closure condition when the friction cone angle is θ; finally, solve for the minimum value θ_min satisfying this condition, which is associated with the quality evaluation score.
(3) Score calculation: calculate the final grabbing posture quality evaluation score according to a formula [formula image not reproduced in the source] of θ_left, θ_right and θ_y, which are respectively the minimum left contact point angle, the minimum right contact point angle, and the angle between v and the y-axis of the coordinate system.
The method for intercepting the region point cloud of the grabbing posture prediction data set is:

(1) Coordinate system transformation: because the point cloud and the grabbing postures are both in the world coordinate system and the grabbing postures are diverse, the closed-region point cloud cannot be conveniently intercepted there, so the point cloud is first transformed into each grabbing posture coordinate system;

(2) Collision detection: perform collision detection with the simplified gripper model;

(3) Point cloud interception: intercept the point cloud according to the closed-region condition [condition formula not reproduced in the source] to obtain the original local point cloud;

(4) Resampling: randomly sample each original local point cloud to 256 points.
The deep-learning-based two-stage grabbing planning method comprises the following steps:

(1) Preprocessing: point cloud coordinate transformation, voxelized down-sampling of the point cloud, workspace filtering of the point cloud, and proportional sampling of the point cloud;

(2) Grabbing posture prediction: input the preprocessed single-view point cloud into the improved grabbing posture prediction model to obtain the grabbing posture corresponding to each point in the point cloud;

(3) Post-processing: workspace filtering of grabbing postures, direction filtering of grabbing postures, grabbing posture clustering, and grabbing posture sorting;

(4) Grabbing posture calculation: recalculate the postures of the post-processed high-quality grabbing postures to replace the predicted ones; the calculation method is a simplified version of the grabbing posture generation method described above, with the grabbing posture quality score labeling step removed compared with the original version;

(5) Grabbing posture evaluation: for each grabbing posture, first intercept its local point cloud and input it into the grabbing posture evaluation model to obtain the re-evaluated high-quality grabbing posture set;

(6) Quality sorting and selection: sort by quality score and select the top 5 grabbing postures.
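The six steps can be organized as a simple pipeline skeleton; the predictor, evaluator and helper callables here are illustrative placeholders, not the patent's implementation:

```python
def plan_grasps(cloud, predictor, evaluator, preprocess, postprocess,
                crop_region, top_k=5):
    """Two-stage grasp planning: predict candidate poses from a single-view
    cloud, then re-score each candidate's local region with the evaluator."""
    cloud = preprocess(cloud)                    # (1) transform/voxelize/filter
    candidates = predictor(cloud)                # (2) per-point grasp poses
    candidates = postprocess(candidates)         # (3) filter/cluster/sort
    scored = []
    for pose in candidates:                      # (5) re-evaluate each pose
        local = crop_region(cloud, pose)
        scored.append((evaluator(local), pose))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [pose for _, pose in scored[:top_k]]  # (6) keep the top-K poses
```

Step (4), recalculating the pose geometry of the surviving candidates, would slot in between post-processing and evaluation in a fuller implementation.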
As shown in figs. 7 (a1), (b1), (c1), (a2), (b2) and (c2), the grabbing planning method provided by the invention works effectively in both single-object and multi-object scenarios and is highly robust. In addition, according to the distribution of objects in the scene, the method effectively filters out regions crowded with objects and preferentially searches for isolated objects that favor grabbing, thereby reserving operating space for the mechanical arm's end effector, consistent with human grabbing habits.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A dual-stage mechanical arm grabbing planning method based on a multiplexing structure is characterized by comprising the following steps:
inputting the single view point cloud of a scene to be grabbed into a grabbing posture prediction model to obtain grabbing postures corresponding to each point in the point cloud to form a grabbing posture set, inputting the grabbing postures in the grabbing posture set into a grabbing posture evaluation model, sequencing the grabbing postures in the obtained grabbing posture set according to quality scores, and selecting K grabbing postures in the front of the ranking to guide a mechanical arm to grab operation;
the grabbing posture prediction model is obtained by training in the following mode:
acquiring multi-view data by changing a grabbing scene, performing three-dimensional reconstruction on the multi-view data to acquire scene complete point cloud, generating grabbing postures by using the scene complete point cloud to form a grabbing posture prediction data set, and training a grabbing posture prediction network to be convergent by using the grabbing posture prediction data set to obtain a grabbing posture prediction model;
the grabbing posture evaluation model is obtained by training in the following mode:
taking the point cloud which does not collide with the gripper at the tail end of the robot and is positioned in the gripping range of the gripper at the tail end of the robot in the gripping posture prediction data set as a gripping posture evaluation data set, and training a gripping posture evaluation network to be convergent by using the gripping posture evaluation data set to obtain a gripping posture evaluation model;
the grabbing posture prediction network and the grabbing posture evaluation network both adopt a multiplexing structure, and the multiplexing structure indicates that except the first layer network, the input of each subsequent layer network is connected with the input data of the first layer network or the output of the previous layer network.
2. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure of claim 1, wherein the inputs of each subsequent layer except the first layer of network in the grabbing attitude prediction network are connected with the input data of the first layer of network.
3. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure of claim 1 or 2, wherein the inputs of each subsequent layer except the first layer of network in the grabbing attitude assessment network are connected with the outputs of the previous layer of network.
4. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure of claim 1 or 2, wherein the multi-view data comprises camera pose and depth images, and is acquired by:
aiming at a certain grabbing posture, executing a grabbing task, and changing a grabbing scene by reducing the number of objects in the scene;
aiming at a certain grabbing posture, horizontally pushing a certain distance from the periphery to the center direction, and changing the position of an object so as to change a grabbing scene;
aiming at a certain grabbing posture, carrying out grabbing action and then placing the grabbing posture above the center of the scene to enable the grabbing posture to fall freely, so that the grabbing scene is changed;
and randomly selecting one of the modes for changing the grabbing scene, continuing to acquire multi-view data if a certain grabbing gesture is successfully executed, randomly selecting from the remaining modes for changing the grabbing scene if the certain grabbing gesture cannot be successfully planned, and ending the data acquisition of the current round if no grabbing gesture is executed finally.
5. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure as claimed in claim 1 or 2, wherein the specific way of generating the grabbing gesture by using the scene complete point cloud comprises:
randomly sampling n points for any single-view point cloud C' in the scene complete point cloud C to obtain a point set P; aiming at each point in the point set P, acquiring a normal set of all points in a certain radius in the scene complete point cloud C through radius query, thereby establishing a local coordinate system;
aiming at each point in the point set P, acquiring a set P ' of all points in a certain radius in the scene complete point cloud C through radius query, rotating a local coordinate system around a y axis to acquire a grabbing coordinate system, converting each point in the set P ' from a world coordinate system to the grabbing coordinate system, and selecting points in the set P ' which meet the height range of a closed area of a gripper at the tail end of a robot to form a local area point set;
the method comprises the steps that a grabbing coordinate system is retreated for a certain distance along the direction of an x axis away from a grabbed object to obtain an initial grabbing posture position, a plurality of groups of gripper finger positions are arranged along the direction of a y axis, and when point cloud concentrated in a local area point collides with a gripper finger model and a closed area of two parallel fingers of a gripper contains the point cloud, the group of gripper finger positions are used as grabbing positions in the direction of the y axis in the grabbing posture of the point cloud;
selecting a central position from all the grabbing positions in the y-axis direction in the grabbing posture of the point cloud, then advancing to the x-axis direction in a fixed step length in a grabbing coordinate system until the point cloud collides with the two-finger gripper model, and taking the advancing position in the x-axis direction at the moment as the grabbing position in the x-axis direction in the grabbing posture of the point cloud.
6. The method for planning the grabbing of a dual-stage manipulator based on a multiplexing structure of claim 5, wherein the training of the grabbing pose prediction model further comprises:
performing quality scoring on the grabbing posture of each point cloud in the local area point set, wherein the quality scoring is calculated in the following mode:
finding the maximum value max and the minimum value min on the y-axis of the local-region point cloud relative to the grabbing posture, counting the points satisfying y > max - thr and the points satisfying y < min + thr respectively as two groups of contact points, and calculating the mean position of each group of contact points as a proxy contact point, thr representing a distance threshold;
according to the vector v formed by the two proxy contact points, counting in each group the number of contact points whose angle θ between v and the point's normal is smaller than a preset angle value; if the number of contact points is larger than a set value, the contact points are considered to satisfy the force-closure condition when the friction cone angle is θ, and obtaining the minimum left contact point angle θ_left and the minimum right contact point angle θ_right satisfying the force-closure condition;
calculating the final grabbing posture quality evaluation score according to the following formula:
[formula image FDA0003623756540000031: the final score combines score_left, score_right and score_y]
wherein score_left is the score of the left contact point, score_right is the score of the right contact point, and score_y is the score of the line connecting the left and right contact points.
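The contact-point extraction and force-closure scoring of claim 6 might look like the sketch below. The candidate friction-cone angles, the minimum support count, and in particular the way θ_left, θ_right and score_y are mapped into a single score are assumptions, since the final formula appears only as an image in the source.

```python
import numpy as np

CONE_ANGLES = np.radians([10, 20, 30])   # assumed candidate friction-cone angles

def side_min_angle(contact_normals, v, min_count=3):
    """Smallest candidate cone angle supported by enough contact normals."""
    cosang = np.clip(contact_normals @ v, -1.0, 1.0)
    theta = np.arccos(np.abs(cosang))    # fold antiparallel normals together
    for a in CONE_ANGLES:
        if int((theta < a).sum()) >= min_count:
            return a
    return None                          # force closure not achievable on this side

def grasp_quality(points, normals, thr=0.005):
    """Score one grasp from local-region points/normals in the grasp frame."""
    y = points[:, 1]
    hi_pts, lo_pts = y > y.max() - thr, y < y.min() + thr   # the two contact groups
    proxy_r = points[hi_pts].mean(axis=0)                   # proxy contact points
    proxy_l = points[lo_pts].mean(axis=0)
    v = proxy_r - proxy_l
    v = v / np.linalg.norm(v)
    th_r = side_min_angle(normals[hi_pts], v)
    th_l = side_min_angle(normals[lo_pts], v)
    if th_r is None or th_l is None:
        return 0.0
    # Assumed combination: a smaller required cone angle gives a higher per-side
    # score, and score_y rewards the contact line staying close to the y axis.
    score_r = 1.0 - th_r / CONE_ANGLES[-1]
    score_l = 1.0 - th_l / CONE_ANGLES[-1]
    score_y = abs(v[1])
    return (score_l + score_r + score_y) / 3.0
```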
7. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure according to claim 6, wherein the training of the grabbing posture prediction model further comprises:
training the grabbing posture prediction network with the grabbing posture prediction data set: the error between the output grabbing posture position offset and the real offset is taken as the offset loss function; the error between the predicted unit vector of the grabbing posture in the x direction and the real x-direction vector as the x-direction loss function; the error between the predicted unit vector of the grabbing posture in the y direction and the real y-direction vector as the y-direction loss function; the error between the predicted grabbing posture width and the real grabbing posture width as the mean square error loss function; and the error between the predicted grabbing posture quality evaluation score and the real score as the quality evaluation loss function; the grabbing posture prediction network is trained to convergence with the goal of minimizing the offset loss function, the x-direction loss function, the y-direction loss function, the mean square error loss function and the quality evaluation loss function.
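The five-term objective above can be combined as a weighted sum. The claim fixes only the width term as mean squared error; using MSE for every term, and the dict keys and unit weights, are assumptions of this sketch.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two array-likes."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

# Assumed names for the five predicted quantities named in claim 7.
LOSS_KEYS = ("offset", "x_axis", "y_axis", "width", "score")

def grasp_prediction_loss(pred, target, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five loss terms: offset, x/y direction, width, quality."""
    return sum(w * mse(pred[k], target[k]) for w, k in zip(weights, LOSS_KEYS))
```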
8. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure according to claim 7, wherein the quality evaluation loss function is:
[formula image FDA0003623756540000041: the quality evaluation loss function]
wherein i, j and k respectively index the point sets n_1, n_2 and n_3 in which the predicted grabbing posture quality evaluation score is larger than a set large score threshold, lies between the set small score threshold and the set large score threshold, and is smaller than the set small score threshold; N represents the total number of point clouds in the local area point set; s_i, s_j and s_k respectively represent the predicted grabbing posture quality evaluation scores corresponding to i, j and k; and ŝ_i, ŝ_j and ŝ_k respectively represent the real scores corresponding to i, j and k, the real scores being the final grabbing posture quality evaluation scores calculated according to the formula in claim 6.
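Since the claimed formula itself is only an image, the sketch below shows one plausible banded form: points are partitioned by predicted score into the three sets, each band's squared error is weighted, and the total is averaged over all N points. The thresholds, band weights, and squared-error form are assumptions.

```python
import numpy as np

def quality_eval_loss(s_pred, s_true, lo_thr=0.3, hi_thr=0.7, band_w=(2.0, 1.0, 2.0)):
    """Banded squared-error loss over predicted vs. real quality scores."""
    s_pred = np.asarray(s_pred, dtype=float)
    s_true = np.asarray(s_true, dtype=float)
    n1 = s_pred > hi_thr            # predicted score above the large threshold
    n3 = s_pred < lo_thr            # predicted score below the small threshold
    n2 = ~(n1 | n3)                 # predicted score between the thresholds
    err = (s_pred - s_true) ** 2
    total = band_w[0] * err[n1].sum() + band_w[1] * err[n2].sum() + band_w[2] * err[n3].sum()
    return float(total / len(s_pred))   # average over all N points
```

Weighting the extreme bands more strongly is a common choice when confident mistakes should be penalized hardest; the patent may use a different per-band form.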
9. The method for planning grabbing of a dual-stage mechanical arm based on a multiplexing structure as claimed in claim 1 or 2, wherein the point cloud coordinates in the clamping range of the robot end gripper satisfy the following conditions:
[formula image FDA0003623756540000043: bounds on the point cloud x, y and z coordinates expressed in terms of outer_diameter, hand_depth and hand_height]
wherein outer_diameter is the outer width of the gripper, hand_depth is the width of the closed area of the gripper, and hand_height is the height of the closed area of the gripper.
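Because the exact inequalities appear only as an image in the source, the following is one plausible reading of the clamping-range condition: axis-aligned bounds in the grasp frame built from the three named gripper dimensions. The specific bounds and default values are assumptions.

```python
def within_clamping_range(p, outer_diameter=0.10, hand_depth=0.06, hand_height=0.02):
    """Assumed claim-9 test: is point p = (x, y, z), in the grasp frame,
    inside the gripper's clamping volume? Bounds are illustrative."""
    x, y, z = p
    return (0.0 <= x <= hand_depth              # within the closure depth
            and abs(y) <= outer_diameter / 2    # between the outer finger faces
            and abs(z) <= hand_height / 2)      # within the closure height
```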
10. A dual-stage mechanical arm grabbing planning system based on a multiplexing structure is characterized by comprising:
the grabbing attitude prediction model training module is used for changing a grabbing scene to acquire multi-view data, performing three-dimensional reconstruction on the multi-view data to acquire scene complete point cloud, generating grabbing attitudes by using the scene complete point cloud to form a grabbing attitude prediction data set, and training a grabbing attitude prediction network to converge by using the grabbing attitude prediction data set to obtain a grabbing attitude prediction model;
the grabbing posture evaluation model training module is used for taking point clouds which do not collide with the robot tail end clamp holder in the grabbing posture prediction data set and are located in the clamping range of the robot tail end clamp holder as a grabbing posture evaluation data set, and training a grabbing posture evaluation network to be convergent by using the grabbing posture evaluation data set to obtain a grabbing posture evaluation model;
the grabbing planning module is used for inputting the single-view point cloud of a scene to be grabbed into the grabbing posture prediction model to obtain the grabbing posture corresponding to each point in the point cloud, forming a grabbing posture set, inputting the grabbing postures in the grabbing posture set into the grabbing posture evaluation model, sorting the resulting grabbing postures according to their quality scores, and selecting the top-ranked K grabbing postures for guiding the mechanical arm in the grabbing operation;
the grabbing posture prediction network and the grabbing posture evaluation network both adopt a multiplexing structure, the multiplexing structure indicating that, except for the first layer network, the input of each subsequent layer network is connected with the input data of the first layer network or the output of the previous layer network.
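One plausible reading of this multiplexing structure is a dense-style wiring in which every layer after the first sees the raw first-layer input concatenated with the previous layer's output. That reading, and all layer sizes below, are assumptions; the claim does not fix the join operation.

```python
import numpy as np

class MultiplexMLP:
    """Sketch of a multiplexing MLP: layer 0 sees the raw input; every later
    layer sees [raw input ++ previous layer output] (assumed interpretation)."""

    def __init__(self, in_dim, hidden, n_layers, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = []
        d = in_dim                        # first layer: raw input only
        for _ in range(n_layers):
            self.weights.append(rng.normal(0.0, 0.1, size=(d, hidden)))
            d = in_dim + hidden           # later layers: raw input ++ previous output

    def forward(self, x):
        h = np.maximum(x @ self.weights[0], 0.0)          # ReLU activations
        for w in self.weights[1:]:
            h = np.maximum(np.concatenate([x, h], axis=-1) @ w, 0.0)
        return h
```

Feeding the raw input to every layer keeps low-level point features available deep in the network, which is one motivation for dense-style connections.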
CN202210489365.XA 2022-04-29 2022-04-29 Double-stage mechanical arm grabbing planning method and system based on multiplexing structure Active CN114800511B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210489365.XA CN114800511B (en) 2022-04-29 2022-04-29 Double-stage mechanical arm grabbing planning method and system based on multiplexing structure

Publications (2)

Publication Number Publication Date
CN114800511A true CN114800511A (en) 2022-07-29
CN114800511B CN114800511B (en) 2023-11-14

Family

ID=82511859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210489365.XA Active CN114800511B (en) 2022-04-29 2022-04-29 Double-stage mechanical arm grabbing planning method and system based on multiplexing structure

Country Status (1)

Country Link
CN (1) CN114800511B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102017108727A1 (en) * 2017-04-24 2018-10-25 Roboception Gmbh Method for creating a database with gripper poses, method for controlling a robot, computer-readable storage medium and handling system
US20200086483A1 (en) * 2018-09-15 2020-03-19 X Development Llc Action prediction networks for robotic grasping
CN112819135A (en) * 2020-12-21 2021-05-18 中国矿业大学 Sorting method for guiding mechanical arm to grab materials in different poses based on ConvPoint model
CN113192128A (en) * 2021-05-21 2021-07-30 华中科技大学 Mechanical arm grabbing planning method and system combined with self-supervision learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张森彦; 田国会; 张营; 刘小龙: "A prior-knowledge-guided autonomous grasping strategy based on a two-stage progressive network", Robot (机器人), no. 05 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116533236A (en) * 2023-05-09 2023-08-04 北京航空航天大学 Service robot operation evaluation strategy based on discrete working space
CN116533236B (en) * 2023-05-09 2024-04-12 北京航空航天大学 Service robot operation evaluation strategy based on discrete working space

Similar Documents

Publication Publication Date Title
CN112297013B (en) Robot intelligent grabbing method based on digital twin and deep neural network
CN111251295B (en) Visual mechanical arm grabbing method and device applied to parameterized parts
CN110298886B (en) Dexterous hand grabbing planning method based on four-stage convolutional neural network
CN111243017B (en) Intelligent robot grabbing method based on 3D vision
CN108972494A (en) A kind of Apery manipulator crawl control system and its data processing method
Lundell et al. Ddgc: Generative deep dexterous grasping in clutter
CN112509063A (en) Mechanical arm grabbing system and method based on edge feature matching
CN114571153B (en) Weld joint identification and robot weld joint tracking method based on 3D point cloud
CN112669385B (en) Industrial robot part identification and pose estimation method based on three-dimensional point cloud features
CN110378325B (en) Target pose identification method in robot grabbing process
CN113192128A (en) Mechanical arm grabbing planning method and system combined with self-supervision learning
JP2022187984A (en) Grasping device using modularized neural network
CN114800511B (en) Double-stage mechanical arm grabbing planning method and system based on multiplexing structure
JP2022187983A (en) Network modularization to learn high dimensional robot tasks
CN116673963A (en) Double mechanical arm cooperation flexible assembly system and method for unordered breaker parts
CN115861780B (en) Robot arm detection grabbing method based on YOLO-GGCNN
CN113436293B (en) Intelligent captured image generation method based on condition generation type countermeasure network
CN113822933B (en) ResNeXt-based intelligent robot grabbing method
CN115284279A (en) Mechanical arm grabbing method and device based on aliasing workpiece and readable medium
CN115194774A (en) Binocular vision-based control method for double-mechanical-arm gripping system
Cao et al. Grasp pose detection based on shape simplification
Xiao et al. Dexterous robotic hand grasp modeling using piecewise linear dynamic model
Tao et al. An improved RRT algorithm for the motion planning of robot manipulator picking up scattered piston
Xu et al. Vision‐Based Intelligent Perceiving and Planning System of a 7‐DoF Collaborative Robot
Fang et al. A pick-and-throw method for enhancing robotic sorting ability via deep reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant