CN113580149A - Unordered aliasing workpiece grabbing method and system based on key point prediction network - Google Patents

Unordered aliasing workpiece grabbing method and system based on key point prediction network

Info

Publication number
CN113580149A
CN113580149A · CN202111156483.0A
Authority
CN
China
Prior art keywords
coordinate system
workpiece
pixel
key point
robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111156483.0A
Other languages
Chinese (zh)
Other versions
CN113580149B (en)
Inventor
王耀南
伍俊岚
朱青
刘学兵
周鸿敏
毛建旭
周显恩
吴成中
冯明涛
曾琼
童琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202111156483.0A priority Critical patent/CN113580149B/en
Publication of CN113580149A publication Critical patent/CN113580149A/en
Application granted granted Critical
Publication of CN113580149B publication Critical patent/CN113580149B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1602Programme controls characterised by the control system, structure, architecture
    • B25J9/161Hardware, e.g. neural networks, fuzzy logic, interfaces, processor
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02Sensing devices
    • B25J19/04Viewing devices
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1628Programme controls characterised by the control loop
    • B25J9/163Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1661Programme controls characterised by programming, planning systems for manipulators characterised by task planning, object-oriented languages
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1679Programme controls characterised by the tasks executed

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a disordered aliasing workpiece grabbing method and system based on a key point prediction network. A real-time RGB image is input, and a preset key point prediction network model segments each workpiece and predicts its key point positions, giving the pixel coordinates of each key point in the image. The conversion relation between the workpiece model coordinate system and the camera coordinate system is then solved by combining the 3D coordinates of each key point in the workpiece model coordinate system with the camera intrinsic parameters, the conversion relation between the camera coordinate system and the robot coordinate system is obtained through hand-eye calibration, and the 6DoF position and pose information of the workpiece in the robot coordinate system is solved from these. Even when a key point is occluded, the method can predict, through voting, the pixel position most likely to represent that key point, which solves the problem of calculating the pose of occluded key points when workpieces overlap, enables the robot to pick workpieces in more complex scenes, and effectively improves the picking success rate.

Description

Unordered aliasing workpiece grabbing method and system based on key point prediction network
Technical Field
The invention belongs to the technical field of intelligent robots, and relates to a method and a system for capturing disordered aliasing workpieces based on a key point prediction network.
Background
Vision technology occupies an important position in industrial robot applications. Against the background of large-scale deployment of industrial robots, vision enhancement enables intelligent industrial robots to adapt to more complex scenes and solve more complex problems, and has a huge market prospect. Industrial sorting systems are an important part of industrial robot technology, and today, on various industrial production lines, robot technology is gradually replacing traditional manual operation.
Most existing industrial automatic sorting systems complete sorting tasks by programming industrial robots in advance. This mode supports long-duration repeated operation, but the placing positions of the sorted objects must be strictly fixed, so the robot cannot cope with flexibly changing scenes. At present, industrial sorting involves many scenes in which multiple kinds of parts are randomly placed, and the classification and placement of parts still mostly rely on manual work, so fully automatic production cannot be realized. Facing this market demand, research on autonomous robot sorting in complex scenes is of great significance. Pose estimation is very effective for picking target parts in machine vision industrial sorting scenes where stacked parts are placed in disorder, but many problems remain to be overcome: when the part surfaces have no texture, the parts are stacked and occlude each other, or the lighting environment is complex, the workpiece pose cannot be calculated accurately, and the workpiece therefore cannot be picked up accurately.
According to existing research, workpiece picking can generally be classified into 2D planar picking and 6DoF (six degrees of freedom) picking according to the robot working space. The former locates the target with an object detection method at a known operating plane height, while the latter must rely on the 6D pose of the target to complete the pick-up. Traditional target pose estimation methods complete the estimation task by matching features between the target image and a key point template, or by matching global features of the target image. However, these methods are sensitive to surface texture and illumination, cannot handle non-textured workpieces, and often cannot predict poses in occluded scenes. With the development of deep learning, many studies have combined the pose estimation problem with convolutional neural networks (CNNs), converting it into an end-to-end combination of object detection and pose regression that takes RGB or RGB-D images as input. Yu Xiang et al., in "PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes" (2017), proposed the pose estimation network PoseCNN, which takes RGB images as input, uses a CNN-based backbone for feature extraction, and finally uses three network branches for object classification, 3D localization and rotation regression. Later work sought to better exploit RGB-D image information as depth cameras became popular: in "DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" by Chen Wang et al. (2019), two heterogeneous backbones extract color and depth features respectively, which are then fused for pose regression.
The end-to-end approaches described above are data-driven in nature and require a large amount of real data to train. However, 6DoF pose labeling is a very complex and time-consuming task that requires an accurate three-dimensional model to compute the training loss of the network. Therefore, most of these methods are trained and tested on public data sets and are not easy to deploy in an actual robot picking system, which leads to a low picking success rate because the workpiece pose is difficult to calculate when the key points of the workpiece are occluded.
Disclosure of Invention
Aiming at the technical problems, the invention provides a method and a system for capturing disordered aliasing workpieces based on a key point prediction network, which can effectively improve the success rate of picking.
The technical scheme adopted by the invention for solving the technical problems is as follows:
the unordered aliasing workpiece grabbing method based on the key point prediction network comprises the following steps:
step S100: calibrating and determining a camera internal reference matrix by Zhang's calibration method according to a preset first calibration picture;
step S200: determining a conversion matrix from a camera coordinate system to a robot coordinate system by a nine-point calibration method according to a preset second calibration picture;
step S500: acquiring a real-time image, inputting the real-time image into a preset pixel-level key point prediction network model, and regressing to obtain pixel coordinates of preset key points in the real-time image;
step S600: acquiring 3D coordinates of preset key points in a workpiece model coordinate system, and acquiring a conversion matrix between the workpiece model coordinate system and a camera coordinate system according to a camera internal reference matrix, the 3D coordinates of the preset key points in the workpiece model coordinate system and pixel coordinates of the preset key points in a real-time image;
step S700: acquiring coordinates of a picking point of a workpiece to be picked under a workpiece model coordinate system and initial direction information of robot grabbing equipment, acquiring workpiece initial pose information according to the coordinates of the picking point of the workpiece to be picked under the workpiece model coordinate system and the initial direction information of the robot grabbing equipment, and acquiring 6DoF position and posture information of the workpiece under the robot coordinate system according to the workpiece initial pose information, a conversion matrix from a camera coordinate system to the robot coordinate system and a conversion matrix between the workpiece model coordinate system and the camera coordinate system;
step S800: and controlling the robot grabbing equipment to grab the target workpiece according to the 6DoF position and posture information of the workpiece in the robot coordinate system.
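For orientation, the following is a minimal sketch of how steps S100 to S800 could be chained in code. It assumes the camera intrinsics K and distortion coefficients dist (step S100) and the hand-eye transform T_c2r (step S200) have already been computed; camera.capture(), predict_keypoints() and robot.move_to() are hypothetical interfaces standing in for the image acquisition module, the preset key point prediction network model and the pick-up module, and are not part of this disclosure.

import numpy as np
import cv2

def picking_cycle(K, dist, T_c2r, model_pts_3d, pick_point_model,
                  camera, predict_keypoints, robot):
    # One feeding cycle for a single detected workpiece.
    image = camera.capture()                          # step S500: real-time RGB image
    kps_2d = predict_keypoints(image)                 # step S500: (N, 2) pixel keypoints

    # step S600: PnP gives the workpiece-model -> camera transform
    ok, rvec, tvec = cv2.solvePnP(model_pts_3d.astype(np.float64),
                                  kps_2d.astype(np.float64), K, dist)
    if not ok:
        return False
    R, _ = cv2.Rodrigues(rvec)
    T_o2c = np.eye(4)
    T_o2c[:3, :3], T_o2c[:3, 3] = R, tvec.ravel()

    # step S700: compose transforms to express the pick point in the robot frame
    p_model = np.append(np.asarray(pick_point_model, dtype=float), 1.0)
    p_robot = T_c2r @ T_o2c @ p_model

    # step S800: command the robot (orientation handling omitted in this sketch)
    robot.move_to(p_robot[:3])
    return True

In this sketch only the 3-DoF position of the pick point is commanded; the full 6DoF composition with the initial direction information of step S700 is described further below.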
Preferably, step S100 includes:
step S110: shooting images of preset first calibration pictures at different angles by using a camera;
step S120: extracting corner information from each image of a preset first calibration picture with different angles;
step S130: and calibrating by using Zhang's calibration method according to the corner information, and calculating the camera internal reference data to obtain the camera internal reference matrix.
Preferably, the preset second calibration picture includes nine dots, and the step S200 includes:
step S210: shooting an image of a preset second calibration picture placed in a random posture;
step S220: calculating the circle center pixel position of each dot in the image;
step S230: moving the suction cup at the end of the robot arm to each dot, and recording the corresponding 3D coordinate in the robot coordinate system;
step S240: repeating the step S210 to the step S230 for a first preset number of times to obtain a group of 2D-3D data of each dot;
step S250: and calculating a conversion matrix from the camera coordinate system to the robot coordinate system according to the 2D-3D data of each dot.
Preferably, step S250 includes:
step S251: and calculating to obtain a rotation matrix between the robot coordinate system and the camera coordinate system and a translation matrix between the robot coordinate system and the camera coordinate system according to the 2D-3D data of each dot, wherein the calculation specifically comprises the following steps:
s [u, v, 1]^T = K [R_{r2c} | t_{r2c}] [X, Y, Z, 1]^T

wherein s is a scale factor, R_{r2c} represents the rotation matrix from the robot coordinate system to the camera coordinate system, t_{r2c} represents the translation matrix from the robot coordinate system to the camera coordinate system, K is the camera internal reference matrix, (u, v) are the pixel coordinates corresponding to each dot, and (X, Y, Z) are the 3D coordinates of each dot under the robot coordinate system;
step S252: calculating a transformation matrix from the robot coordinate system to the camera coordinate system according to a rotation matrix from the robot coordinate system to the camera coordinate system and a translation matrix from the robot coordinate system to the camera coordinate system, and specifically:
T_{r2c} = [R_{r2c}, t_{r2c}; 0, 1]

wherein T_{r2c} is the transformation matrix from the robot coordinate system to the camera coordinate system;
step S253: the method can obtain a conversion matrix from the camera coordinate system to the robot coordinate system according to the conversion matrix from the robot coordinate system to the camera coordinate system, and specifically comprises the following steps:
T_{c2r} = T_{r2c}^{-1}

wherein T_{c2r} is the transformation matrix from the camera coordinate system to the robot coordinate system.
Preferably, after step S200 and before step S500, the method further includes:
step S300: building a pixel-level key point prediction network, acquiring a training data set, marking the training data set to obtain a marked data set, and training the pixel-level key point prediction network according to the marked data set to obtain a pixel-level key point prediction network;
step S400: calculating the loss value of the pixel-level key point prediction network according to a preset loss function, performing back propagation to update the network parameters of the pixel-level key point prediction network according to the loss value, and obtaining the updated pixel-level key point prediction network as a preset pixel-level key point prediction network model.
Preferably, the preset pixel-level keypoint prediction network model includes a convolutional neural network, a region candidate network, and four branches, where the four branches are a classification branch, a bounding box acquisition branch, a mask acquisition branch, and a pixel-level keypoint prediction branch, and step S500 includes:
step S510: inputting the image in the marked data set into a convolutional neural network to extract the characteristic information of the image, and transmitting the characteristic information into a regional candidate network;
step S520: the regional candidate network acquires the detection frame of each target workpiece according to the characteristic information and inputs the detection frame to the four branches;
step S530: the classification branch is used for classifying the target workpiece and the background according to the received detection frame; the boundary frame obtaining branch is used for obtaining the coordinates of the preset position point of the boundary frame of each target workpiece according to the received detection frame; the mask obtaining branch is used for obtaining a pixel area where each target workpiece is located according to the received detection frame; the pixel-level key point prediction branch is used for obtaining a unit vector diagram pointing to a preset number of key points according to the received detection frame;
step S540: normalizing the offset of each pixel position and the position of the 2D key point into a unit vector according to the pixel position of each pixel point of the pixel region where each target workpiece is located and the position of the 2D key point;
step S550: acquiring all pixel-level vectors of a single target workpiece, randomly selecting two pixel points, and taking the intersection point of the pixel vectors corresponding to the two pixel points as an initial hypothesis of a 2D key point;
step S560: and repeating the step S550 for a second preset number of times to obtain a group of hypotheses, using a clustering algorithm K-means to obtain a point with the highest score as a pixel point of the key point, and obtaining pixel coordinates of the pixel point as the key point.
Preferably, the loss function preset in step S400 is specifically:
L = λ_1·L_cls + λ_2·L_box + λ_3·L_mask + λ_4·L_kp

wherein λ_1, λ_2, λ_3 and λ_4 are the weighting factors of the classification branch, the bounding box acquisition branch, the mask acquisition branch and the pixel-level key point prediction branch respectively, L_cls is the classification loss function, L_box is the bounding box detection loss function, L_mask is the mask detection loss function, and L_kp is the pixel-level key point prediction branch loss function.
Preferably, step S600 includes:
step S610: the method comprises the steps of obtaining 3D coordinates of preset key points in a workpiece model coordinate system, and obtaining a rotation matrix and a translation matrix between the workpiece model coordinate system and a camera coordinate system according to a camera internal reference matrix, the 3D coordinates of the preset key points in the workpiece model coordinate system and pixel coordinates of the preset key points in a real-time image, wherein the method specifically comprises the following steps:
s [u, v, 1]^T = K [R_{o2c} | t_{o2c}] [X, Y, Z, 1]^T

wherein s is a scale factor, R_{o2c} represents the rotation matrix from the workpiece model coordinate system to the camera coordinate system, t_{o2c} represents the translation matrix from the workpiece model coordinate system to the camera coordinate system, K is the camera internal reference matrix, (u, v) are the pixel coordinates corresponding to the preset key points, and (X, Y, Z) are the 3D coordinates of the preset key points in the workpiece model coordinate system;
step S620: obtaining a conversion matrix between the workpiece model coordinate system and the camera coordinate system according to a rotation matrix and a translation matrix between the workpiece model coordinate system and the camera coordinate system, which comprises the following specific steps:
T_{o2c} = [R_{o2c}, t_{o2c}; 0, 1]

wherein T_{o2c} represents the transformation matrix between the workpiece model coordinate system and the camera coordinate system.
Preferably, in step S700, the 6DoF position and posture information of the workpiece in the robot coordinate system is obtained according to the initial pose information of the workpiece, the transformation matrix from the camera coordinate system to the robot coordinate system, and the transformation matrix between the workpiece model coordinate system and the camera coordinate system, and specifically:
P = T_{c2r} · T_{o2c} · P_0

wherein P represents the 6DoF position and posture information of the workpiece in the robot coordinate system, T_{c2r} represents the transformation matrix from the camera coordinate system to the robot coordinate system, T_{o2c} represents the transformation matrix between the workpiece model coordinate system and the camera coordinate system, and P_0 represents the initial pose information of the workpiece.
The unordered aliasing workpiece grabbing system based on the key point prediction network comprises an image acquisition module, a pose calculation module, a communication module and a pickup module, wherein the image acquisition module is connected with the pose calculation module, the pose calculation module is connected with the pickup module through the communication module,
the image acquisition module is used for acquiring a real-time image and sending the real-time image to the pose calculation module;
the pose calculation module is used for executing the method to obtain the 6DoF position and the posture information of the workpiece in the robot coordinate system and sending the position and the posture information to the pickup module through the communication device;
and the picking module picks the target workpiece according to the received 6DoF position and posture information of the workpiece under the robot coordinate system.
According to the disordered aliasing workpiece grabbing method and system based on the key point prediction network, a real-time RGB image is input, and the preset key point prediction network model segments each workpiece and predicts its key point positions, giving the pixel coordinates of the key points in the image; together with the 3D coordinates of the key points in the workpiece model coordinate system and the camera internal reference matrix, the conversion relation between the workpiece model coordinate system and the camera coordinate system is calculated, and the 6DoF position and posture information of the workpiece in the robot coordinate system is then solved. Through the preset pixel-level key point prediction network model, the pixel position most likely to represent a key point can be predicted by voting even when that key point is occluded, which solves the pose calculation problem for occluded key points when workpieces overlap. The robot can therefore pick workpieces in more complex scenes, overcoming the limitation of the traditional feeding method of fixing the workpiece position and teaching the robot to grasp, which suits only a single scene. Feeding and processing on an industrial production line become more flexible, the picking success rate is effectively improved, and the system can be popularized to feeding scenes of different parts, so it has a strong market prospect.
Drawings
FIG. 1 is a flowchart of a method for capturing an unordered aliasing workpiece based on a keypoint prediction network according to an embodiment of the invention;
FIG. 2 is a flow chart illustrating the overall architecture of the system according to an embodiment of the present invention;
FIG. 3 is a system hardware platform diagram according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for capturing an unordered aliased workpiece based on a keypoint prediction network according to another embodiment of the present invention;
FIG. 5 is a schematic diagram of a keypoint prediction network model according to an embodiment of the present invention;
fig. 6 is a schematic diagram of coordinate transformation of the robot arm, the industrial camera, and the target workpiece according to an embodiment of the present invention, where (a) is a schematic diagram of coordinates of the robot arm, (b) is a schematic diagram of coordinates of the industrial camera, (c) is a schematic diagram of coordinates of the target workpiece, and (d) is a schematic diagram of coordinate transformation.
Detailed Description
In order to make the technical solutions of the present invention better understood, the present invention is further described in detail below with reference to the accompanying drawings.
In one embodiment, as shown in fig. 1, the method for capturing an unordered aliasing artifact based on a keypoint prediction network comprises the following steps:
step S100: calibrating and determining a camera internal reference matrix by Zhang's calibration method according to a preset first calibration picture;
step S200: determining a conversion matrix from a camera coordinate system to a robot coordinate system by a nine-point calibration method according to a preset second calibration picture;
step S500: acquiring a real-time image, inputting the real-time image into a preset pixel-level key point prediction network model, and regressing to obtain pixel coordinates of preset key points in the real-time image;
step S600: acquiring 3D coordinates of preset key points in a workpiece model coordinate system, and acquiring a conversion matrix between the workpiece model coordinate system and a camera coordinate system according to a camera internal reference matrix, the 3D coordinates of the preset key points in the workpiece model coordinate system and pixel coordinates of the preset key points in a real-time image;
step S700: acquiring coordinates of a picking point of a workpiece to be picked under a workpiece model coordinate system and initial direction information of robot grabbing equipment, acquiring workpiece initial pose information according to the coordinates of the picking point of the workpiece to be picked under the workpiece model coordinate system and the initial direction information of the robot grabbing equipment, and acquiring 6DoF position and posture information of the workpiece under the robot coordinate system according to the workpiece initial pose information, a conversion matrix from a camera coordinate system to the robot coordinate system and a conversion matrix between the workpiece model coordinate system and the camera coordinate system;
step S800: and controlling the robot grabbing equipment to grab the target workpiece according to the 6DoF position and posture information of the workpiece in the robot coordinate system.
Specifically, as shown in fig. 2, the real-time image is acquired by an image acquisition module consisting of a Baumer VCXG-13C monocular RGB camera and bar light sources. The camera has an image resolution of 1280 × 720 and is fixed directly above the feeding table to acquire images of the workpieces randomly placed on it; the light sources are two 32 cm bar lights installed on either side of the feeding table, whose angle and brightness can be adjusted for a better imaging effect. Steps S100 to S800 are executed by a pose calculation module realized on an industrial PC carrying a server and divided into an offline part and an online part. The offline part acquires a large number of real scene pictures through the camera to build a training data set; this self-made data set is used to train the pixel-level key point prediction network model, and the trained model is stored as the preset pixel-level key point prediction network model. The online part acquires a real-time image through the API of the Baumer monocular camera, inputs it into the preset pixel-level key point prediction network model, regresses the pixel coordinates of the corresponding key points in the image, and finally solves the workpiece pose with the PnP (Perspective-n-Point) algorithm to obtain the position and attitude information of the workpiece in the robot coordinate system. The position and posture information of the workpiece in the robot coordinate system is sent to the robot by a communication module, which completes the data transmission between the industrial PC and the robot: the PC and the robot communicate over gigabit Ethernet, and the PC transmits the calculated 6DoF pose of the target workpiece in the robot coordinate system to the robot via TCP/IP. Grabbing of the target workpiece is completed by a picking module comprising a robot with an end picker; the robot is a Sawyer collaborative robot from Rethink Robotics with a single 7-degree-of-freedom arm, suitable for both wide and narrow spaces. According to the shape characteristics of the grasped object, a single suction cup is selected as the end picker. The motion control and trajectory planning of the robot end picker are completed through the robot's built-in software platform: the arm starts at an initial position, the end is moved to the target workpiece position according to the transmitted pose information, the air pump is switched on and held to pick up the target workpiece, the arm then moves to a position above and close to the production line conveyor belt, the air pump is switched off to feed the part, and the arm returns to the initial position, completing one feeding cycle.
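As a hedged illustration of the communication module's role, the following sketch sends a 6DoF pose from the industrial PC to the robot over TCP/IP; the port number, byte order and message layout are assumptions for illustration only and are not specified by this disclosure.

import socket
import struct

def send_pose(robot_ip, pose_6dof, port=5000):
    # pose_6dof = (x, y, z, Rx, Ry, Rz), packed as six little-endian doubles
    assert len(pose_6dof) == 6
    payload = struct.pack('<6d', *pose_6dof)
    with socket.create_connection((robot_ip, port), timeout=2.0) as sock:
        sock.sendall(payload)

# Example (illustrative address and values):
# send_pose("192.168.1.10", (0.42, -0.13, 0.25, 0.0, 3.14, 0.0))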
Further, before grabbing, the camera and robot grabbing hardware are first set up. As shown in fig. 3, 1 denotes the industrial camera, 2 the end picker, 3 the vacuum tube, 4 the bar light source, 5 the industrial part, 6 the material table, and 7 the Sawyer robot. The setup specifically comprises: a1) fixing the working position of the Sawyer robot according to its working space distribution map; a2) building an industrial part placing platform with a height of 800 mm, a length of 850 mm and a width of 340 mm, located 1000 mm in front of the fixed position of the robot; a3) horizontally fixing the monocular camera (corresponding to the industrial camera), pointing vertically downward, 610 mm above the part placing platform; a4) to pick up the targeted mobile phone shell parts, and given the regular shape and smooth surface of the phone shell, mounting a vacuum pneumatic suction cup at the end of the robot arm, the pick-up being realized by controlling the vacuum suction state of the cylinder (corresponding to the vacuum tube) through a control switch.
According to the unordered aliasing workpiece grabbing method based on the key point prediction network, a real-time RGB image is input, and the preset key point prediction network model segments each workpiece and predicts its key point positions, giving the pixel coordinates of the key points in the image; together with the 3D coordinates of the key points in the workpiece model coordinate system and the camera internal reference matrix, the conversion relation between the workpiece model coordinate system and the camera coordinate system is calculated, and the 6DoF position and posture information of the workpiece in the robot coordinate system is then solved. Through the preset pixel-level key point prediction network model, the pixel position most likely to represent a key point can be predicted by voting even when that key point is occluded, which solves the pose calculation problem for occluded key points when workpieces overlap. The robot can therefore pick workpieces in more complex scenes, overcoming the limitation of the traditional feeding method of fixing the workpiece position and teaching the robot to grasp, which suits only a single scene. Feeding and processing on an industrial production line become more flexible, the picking success rate is effectively improved, and the system can be popularized to feeding scenes of different parts, so it has a strong market prospect.
In one embodiment, step S100 includes:
step S110: shooting images of preset first calibration pictures at different angles by using a camera;
step S120: extracting corner information from each image of a preset first calibration picture with different angles;
step S130: and calibrating by using Zhang's calibration method according to the corner information, and calculating the camera internal reference data to obtain the camera internal reference matrix.
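A minimal sketch of steps S110 to S130 using OpenCV, assuming the preset first calibration picture is a planar chessboard pattern; the board size and square size below are illustrative values rather than values taken from this disclosure.

import glob
import cv2
import numpy as np

def calibrate_intrinsics(image_glob, board_size=(9, 6), square_mm=20.0):
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_mm

    obj_points, img_points, img_shape = [], [], None
    for path in glob.glob(image_glob):              # step S110: images at different angles
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, board_size)   # step S120: corners
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_points.append(objp)
            img_points.append(corners)
            img_shape = gray.shape[::-1]

    # step S130: Zhang's method as implemented by cv2.calibrateCamera
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points,
                                             img_shape, None, None)
    return K, dist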
In one embodiment, the preset second calibration picture includes nine dots, and the step S200 includes:
step S210: shooting an image of a preset second calibration picture placed in a random posture;
step S220: calculating the circle center pixel position of each dot in the image;
step S230: moving the suction cup at the end of the robot arm to each dot, and recording the corresponding 3D coordinate in the robot coordinate system;
step S240: repeating the step S210 to the step S230 for a first preset number of times to obtain a group of 2D-3D data of each dot;
step S250: and calculating a conversion matrix from the camera coordinate system to the robot coordinate system according to the 2D-3D data of each dot.
In one embodiment, step S250 includes:
step S251: and calculating to obtain a rotation matrix between the robot coordinate system and the camera coordinate system and a translation matrix between the robot coordinate system and the camera coordinate system according to the 2D-3D data of each dot, wherein the calculation specifically comprises the following steps:
s [u, v, 1]^T = K [R_{r2c} | t_{r2c}] [X, Y, Z, 1]^T

wherein s is a scale factor, R_{r2c} represents the rotation matrix from the robot coordinate system to the camera coordinate system, t_{r2c} represents the translation matrix from the robot coordinate system to the camera coordinate system, K is the camera internal reference matrix, (u, v) are the pixel coordinates corresponding to each dot, and (X, Y, Z) are the 3D coordinates of each dot under the robot coordinate system;
step S252: calculating a transformation matrix from the robot coordinate system to the camera coordinate system according to a rotation matrix from the robot coordinate system to the camera coordinate system and a translation matrix from the robot coordinate system to the camera coordinate system, and specifically:
T_{r2c} = [R_{r2c}, t_{r2c}; 0, 1]

wherein T_{r2c} is the transformation matrix from the robot coordinate system to the camera coordinate system;
step S253: the method can obtain a conversion matrix from the camera coordinate system to the robot coordinate system according to the conversion matrix from the robot coordinate system to the camera coordinate system, and specifically comprises the following steps:
T_{c2r} = T_{r2c}^{-1}

wherein T_{c2r} is the transformation matrix from the camera coordinate system to the robot coordinate system.
Specifically, the transformation matrix from the camera coordinate system to the robot coordinate system and the transformation matrix from the robot coordinate system to the camera coordinate system are in an inverse relationship, so that the transformation matrix from the camera coordinate system to the robot coordinate system can be obtained according to the transformation matrix from the robot coordinate system to the camera coordinate system.
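A minimal sketch of steps S251 to S253, assuming the averaged circle-center pixel coordinates of the nine dots (uv) and their 3D coordinates in the robot coordinate system (xyz_robot) have been collected as described in steps S210 to S240, and that K and dist come from step S100.

import cv2
import numpy as np

def hand_eye_from_dots(uv, xyz_robot, K, dist):
    # step S251: solve for the rotation/translation taking robot-frame points
    # onto their image projections (a PnP problem posed in the robot frame)
    ok, rvec, tvec = cv2.solvePnP(xyz_robot.astype(np.float64),
                                  uv.astype(np.float64), K, dist)
    assert ok, "PnP failed"
    R_r2c, _ = cv2.Rodrigues(rvec)

    # step S252: assemble the homogeneous robot -> camera transform
    T_r2c = np.eye(4)
    T_r2c[:3, :3] = R_r2c
    T_r2c[:3, 3] = tvec.ravel()

    # step S253: the camera -> robot transform is its inverse
    T_c2r = np.linalg.inv(T_r2c)
    return T_c2r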
In one embodiment, as shown in fig. 4, after step S200 and before step S500, the method further includes:
step S300: and constructing a pixel-level key point prediction network, acquiring a training data set, marking the training data set to obtain a marked data set, and training the pixel-level key point prediction network according to the marked data set to obtain the pixel-level key point prediction network.
Specifically, the method for labeling the training data set comprises the following steps: the iPhone6S mobile phone shell is used as a pickup object, and aiming at network training requirements, a labeling process comprises key point selection, naming rule definition and data storage.
Key point selection: seven stable key points are selected on the iPhone 6S mobile phone shell, namely the upper-left, upper-right, lower-left and lower-right corner points, the two corner points of the apple logo, and the center of the letter 'o' in the 'iPhone' text. Naming rule definition: the outline of each mobile phone shell in the scene is marked with the data labeling software labelme, with the label defined as phone(i), where i is the index of the target instance in the picture; the key points of each mobile phone shell are marked in order from top to bottom and from left to right along the positive direction of the shell, with the label name defined as phone(i)_kp(j), where j indicates that the point is the j-th key point of that shell; when a key point is occluded, its position is estimated and still marked in the picture. Data storage: the marking information is stored in json file format, with the picture content, mobile phone shell outline, key points and other information saved as text.
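A hedged sketch of reading one annotation file produced by the labeling rule above, assuming labelme's standard JSON layout (a "shapes" list with "label" and "points" fields); the exact label spelling in a real data set may differ from the phone(i) / phone(i)_kp(j) pattern assumed here.

import json
import re

def load_annotation(json_path):
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)

    contours, keypoints = {}, {}
    for shape in ann["shapes"]:
        label = shape["label"]
        m = re.fullmatch(r"phone\((\d+)\)_kp\((\d+)\)", label)
        if m:                                   # keypoint j of instance i
            i, j = int(m.group(1)), int(m.group(2))
            keypoints.setdefault(i, {})[j] = shape["points"][0]
        else:
            m = re.fullmatch(r"phone\((\d+)\)", label)
            if m:                               # instance contour polygon
                contours[int(m.group(1))] = shape["points"]
    return contours, keypoints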
In one embodiment, the preset pixel-level keypoint prediction network model includes a convolutional neural network, a region candidate network, and four branches, which are a classification branch, a bounding box fetch branch, a mask fetch branch, and a pixel-level keypoint prediction branch, respectively, and step S500 includes:
step S510: inputting the image in the marked data set into a convolutional neural network to extract the characteristic information of the image, and transmitting the characteristic information into a regional candidate network;
step S520: the regional candidate network acquires the detection frame of each target workpiece according to the characteristic information and inputs the detection frame to the four branches;
step S530: the classification branch is used for classifying the target workpiece and the background according to the received detection frame; the boundary frame obtaining branch is used for obtaining the coordinates of the preset position point of the boundary frame of each target workpiece according to the received detection frame; the mask obtaining branch is used for obtaining a pixel area where each target workpiece is located according to the received detection frame; the pixel-level key point prediction branch is used for obtaining a unit vector diagram pointing to a preset number of key points according to the received detection frame;
step S540: normalizing the offset of each pixel position and the position of the 2D key point into a unit vector according to the pixel position of each pixel point of the pixel region where each target workpiece is located and the position of the 2D key point;
step S550: acquiring all pixel-level vectors of a single target workpiece, randomly selecting two pixel points, and taking the intersection point of the pixel vectors corresponding to the two pixel points as an initial hypothesis of a 2D key point;
step S560: and repeating the step S550 for a second preset number of times to obtain a group of hypotheses, using a clustering algorithm K-means to obtain a point with the highest score as a pixel point of the key point, and obtaining pixel coordinates of the pixel point as the key point.
Specifically, the pixel-level key point prediction network provided by the invention adopts Mask-RCNN as the backbone and adds a pixel-level key point detection branch on top of instance segmentation. The network first uses a convolutional neural network (CNN) to extract the feature information of the image, then obtains the detection frame of each target instance through a region proposal network (RPN), and finally connects four branches that perform the four regression tasks of classification, bounding box acquisition, mask acquisition and pixel-level vector calculation, as shown in FIG. 5.
If N pictures are input, there are N tensor inputs of size [H × W × C]. The detection frame of each target is obtained through the RPN, and within each detection frame the classification task branch produces an N × 2 tensor, where 2 corresponds to target workpiece and background; the bounding box detection branch produces a tensor of size N × 4, where 4 corresponds to the four coordinate values of the upper-left and lower-right corner points of the bounding box; the instance mask branch produces an N × (H × W) tensor representing the pixel area where each instance is located; and the key point detection branch produces a tensor of size N × [H × W × (K × 2)] representing the unit vector maps pointing to the K key points.
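As an illustration of the added branch only, the following PyTorch sketch shows a small convolutional head that maps ROI features to K × 2 channels, i.e. one per-pixel 2D unit vector field for each of the K key points; the channel sizes are illustrative, and the backbone, RPN and the other three branches are assumed to come from a standard Mask-RCNN implementation rather than being reproduced here.

import torch.nn as nn
import torch.nn.functional as F

class KeypointVectorHead(nn.Module):
    def __init__(self, in_channels=256, num_keypoints=7):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 256, 3, padding=1)
        self.conv2 = nn.Conv2d(256, 256, 3, padding=1)
        self.out = nn.Conv2d(256, num_keypoints * 2, 1)

    def forward(self, roi_feats):                 # (N, C, H, W) ROI features
        x = F.relu(self.conv1(roi_feats))
        x = F.relu(self.conv2(x))
        v = self.out(x)                           # (N, K*2, H, W)
        # normalise each per-pixel 2D vector to unit length
        n, _, h, w = v.shape
        v = v.view(n, -1, 2, h, w)
        v = v / (v.norm(dim=2, keepdim=True) + 1e-8)
        return v                                   # (N, K, 2, H, W)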
Under the pixel-level vector regression task, for each pixel position p in the pixel area of each instance (corresponding to a target workpiece), a vector v_k(p) is defined representing the offset from the pixel position p to the 2D key point x_k. In order to avoid the influence of the size and position of the workpiece and to distinguish different key points, the offset is normalized to a unit vector:

v_k(p) = (x_k − p) / ||x_k − p||

wherein x_k − p is the offset from each pixel position to the 2D key point x_k, x_k is the position coordinate of the key point, and p ranges over all pixel positions within the pixel area of the target workpiece.

Then all pixel-level vectors of a single instance are obtained using the mask of that instance, two pixel points are randomly selected, and the intersection point of their pixel vectors is taken as an initial hypothesis h_{k,1} of the key point x_k. Repeating this J times yields a set of hypotheses {h_{k,1}, …, h_{k,J}}, and finally the clustering algorithm K-means is used to obtain the point with the highest score, i.e. the pixel most likely to be the key point, whose pixel coordinates are taken as the key point.
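A minimal sketch of the voting procedure of steps S540 to S560 for a single instance and a single key point, assuming the per-pixel unit vectors have already been predicted; the number of hypotheses J and the number of K-means clusters are illustrative choices, and the "largest cluster" rule stands in for the highest-score selection described above.

import numpy as np
from sklearn.cluster import KMeans

def vote_keypoint(pixels, vectors, J=128, n_clusters=3, seed=0):
    # pixels:  (M, 2) pixel positions inside the instance mask
    # vectors: (M, 2) predicted unit vectors pointing to the keypoint
    rng = np.random.default_rng(seed)
    hyps = []
    for _ in range(J):                                   # steps S550/S560: J hypotheses
        i, j = rng.choice(len(pixels), size=2, replace=False)
        p1, v1, p2, v2 = pixels[i], vectors[i], pixels[j], vectors[j]
        A = np.stack([v1, -v2], axis=1)                  # intersect p1 + t1*v1 and p2 + t2*v2
        if abs(np.linalg.det(A)) < 1e-6:                 # nearly parallel rays, skip
            continue
        t = np.linalg.solve(A, p2 - p1)
        hyps.append(p1 + t[0] * v1)
    hyps = np.asarray(hyps, dtype=float)

    # step S560: cluster the hypotheses and keep the centre of the largest cluster
    km = KMeans(n_clusters=min(n_clusters, len(hyps)), n_init=10).fit(hyps)
    labels, counts = np.unique(km.labels_, return_counts=True)
    return km.cluster_centers_[labels[np.argmax(counts)]]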
Step S300 further includes supervising the pixel-level key point prediction branch for training using the vector angle error, with the loss function defined as:

L_kp = (1 / (N·M)) · Σ_{k=1..N} Σ_{i=1..M} (1 − ṽ_k(p_i) · v_k(p_i))

wherein v_k(p_i) denotes the unit direction vector with which point p_i in the pixel region of each target workpiece points to key point k, ṽ_k(p_i) and v_k(p_i) denote the predicted vector and the real vector respectively, N denotes the number of key points, and M denotes the number of pixel points in the pixel region of a single target workpiece.
Step S400: calculating the loss value of the pixel-level key point prediction network according to a preset loss function, performing back propagation to update the network parameters of the pixel-level key point prediction network according to the loss value, and obtaining the updated pixel-level key point prediction network as a preset pixel-level key point prediction network model.
Specifically, the network parameters of the pixel-level key point prediction network are updated through back propagation according to the loss values, and when the back propagation stopping condition is met, the updated pixel-level key point prediction network is used as a preset pixel-level key point prediction network model.
In an embodiment, the loss function preset in step S400 is specifically:
L = λ_1·L_cls + λ_2·L_box + λ_3·L_mask + λ_4·L_kp

wherein λ_1, λ_2, λ_3 and λ_4 are the weighting factors of the classification branch, the bounding box acquisition branch, the mask acquisition branch and the pixel-level key point prediction branch respectively, L_cls is the classification loss function, L_box is the bounding box detection loss function, L_mask is the mask detection loss function, and L_kp is the pixel-level key point prediction branch loss function.
Specifically, the loss function of the pixel-level key point prediction branch is described in detail above, and the loss functions of the other branches use the default values in Mask-RCNN: the classification loss function L_cls uses the softmax loss, the bounding box detection loss L_box uses the smooth L1 loss function, and the mask generation loss L_mask uses the cross-entropy loss.
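An illustrative PyTorch-style combination of the four branch losses with the weights λ_1 to λ_4; the individual terms follow the defaults named above (softmax/cross-entropy, smooth L1, cross-entropy), while the key point term uses a smooth L1 distance between predicted and ground-truth unit vectors as a stand-in for the vector angle error.

import torch.nn.functional as F

def total_loss(cls_logits, cls_gt, box_pred, box_gt,
               mask_logits, mask_gt, vec_pred, vec_gt,
               lambdas=(1.0, 1.0, 1.0, 1.0)):
    # cls_gt: class indices; mask_gt: float tensor of 0/1; vec_*: unit-vector fields
    l_cls = F.cross_entropy(cls_logits, cls_gt)                         # classification
    l_box = F.smooth_l1_loss(box_pred, box_gt)                          # bounding box
    l_mask = F.binary_cross_entropy_with_logits(mask_logits, mask_gt)   # mask
    l_kp = F.smooth_l1_loss(vec_pred, vec_gt)                           # keypoint vectors
    w1, w2, w3, w4 = lambdas
    return w1 * l_cls + w2 * l_box + w3 * l_mask + w4 * l_kp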
In one embodiment, step S600 includes:
step S610: the method comprises the steps of obtaining 3D coordinates of preset key points in a workpiece model coordinate system, and obtaining a rotation matrix and a translation matrix between the workpiece model coordinate system and a camera coordinate system according to a camera internal reference matrix, the 3D coordinates of the preset key points in the workpiece model coordinate system and pixel coordinates of the preset key points in a real-time image, wherein the method specifically comprises the following steps:
s [u, v, 1]^T = K [R_{o2c} | t_{o2c}] [X, Y, Z, 1]^T

wherein s is a scale factor, R_{o2c} represents the rotation matrix from the workpiece model coordinate system to the camera coordinate system, t_{o2c} represents the translation matrix from the workpiece model coordinate system to the camera coordinate system, K is the camera internal reference matrix, (u, v) are the pixel coordinates corresponding to the preset key points, and (X, Y, Z) are the 3D coordinates of the preset key points in the workpiece model coordinate system;
step S620: obtaining a conversion matrix between the workpiece model coordinate system and the camera coordinate system according to a rotation matrix and a translation matrix between the workpiece model coordinate system and the camera coordinate system, which comprises the following specific steps:
T_{o2c} = [R_{o2c}, t_{o2c}; 0, 1]

wherein T_{o2c} represents the transformation matrix between the workpiece model coordinate system and the camera coordinate system.
Specifically, the 2D key point coordinates [u, v] of the workpiece are obtained from the key point prediction network model, the hypothesis generated by the key points on the workpiece surface is selected, a mobile phone shell model coordinate system is established, and the coordinates of the key points in the workpiece model coordinate system are obtained, as shown in Table 1.

TABLE 1: 3D coordinates of the preset key points in the workpiece model coordinate system

The 2D-3D correspondence of the key points can then be obtained.

Since the conversion relationship between the robot coordinate system and the camera coordinate system is fixed, step S200 obtains T_{c2r} through the hand-eye calibration, and the 3D coordinates of each key point in the workpiece model coordinate system are fixed as shown in Table 1. The pixel coordinates of the key points in the image and the 3D coordinates of the key points in the workpiece model coordinate system are obtained through the key point prediction in step S500, and the conversion relation T_{o2c} between the workpiece model coordinate system and the camera coordinate system is calculated through the PnP algorithm, giving the pose of the workpiece in the robot coordinate system as

T_{o2r} = T_{c2r} · T_{o2c}
The posture conversion relationship is shown in fig. 6, and the posture of the workpiece in the robot coordinate system can be obtained.
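A small sanity check that is often useful in practice (an assumption of good practice, not a step of this disclosure): after solving T_{o2c} with PnP, the 3D key points can be reprojected and compared with the predicted 2D key points before the pose is trusted.

import cv2
import numpy as np

def reprojection_error(model_pts_3d, kps_2d, K, dist, rvec, tvec):
    # Mean pixel error between predicted keypoints and reprojected model points.
    proj, _ = cv2.projectPoints(model_pts_3d.astype(np.float64), rvec, tvec, K, dist)
    proj = proj.reshape(-1, 2)
    return float(np.linalg.norm(proj - kps_2d, axis=1).mean())

# A large error suggests a wrong 2D-3D association or a poor keypoint vote,
# and the grasp attempt can be skipped for that instance.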
In one embodiment, in step S700, the 6DoF position and posture information of the workpiece in the robot coordinate system is obtained according to the initial pose information of the workpiece, the transformation matrix from the camera coordinate system to the robot coordinate system, and the transformation matrix between the workpiece model coordinate system and the camera coordinate system, and specifically:
P = T_{c2r} · T_{o2c} · P_0

wherein P represents the 6DoF position and posture information of the workpiece in the robot coordinate system, T_{c2r} represents the transformation matrix from the camera coordinate system to the robot coordinate system, T_{o2c} represents the transformation matrix between the workpiece model coordinate system and the camera coordinate system, and P_0 represents the initial pose information of the workpiece.
Specifically, a 3D coordinate system is created for the workpiece model, so that the 3D coordinates (X_i, Y_i, Z_i) of each key point in the workpiece model coordinate system can be obtained by measuring the workpiece model. The pixel coordinates (u_i, v_i) corresponding to each key point in the real-time image are obtained by the key point prediction network, and the conversion matrix T_{o2c} between the model coordinate system and the camera coordinate system can then be calculated with the PnP algorithm.

The geometric center of the mobile phone shell is selected as the picking point, and the coordinates of this center in the workpiece model coordinate system are (x_0, y_0, z_0). The transformation matrix T_{o2c} from the workpiece model to the camera coordinate system and the transformation matrix T_{c2r} from the camera to the robot coordinate system are known. The initial direction of the workpiece is defined as the direction coordinates of the end picker in the robot coordinate system in the vertical picking state when the mobile phone shell is placed horizontally, i.e. the deflection angles (Rx, Ry, Rz) of the current robot-arm end picker about the three axes x, y and z of the robot reference coordinate system, denoted (Rx_0, Ry_0, Rz_0). The initial 6DoF information of the workpiece is then defined as P_0 = (x_0, y_0, z_0, Rx_0, Ry_0, Rz_0).

With the conversion relation T_{c2r} between the camera coordinate system and the robot coordinate system obtained through hand-eye calibration and the transformation matrix T_{o2c} between the workpiece model coordinate system and the camera coordinate system, the 6DoF position and attitude information of the workpiece in the robot coordinate system is obtained as P = T_{c2r} · T_{o2c} · P_0.

The robot picks up the workpiece according to the 6DoF position and posture information of the workpiece. The 6DoF information comprises 3 degrees of freedom of position and 3 degrees of freedom of direction, so the robot can realize accurate pickup at different angles. Therefore, no matter in what posture the mobile phone shell is placed, the direction coordinates with which the robot end picker picks up the shell vertically can always be calculated for that placing posture, and 6DoF pickup is realized.
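A hedged sketch of assembling the final 6DoF command from the two transforms. The position part follows the composition above directly; composing the orientation by rotating the initial picking direction (Rx_0, Ry_0, Rz_0) with the workpiece rotation is one reasonable reading of P = T_{c2r} · T_{o2c} · P_0, and the 'xyz' Euler convention is an assumption, since the robot's convention is not spelled out here.

import numpy as np
from scipy.spatial.transform import Rotation as Rot

def workpiece_6dof(T_c2r, T_o2c, pick_point_model, init_dir_deg):
    T_o2r = T_c2r @ T_o2c                               # workpiece model -> robot frame
    p = T_o2r @ np.append(np.asarray(pick_point_model, float), 1.0)

    # rotate the initially defined vertical picking direction by the workpiece rotation
    R_init = Rot.from_euler('xyz', init_dir_deg, degrees=True).as_matrix()
    R_pick = T_o2r[:3, :3] @ R_init
    rx, ry, rz = Rot.from_matrix(R_pick).as_euler('xyz', degrees=True)
    return (p[0], p[1], p[2], rx, ry, rz)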
According to the disordered aliasing workpiece grabbing method based on the key point prediction network, a pixel-level vector calculation branch is introduced into the key point prediction network; the vector calculated for each pixel represents the direction from that pixel to the key point, so that, by the consistency of these directions, the pixel position most likely to represent the key point can be obtained by voting even when the key point of the workpiece is occluded, solving the pose calculation problem for occluded key points when workpieces overlap. Secondly, the pose calculation is divided into two stages: 2D-3D key points are first matched by the key point prediction network, and the pose is then solved by PnP. The data set of the invention only requires marking the outlines and key points of the target workpieces in each picture, and the pose of each workpiece does not need to be calculated during labeling, so the data set is simple and convenient to produce. The PnP method needs only four or more groups of key points to solve, and the present method determines seven key points, so the workpiece pose can be solved with a PnP calculation on seven point pairs, reflecting the light weight of the data calculation. This avoids the problems of existing methods that regress the workpiece pose with a deep neural network, which must label the 6D pose of the training data (a very difficult labeling task) and whose pose regression relies on brute-force matching, making it computationally heavy and time-consuming. Finally, the invention uses 6DoF pose information when picking up workpieces and adds the calculation of direction information, so that the robot grasps in the initially defined relative posture (the robot end picker perpendicular to the workpiece plane), making grasping more accurate and improving the success rate.
In one embodiment, the unordered aliasing workpiece grabbing system based on the key point prediction network comprises an image acquisition module, a pose calculation module, a communication module and a pickup module, wherein the image acquisition module is connected with the pose calculation module, the pose calculation module is connected with the pickup module through the communication module, and the image acquisition module is used for acquiring a real-time image and sending the real-time image to the pose calculation module; the pose calculation module is used for executing a disordered aliasing workpiece grabbing method based on a key point prediction network to obtain 6DoF position and posture information of a workpiece in a robot coordinate system and sending the position and posture information to the pickup module through the communication device; and the picking module picks the target workpiece according to the received 6DoF position and posture information of the workpiece under the robot coordinate system.
For the specific definition of the unordered aliasing workpiece grabbing system based on the key point prediction network, reference may be made to the above definition of the unordered aliasing workpiece grabbing method based on the key point prediction network, and details are not repeated here.
The method and the system for capturing the disordered aliasing workpiece based on the key point prediction network are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the core concepts of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (10)

1. An unordered aliasing workpiece grabbing method based on a key point prediction network, characterized by comprising the following steps:
step S100: calibrating and determining a camera internal reference matrix by Zhang's calibration method according to a preset first calibration picture;
step S200: determining a conversion matrix from a camera coordinate system to a robot coordinate system by a nine-point calibration method according to a preset second calibration picture;
step S500: acquiring a real-time image, inputting the real-time image into a preset pixel-level key point prediction network model, and regressing to obtain pixel coordinates of preset key points in the real-time image;
step S600: acquiring 3D coordinates of preset key points in a workpiece model coordinate system, and obtaining a conversion matrix between the workpiece model coordinate system and a camera coordinate system according to the camera internal reference matrix, the 3D coordinates of the preset key points in the workpiece model coordinate system and pixel coordinates of the preset key points in the real-time image;
step S700: acquiring coordinates of a picking point of a workpiece to be picked under a workpiece model coordinate system and initial direction information of robot grabbing equipment, acquiring workpiece initial pose information according to the coordinates of the picking point of the workpiece to be picked under the workpiece model coordinate system and the initial direction information of the robot grabbing equipment, and acquiring 6DoF position and posture information of the workpiece under the robot coordinate system according to the workpiece initial pose information, a conversion matrix from a camera coordinate system to the robot coordinate system and a conversion matrix between the workpiece model coordinate system and the camera coordinate system;
step S800: and controlling robot grabbing equipment to grab the target workpiece according to the 6DoF position and posture information of the workpiece in the robot coordinate system.
2. The method according to claim 1, wherein step S100 comprises:
step S110: shooting images of preset first calibration pictures at different angles by using a camera;
step S120: extracting corner information from each of the images of the preset first calibration picture shot at different angles;
step S130: calibrating by Zhang's calibration method according to the corner information, and calculating the camera internal reference data to obtain the camera internal reference matrix.
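One possible OpenCV realisation of claim 2 is sketched below (assumptions: a chessboard-style first calibration picture with a 9x6 inner-corner layout, 10 mm squares, and images stored under calib/; none of these values come from the patent): corners are extracted from the images shot at different angles and passed to cv2.calibrateCamera, whose output K is the camera internal reference matrix.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)     # assumed inner-corner layout of the calibration picture
square = 0.01        # assumed square size in metres
model = np.zeros((pattern[0] * pattern[1], 3), np.float32)
model[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts, size = [], [], None
for path in glob.glob("calib/*.png"):                            # images at different angles (step S110)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)    # corner information (step S120)
    if found:
        obj_pts.append(model)
        img_pts.append(corners)
        size = gray.shape[::-1]

# Zhang's calibration (step S130): K is the camera internal reference matrix
_, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
```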
3. The method according to claim 2, wherein the preset second calibration picture includes nine dots, and step S200 comprises:
step S210: shooting an image of a preset second calibration picture placed in a random posture;
step S220: calculating the circle-center pixel position of each dot in the image of the randomly placed preset second calibration picture;
step S230: moving the suction cup at the end of the mechanical arm of the robot grabbing equipment to each dot, and recording the corresponding 3D coordinate in the robot coordinate system;
step S240: repeating the step S210 to the step S230 for a first preset number of times to obtain a group of 2D-3D data of each dot;
step S250: and calculating a conversion matrix from the camera coordinate system to the robot coordinate system according to the 2D-3D data of each dot.
4. The method of claim 3, wherein step S250 comprises:
step S251: calculating a rotation matrix between the robot coordinate system and the camera coordinate system and a translation matrix between the robot coordinate system and the camera coordinate system from the 2D-3D data of each dot, specifically:

$$ s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\left( R_{rc}\begin{bmatrix} X_r \\ Y_r \\ Z_r \end{bmatrix} + t_{rc} \right) $$

wherein $s$ is a scale factor, $R_{rc}$ represents the rotation matrix from the robot coordinate system to the camera coordinate system, $t_{rc}$ represents the translation matrix from the robot coordinate system to the camera coordinate system, $K$ is the camera internal reference matrix, $(u, v)$ are the pixel coordinates corresponding to each dot, and $(X_r, Y_r, Z_r)$ are the 3D coordinates of each dot in the robot coordinate system;

step S252: calculating a conversion matrix from the robot coordinate system to the camera coordinate system from the rotation matrix and the translation matrix obtained in step S251, specifically:

$$ T_{rc} = \begin{bmatrix} R_{rc} & t_{rc} \\ 0 & 1 \end{bmatrix} $$

wherein $T_{rc}$ is the conversion matrix from the robot coordinate system to the camera coordinate system;

step S253: obtaining the conversion matrix from the camera coordinate system to the robot coordinate system by inverting the conversion matrix from the robot coordinate system to the camera coordinate system, specifically:

$$ T_{cr} = T_{rc}^{-1} $$

wherein $T_{cr}$ is the conversion matrix from the camera coordinate system to the robot coordinate system.
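The computation in claim 4 can be sketched with OpenCV as follows (variable names are illustrative; dist holds the lens distortion coefficients from the calibration above): the dot pixel coordinates, their 3D coordinates recorded in the robot coordinate system and the camera internal reference matrix give the robot-to-camera rotation and translation via PnP, and the camera-to-robot conversion matrix follows by inversion.

```python
import cv2
import numpy as np

def camera_to_robot(K, dist, pix_2d, robot_3d):
    """pix_2d: Nx2 dot-centre pixel coordinates; robot_3d: Nx3 dot positions in the robot frame."""
    ok, rvec, tvec = cv2.solvePnP(robot_3d.astype(np.float32),
                                  pix_2d.astype(np.float32), K, dist)
    R_rc, _ = cv2.Rodrigues(rvec)                      # rotation: robot frame -> camera frame
    T_rc = np.eye(4)
    T_rc[:3, :3], T_rc[:3, 3] = R_rc, tvec.ravel()     # step S252: robot -> camera
    return np.linalg.inv(T_rc)                         # step S253: camera -> robot
```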
5. The method according to claim 1, wherein after step S200 and before step S500, the method further comprises:
step S300: building a pixel-level key point prediction network, acquiring a training data set, marking the training data set to obtain a marked data set, and training the pixel-level key point prediction network according to the marked data set to obtain a trained pixel-level key point prediction network;
step S400: calculating the loss value of the pixel-level key point prediction network according to a preset loss function, and performing back propagation to update the network parameters of the pixel-level key point prediction network according to the loss value to obtain an updated pixel-level key point prediction network as a preset pixel-level key point prediction network model.
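Steps S300 to S400 amount to a standard supervised training loop; a hedged PyTorch-style sketch is given below (the network, data loader and optimizer are assumed to exist, and the network is assumed to return a dictionary with the four branch losses, in the style of common detection frameworks):

```python
import torch

def train(model, loader, optimizer, epochs=10):
    """Update the pixel-level key point prediction network by back-propagating
    the weighted multi-branch loss (see claim 7)."""
    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            branch_losses = model(images, targets)   # assumed: dict of the four branch losses
            loss = sum(branch_losses.values())       # total loss (or the weighted sum of claim 7)
            optimizer.zero_grad()
            loss.backward()                          # back propagation (step S400)
            optimizer.step()                         # update the network parameters
```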
6. The method according to claim 1, wherein the preset pixel-level key point prediction network model comprises a convolutional neural network, a region candidate network, and four branches, namely a classification branch, a bounding box obtaining branch, a mask obtaining branch, and a pixel-level key point prediction branch, and step S500 comprises:
step S510: inputting the real-time image into the convolutional neural network to extract the characteristic information of the image, and transmitting the characteristic information into the region candidate network;
step S520: the region candidate network acquires a detection frame of each target workpiece according to the characteristic information and inputs the detection frame to the four branches;
step S530: the classification branch is used for distinguishing the target workpieces from the background according to the received detection frame; the bounding box obtaining branch is used for obtaining the coordinates of preset position points of the bounding box of each target workpiece according to the received detection frame; the mask obtaining branch is used for obtaining the pixel region where each target workpiece is located according to the received detection frame; the pixel-level key point prediction branch is used for obtaining unit vector maps pointing to a preset number of key points according to the received detection frame;
step S540: for each pixel point of the pixel region where each target workpiece is located, normalizing the offset between the pixel position and the position of the 2D key point into a unit vector;
step S550: acquiring all pixel-level vectors of a single target workpiece, randomly selecting two pixel points, and taking the intersection point of the pixel vectors corresponding to the two pixel points as an initial hypothesis of a 2D key point;
step S560: repeating step S550 a second preset number of times to obtain a group of hypotheses, applying the K-means clustering algorithm to the group of hypotheses to obtain the pixel point receiving the most votes as the key point, and taking the pixel coordinate of that pixel point as the key point coordinate.
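Steps S540 to S560 can be sketched as follows (Python; the hypothesis count, cluster count and random seed are assumptions for this example): pixel pairs inside one workpiece mask are sampled repeatedly, each pair votes for the intersection of its two rays, and K-means over the hypotheses returns the centre of the largest cluster as the key point pixel coordinate.

```python
import numpy as np
from sklearn.cluster import KMeans

def vote_keypoint(pixels, vectors, n_hypotheses=200, n_clusters=5, seed=0):
    """pixels: Mx2 pixel positions inside one workpiece mask;
    vectors: Mx2 unit vectors at those pixels pointing towards the key point."""
    rng = np.random.default_rng(seed)
    hyps = []
    while len(hyps) < n_hypotheses:                        # repeat step S550 (step S560)
        i, j = rng.choice(len(pixels), size=2, replace=False)
        A = np.column_stack((vectors[i], -vectors[j]))     # intersect the two pixel rays
        if abs(np.linalg.det(A)) < 1e-8:
            continue                                       # nearly parallel, no intersection
        t = np.linalg.solve(A, pixels[j] - pixels[i])
        hyps.append(pixels[i] + t[0] * vectors[i])         # one 2D key point hypothesis
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(np.array(hyps))
    best = int(np.bincount(km.labels_).argmax())           # cluster receiving the most votes
    return km.cluster_centers_[best]                       # pixel coordinate of the key point
```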
7. The method according to claim 6, wherein the loss function preset in step S400 is specifically:
$$ L = \lambda_{1} L_{cls} + \lambda_{2} L_{box} + \lambda_{3} L_{mask} + \lambda_{4} L_{kp} $$

wherein $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$ and $\lambda_{4}$ are the weighting factors of the classification branch, the bounding box obtaining branch, the mask obtaining branch and the pixel-level key point prediction branch respectively, $L_{cls}$ is the classification loss function, $L_{box}$ is the bounding box detection loss function, $L_{mask}$ is the mask detection loss function, and $L_{kp}$ is the pixel-level key point prediction branch loss function.
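The weighted loss of claim 7 transcribes directly into code; a minimal sketch (the default weights shown are placeholders, and each branch loss is assumed to be computed elsewhere):

```python
def total_loss(l_cls, l_box, l_mask, l_kp,
               w_cls=1.0, w_box=1.0, w_mask=1.0, w_kp=1.0):
    """L = w_cls*L_cls + w_box*L_box + w_mask*L_mask + w_kp*L_kp."""
    return w_cls * l_cls + w_box * l_box + w_mask * l_mask + w_kp * l_kp
```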
8. The method of claim 1, wherein step S600 comprises:
step S610: acquiring a 3D coordinate of a preset key point in a workpiece model coordinate system, and obtaining a rotation matrix and a translation matrix between the workpiece model coordinate system and a camera coordinate system according to the camera internal reference matrix, the 3D coordinate of the preset key point in the workpiece model coordinate system and a pixel coordinate of the preset key point in the real-time image, wherein the method specifically comprises the following steps:
$$ s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\left( R_{mc}\begin{bmatrix} X_m \\ Y_m \\ Z_m \end{bmatrix} + t_{mc} \right) $$

wherein $s$ is a scale factor, $R_{mc}$ represents the rotation matrix from the workpiece model coordinate system to the camera coordinate system, $t_{mc}$ represents the translation matrix from the workpiece model coordinate system to the camera coordinate system, $K$ is the camera internal reference matrix, $(u, v)$ are the pixel coordinates corresponding to the preset key points, and $(X_m, Y_m, Z_m)$ are the 3D coordinates of the preset key points in the workpiece model coordinate system;

step S620: obtaining a conversion matrix between the workpiece model coordinate system and the camera coordinate system from the rotation matrix and the translation matrix between the workpiece model coordinate system and the camera coordinate system, specifically:

$$ T_{mc} = \begin{bmatrix} R_{mc} & t_{mc} \\ 0 & 1 \end{bmatrix} $$

wherein $T_{mc}$ represents the conversion matrix between the workpiece model coordinate system and the camera coordinate system.
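Claim 8 is a classical PnP problem and mirrors the nine-point sketch given after claim 4; one hedged OpenCV version (the seven key points mentioned in the description would supply the 2D-3D pairs; variable names are illustrative):

```python
import cv2
import numpy as np

def model_to_camera(K, dist, keypoints_2d, keypoints_3d):
    """keypoints_2d: Nx2 pixel coordinates regressed by the network (N >= 4);
    keypoints_3d: Nx3 coordinates of the same key points in the workpiece model frame."""
    ok, rvec, tvec = cv2.solvePnP(keypoints_3d.astype(np.float32),
                                  keypoints_2d.astype(np.float32), K, dist)
    R_mc, _ = cv2.Rodrigues(rvec)
    T_mc = np.eye(4)
    T_mc[:3, :3], T_mc[:3, 3] = R_mc, tvec.ravel()     # step S620: model -> camera
    return T_mc
```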
9. The method according to claim 1, wherein in step S700, the 6DoF position and posture information of the workpiece in the robot coordinate system is obtained according to the initial pose information of the workpiece, the transformation matrix from the camera coordinate system to the robot coordinate system, and the transformation matrix between the workpiece model coordinate system and the camera coordinate system, specifically:
$$ T_{robot} = T_{cr}\, T_{mc}\, T_{init} $$

wherein $T_{robot}$ represents the 6DoF position and posture information of the workpiece in the robot coordinate system, $T_{cr}$ represents the conversion matrix from the camera coordinate system to the robot coordinate system, $T_{mc}$ represents the conversion matrix between the workpiece model coordinate system and the camera coordinate system, and $T_{init}$ represents the initial pose information of the workpiece.
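The chain of transformations in claim 9 reduces to a single matrix product; a minimal sketch (all inputs are 4x4 homogeneous matrices; names are illustrative):

```python
import numpy as np

def workpiece_pose_in_robot(T_cam2robot, T_model2cam, T_init):
    """T_cam2robot: camera -> robot (nine-point calibration);
    T_model2cam: workpiece model -> camera (PnP result);
    T_init: initial pose of the pick-up point and gripper direction in the model frame."""
    return T_cam2robot @ T_model2cam @ T_init   # 6DoF pose of the workpiece in the robot frame
```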
10. An unordered aliasing workpiece grabbing system based on a key point prediction network, characterized by comprising an image acquisition module, a pose calculation module, a communication module and a picking module, wherein the image acquisition module is connected with the pose calculation module, and the pose calculation module is connected with the picking module through the communication module,
the image acquisition module is used for acquiring a real-time image and sending the real-time image to the pose calculation module;
the pose calculation module is used for executing the method of any one of claims 1 to 9 to obtain the 6DoF position and posture information of the workpiece in the robot coordinate system and sending it to the picking module through the communication module;
and the picking module picks a target workpiece according to the received 6DoF position and posture information of the workpiece in the robot coordinate system.
CN202111156483.0A 2021-09-30 2021-09-30 Unordered aliasing workpiece grabbing method and system based on key point prediction network Active CN113580149B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111156483.0A CN113580149B (en) 2021-09-30 2021-09-30 Unordered aliasing workpiece grabbing method and system based on key point prediction network

Publications (2)

Publication Number Publication Date
CN113580149A 2021-11-02
CN113580149B CN113580149B (en) 2021-12-21

Family

ID=78242657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111156483.0A Active CN113580149B (en) 2021-09-30 2021-09-30 Unordered aliasing workpiece grabbing method and system based on key point prediction network

Country Status (1)

Country Link
CN (1) CN113580149B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9102055B1 (en) * 2013-03-15 2015-08-11 Industrial Perception, Inc. Detection and reconstruction of an environment to facilitate robotic interaction with the environment
CN109108965A (en) * 2018-07-27 2019-01-01 武汉精锋微控科技有限公司 A kind of cartesian space motion forecast method applied to mechanical arm
CN109986560A (en) * 2019-03-19 2019-07-09 埃夫特智能装备股份有限公司 A kind of mechanical arm self-adapting grasping method towards multiple target type
DE102020103398A1 (en) * 2020-02-11 2021-08-12 Heidelberger Druckmaschinen Aktiengesellschaft Method for moving a stack of products with a robot
CN112109086A (en) * 2020-09-03 2020-12-22 清华大学深圳国际研究生院 Grabbing method for industrial stacked parts, terminal equipment and readable storage medium
CN112936257A (en) * 2021-01-22 2021-06-11 熵智科技(深圳)有限公司 Workpiece grabbing method and device, computer equipment and storage medium
CN113334395A (en) * 2021-08-09 2021-09-03 常州唯实智能物联创新中心有限公司 Multi-clamp mechanical arm disordered grabbing method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114714365A (en) * 2022-06-08 2022-07-08 湖南大学 Disordered workpiece grabbing method and system based on cloud platform
CN114998804A (en) * 2022-06-14 2022-09-02 湖南大学 Posture-gesture overall posture capturing method based on two stages
CN117067219A (en) * 2023-10-13 2023-11-17 广州朗晴电动车有限公司 Sheet metal mechanical arm control method and system for trolley body molding
CN117067219B (en) * 2023-10-13 2023-12-15 广州朗晴电动车有限公司 Sheet metal mechanical arm control method and system for trolley body molding
CN117140558A (en) * 2023-10-25 2023-12-01 菲特(天津)检测技术有限公司 Coordinate conversion method, system and electronic equipment
CN117140558B (en) * 2023-10-25 2024-01-16 菲特(天津)检测技术有限公司 Coordinate conversion method, system and electronic equipment

Also Published As

Publication number Publication date
CN113580149B (en) 2021-12-21

Similar Documents

Publication Publication Date Title
CN113580149B (en) Unordered aliasing workpiece grabbing method and system based on key point prediction network
CN108885459B (en) Navigation method, navigation system, mobile control system and mobile robot
CN112070818B (en) Robot disordered grabbing method and system based on machine vision and storage medium
CN109255813B (en) Man-machine cooperation oriented hand-held object pose real-time detection method
CN109074083B (en) Movement control method, mobile robot, and computer storage medium
CN111738261B (en) Single-image robot unordered target grabbing method based on pose estimation and correction
CN108656107B (en) Mechanical arm grabbing system and method based on image processing
CN110580725A (en) Box sorting method and system based on RGB-D camera
CN111368852A (en) Article identification and pre-sorting system and method based on deep learning and robot
CN110084243B (en) File identification and positioning method based on two-dimensional code and monocular camera
CN108942923A (en) A kind of mechanical arm crawl control method
JP2020512646A (en) Imaging system for localization and mapping of scenes containing static and dynamic objects
CN114912287A (en) Robot autonomous grabbing simulation system and method based on target 6D pose estimation
CN114714365B (en) Disordered workpiece grabbing method and system based on cloud platform
CN112927264B (en) Unmanned aerial vehicle tracking shooting system and RGBD tracking method thereof
CN113276106A (en) Climbing robot space positioning method and space positioning system
CN115816460B (en) Mechanical arm grabbing method based on deep learning target detection and image segmentation
CN114882109A (en) Robot grabbing detection method and system for sheltering and disordered scenes
CN115147488B (en) Workpiece pose estimation method and grabbing system based on dense prediction
CN113034575A (en) Model construction method, pose estimation method and object picking device
CN112949452A (en) Robot low-light environment grabbing detection method based on multitask shared network
CN114782628A (en) Indoor real-time three-dimensional reconstruction method based on depth camera
Chang et al. GhostPose: Multi-view pose estimation of transparent objects for robot hand grasping
US20210304411A1 (en) Map construction method, apparatus, storage medium and electronic device
CN114193440A (en) Robot automatic grabbing system and method based on 3D vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant