CN116587280A - Robot 3D laser vision disordered grabbing control method, medium and system - Google Patents

Robot 3D laser vision disordered grabbing control method, medium and system

Info

Publication number
CN116587280A
Authority
CN
China
Prior art keywords
workpiece
image
grabbing
robot
pose
Prior art date
Legal status
Pending
Application number
CN202310674934.2A
Other languages
Chinese (zh)
Inventor
吴清生
黄华飞
王安丽
Current Assignee
Anhui Yunhua Intelligent Equipment Co ltd
Original Assignee
Anhui Yunhua Intelligent Equipment Co ltd
Priority date
Filing date
Publication date
Application filed by Anhui Yunhua Intelligent Equipment Co ltd
Priority to CN202310674934.2A
Publication of CN116587280A
Legal status: Pending

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1679 Programme controls characterised by the tasks executed
    • B25J9/1692 Calibration of manipulator
    • B25J9/1694 Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697 Vision controlled systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Manipulator (AREA)

Abstract

The invention provides a robot 3D laser vision disordered grabbing control method, medium and system, belonging to the technical field of machine grabbing, and comprising the following steps: acquiring three-dimensional models of the robot, the fixture and the material frame; comparing the imported three-dimensional model of the workpiece with the 3D image captured by the camera to identify the pose of the workpiece; acquiring the pose conversion relationship between the robot and the 3D camera and completing hand-eye calibration between the robot and the 3D camera; acquiring the 2D image, depth image and point cloud view of the target workpiece stack obtained by the 3D camera and forming a workpiece stack fusion image; computing on the workpiece stack fusion image with a pre-trained unordered grabbing model to obtain the optimal grabbing workpiece and its pose; and controlling the robot to grab the workpiece according to the optimal grabbing workpiece and its pose. The method, medium and system can solve the technical problem of low grabbing efficiency caused by poor real-time performance when a robot performs disordered grabbing with 3D laser vision.

Description

Robot 3D laser vision disordered grabbing control method, medium and system
Technical Field
The invention belongs to the technical field of machine grabbing, and particularly relates to a robot 3D laser vision disordered grabbing control method, medium and system.
Background
To date, research on industrial robots built on machine vision technology has become increasingly extensive, has attracted great attention, and has produced many scientific and technical achievements and good applications in actual production. However, as the pace of automation keeps rising, the shortcomings of traditional industrial production modes are exposed more and more and can hardly meet modern production needs. Some businesses still sort products manually, for example for box sealing, boxing and handling. To reduce the labor intensity of workers and improve production efficiency, machine vision has become the preferred application mode and has effectively improved and advanced traditional operation methods. Nowadays more and more industrial robot systems integrate vision interfaces, and vision technology working with the robot realizes the positioning, classification and detection of target objects. The core of machine vision is to process the images collected by the camera, extract their feature information and infer the higher-level semantics of the image, thereby replacing the human eye and brain to complete assigned tasks, and even tasks that human eyes cannot complete in special environments.
In the actual production process, objects differ greatly in shape, size, surface texture and other characteristics, and their placement is also disordered. Traditional grabbing control methods usually need to model the object in advance and grab it with a preset grabbing strategy. When dealing with disordered objects, however, such methods have low grabbing efficiency and impose strict requirements on the shape and placement of the objects, which limits their practical application.
To solve the above problems, researchers have begun exploring computer vision techniques to assist robotic grabbing control. Computer vision can perceive and analyze the scene to obtain the position, posture and other information of an object, providing a basis for robot grabbing. However, conventional computer vision techniques are mainly based on two-dimensional image processing, and their effect on grabbing objects in three-dimensional space is limited. How to use three-dimensional vision to grab disordered objects quickly and accurately has therefore become a problem to be solved in the field of robot grabbing control.
In recent years, three-dimensional laser vision has received wide attention and study. It obtains three-dimensional point cloud data of an object with a laser scanner and then identifies and locates the object with point cloud processing algorithms. Compared with traditional two-dimensional vision, three-dimensional laser vision offers higher accuracy in acquiring spatial information and greater potential for disordered robotic grabbing.
However, existing robot grabbing control methods based on three-dimensional laser vision still have shortcomings. First, in point cloud processing, existing methods often adopt complex computation methods and models, resulting in a large computational load and poor real-time performance. Second, in the grabbing strategy, existing methods mainly rely on manually designed features and rules, whose generalization ability is weak and whose grabbing effect on different kinds of objects is limited. In addition, existing methods often need multiple grabbing attempts when handling disordered objects, so the grabbing efficiency is low.
Disclosure of Invention
In view of the above, the invention provides a robot 3D laser vision disordered grabbing control method, medium and system, which can solve the technical problem of low grabbing efficiency caused by poor real-time performance when a robot uses 3D laser vision for disordered grabbing.
The invention is realized in the following way:
the first aspect of the invention provides a robot 3D laser vision disordered grabbing control method, which comprises the following steps:
S10, acquiring three-dimensional models of the robot, the fixture and the material frame;
S20, comparing the imported three-dimensional model of the workpiece with the 3D image captured by the camera, and identifying the pose of the workpiece;
S30, acquiring the pose conversion relationship between the robot and the 3D camera, and completing hand-eye calibration between the robot and the 3D camera;
S40, acquiring the 2D image, depth image and point cloud view of the target workpiece stack obtained by the 3D camera, and forming a workpiece stack fusion image;
S50, computing on the workpiece stack fusion image with the pre-trained unordered grabbing model to obtain the optimal grabbing workpiece and its pose;
S60, controlling the robot to grab the workpiece according to the optimal grabbing workpiece and its pose;
wherein the optimal grabbing workpiece is the workpiece whose grabbing has the least influence on the poses of the other workpieces.
On the basis of the technical scheme, the robot 3D laser vision disordered grabbing control method can be further improved as follows:
the step of comparing the three-dimensional model of the imported workpiece with the 3D image shot by the camera to identify the pose of the workpiece specifically comprises the following steps:
preprocessing a three-dimensional model of a workpiece to obtain a point cloud representation;
extracting the features of the 3D image shot by the camera and the imported workpiece three-dimensional model by adopting a PFH or FPFH algorithm;
performing feature matching, and adopting nearest neighbor searching and a random sample consistency algorithm;
Estimating a pose transformation matrix between the feature matching point pairs;
and applying the estimated pose transformation matrix to the imported workpiece three-dimensional model to obtain the pose of the workpiece three-dimensional model in the 3D image shot by the camera.
The step of acquiring the pose conversion relationship between the robot and the 3D camera and completing hand-eye calibration between the robot and the 3D camera specifically comprises:
setting parameters of a 3D camera and a calibration plate;
controlling the 3D camera to shoot and collect calibration plate data of different poses;
calculating the calibration result according to the added list of calibration points;
and optimizing and reducing errors of the calculation result according to the calibration precision, and finally obtaining the pose conversion relation between the robot and the 3D camera to finish the hand-eye calibration between the robot and the 3D camera.
The step of acquiring a 2D image, a depth image and a point cloud view of a target workpiece stack obtained by a 3D camera and forming a workpiece stack fusion image specifically comprises the following steps:
acquiring a 2D image, a depth image and a point cloud view of a target workpiece stack obtained by a 3D camera;
registering the 2D image and the depth image to obtain a depth image corresponding to the 2D image;
registering the point cloud data with the 2D image, and endowing color information to the point cloud data;
And carrying out downsampling and filtering processing on the fused point cloud data so as to reduce data quantity and noise.
The steps of establishing and training the unordered grabbing model specifically comprise:
constructing a training sample comprising a plurality of pairs of workpiece pile images; the workpiece pile image pair comprises a workpiece pile fusion image, which is marked as a first image, and a workpiece pile fusion image after one workpiece is randomly taken out in the first image, which is marked as a second image;
and establishing an unordered grabbing model prototype by using the convolutional neural network, and training by using a training sample to obtain an unordered grabbing model.
The step of controlling the robot to grab the workpiece according to the optimal grabbing workpiece and its pose specifically comprises the following steps:
step 1: recording the workpiece stack fusion image as a first fusion image;
step 2: controlling the robot to grasp the optimal grasping workpiece in the first fusion image;
step 3: after the optimal grabbing workpiece has been grabbed, acquiring a second fusion image of the workpiece stack at that moment;
step 4: calculating the similarity between the second fusion image and the first fusion image; if the similarity is greater than or equal to the grabbing threshold, deleting the optimally grabbed workpiece from the first fusion image, taking the result as the new first fusion image, and calculating the optimal grabbing workpiece of that first fusion image;
step 5: iteratively executing steps 2-4 until the similarity is smaller than the grabbing threshold, then taking the current image as the first fusion image, calculating the optimal grabbing workpiece of the first fusion image, and iteratively executing steps 2-5.
Further, in step 5, if the loop of step 2-4 is executed iteratively for more than 10-20 times, the current image is used as the first fused image, the optimal workpiece to be grasped of the first fused image is calculated, and step 2-step 5 is executed iteratively.
The beneficial effect of adopting the above improvement is that it sets a maximum number of loop iterations, which prevents the grabbing error from gradually increasing because a fixed image is always used as the first fusion image.
Further, the step 2 of controlling the robot to grasp the optimal grasping workpiece in the first fused image further includes a step of accurately determining a pose of the optimal grasping workpiece, and specifically includes:
deleting the part except the optimal grabbing workpiece in the first fusion image, and only reserving the optimal grabbing workpiece as an image to be grabbed;
and carrying out pose recognition on the optimal grabbing workpiece to obtain the pose of the optimal grabbing workpiece.
A second aspect of the present invention provides a computer readable storage medium storing program instructions which, when executed, perform the robot 3D laser vision disordered grabbing control method described above.
A third aspect of the present invention provides a robot 3D laser vision disordered grabbing control system comprising the computer readable storage medium described above.
Compared with the prior art, the robot 3D laser vision disordered grabbing control method, medium and system provided by the invention have the following beneficial effects. A disordered grabbing model is first established with a convolutional neural network; operating on the workpiece stack fusion image yields the optimal grabbing workpiece and its pose, and the robot is controlled to grab that workpiece. After the grab, the fusion image of the remaining workpiece stack changes very little, i.e. most workpieces do not change pose, so the grabbed workpiece is deleted from the fusion image to produce a new image, which is then processed to find the next optimal grabbing workpiece. Because the fusion image changes so little before and after a grab, the pre-grab image held in memory, with the grabbed workpiece deleted, can be used directly to search for the next optimal grabbing workpiece. Compared with taking a new photograph and computing a new fusion image for every grab, this reduces the amount of image processing, raises processing efficiency, effectively shortens computation time and improves grabbing real-time performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of the method provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, based on the embodiments of the invention, which are apparent to those of ordinary skill in the art without inventive faculty, are intended to be within the scope of the invention.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, based on the embodiments of the invention, which are apparent to those of ordinary skill in the art without inventive faculty, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the present invention, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", etc. indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
As shown in fig. 1, the first aspect of the present invention provides a robot 3D laser vision disordered grabbing control method, which includes the following steps:
S10, acquiring three-dimensional models of the robot, the fixture and the material frame;
S20, comparing the imported three-dimensional model of the workpiece with the 3D image captured by the camera, and identifying the pose of the workpiece;
S30, acquiring the pose conversion relationship between the robot and the 3D camera, and completing hand-eye calibration between the robot and the 3D camera;
S40, acquiring the 2D image, depth image and point cloud view of the target workpiece stack obtained by the 3D camera, and forming a workpiece stack fusion image;
S50, computing on the workpiece stack fusion image with the pre-trained unordered grabbing model to obtain the optimal grabbing workpiece and its pose;
S60, controlling the robot to grab the workpiece according to the optimal grabbing workpiece and its pose;
wherein the optimal grabbing workpiece is the workpiece whose grabbing has the least influence on the poses of the other workpieces.
In step S10, obtaining a three-dimensional map of the robot, the fixture, and the material frame is one of key steps for implementing the robot 3D laser vision disordered grabbing control method. To obtain these three-dimensional maps, we need to first know the structure and parameters of the robot and then build a corresponding three-dimensional model. The specific implementation mode is as follows:
Acquiring structure and parameters of robot
First, the structure and parameters of the robot need to be acquired. The structure of a robot generally includes a plurality of joints and links, and the manner in which the joints and links are connected and the relative positional relationship therebetween determine the motion performance of the robot. Therefore, parameters such as length, angle, etc. of each joint and link of the robot need to be measured or queried. These parameters may be obtained from technical data provided by the robot manufacturer or from self-measurements.
Establishing a three-dimensional model of a robot
Based on the obtained robot structure and parameters, a three-dimensional model of the robot is built by using Computer-Aided Design (CAD) software, such as SolidWorks, Pro/E, etc. In the modeling process, the following points need to be noted:
the model should reflect the structure and parameters of the actual robot as accurately as possible to ensure the precision in the subsequent grabbing control process;
each joint in the model needs to be set as a movable joint so as to realize the movement of the joint in the subsequent grabbing control process;
the three-dimensional model of the jig and the material frame should be included in the model so that their influence is taken into consideration in the subsequent grip control process.
Acquiring three-dimensional figures of robot, clamp and material frame
After the three-dimensional models of the robot, the jig and the material frame are established, the models need to be exported into a three-dimensional map. The three-dimensional map may use a general three-dimensional map format such as STL, OBJ, etc. These formats may be conveniently exchanged and processed between different computer platforms and software.
In deriving the three-dimensional map, attention is paid to the following points:
the resolution of the three-dimensional map should be set according to the actual requirements. The higher the resolution ratio is, the higher the precision of the three-dimensional graph is, but the larger the calculation amount is, and the real-time performance of the follow-up grabbing control process can be affected;
the coordinate system in the three-dimensional graph is consistent with the coordinate system of the actual robot so as to carry out coordinate transformation in the follow-up grabbing control process;
in order to facilitate subsequent processing, three-dimensional graphs of the robot, the clamp and the material frame can be respectively exported;
after three-dimensional modeling is completed, a coordinate system needs to be established
After the three-dimensional modeling is completed, a unified coordinate system needs to be established to describe the relative positional relationships among the robot, the fixture and the material frame. In general, the coordinate system of the robot base can be taken as the world coordinate system W, and local coordinate systems T_C and T_F can be established at the reference points of the fixture and the material frame, respectively. By establishing these coordinate systems, the pose relationships among the components can be conveniently described.
Coordinate transformation
In order to describe the relative positional relationship among the robot, the jig, and the material frame, coordinate transformation is required. The coordinate transformation may be performed by the following formula:
T_WC = T_WF · T_FC;
wherein T_WC denotes the transformation matrix from the fixture coordinate system T_C to the world coordinate system W; T_WF denotes the transformation matrix from the material frame coordinate system T_F to the world coordinate system W; and T_FC denotes the transformation matrix from the fixture coordinate system T_C to the material frame coordinate system T_F. These transformation matrices may be obtained with three-dimensional modeling software or measurement devices (e.g., laser trackers, optical gauges, etc.).
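For example, assuming the poses are represented as 4x4 homogeneous transformation matrices, the composition T_WC = T_WF · T_FC can be sketched as follows (all numeric values are illustrative placeholders):

```python
import numpy as np

def make_transform(R, t):
    """Assemble a 4x4 homogeneous transform from a 3x3 rotation R and a translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Placeholder poses: material frame F in the world W, and fixture C in the frame F.
T_WF = make_transform(np.eye(3), np.array([1.0, 0.0, 0.0]))
T_FC = make_transform(np.eye(3), np.array([0.0, 0.5, 0.2]))

# Chained transform: the fixture coordinate system expressed in the world frame W.
T_WC = T_WF @ T_FC
```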
Generating point cloud data of robots, fixtures and material frames
For subsequent workpiece recognition and grabbing planning, the three-dimensional model needs to be converted into point cloud data. This may be achieved with point cloud generation tools such as MeshLab or the Point Cloud Library (PCL). The point cloud data contain the geometric and color information of the model and can be used for workpiece identification and pose estimation.
Data fusion
After the point cloud data of the robot, the fixture and the material frame are acquired, the data are fused into a unified data structure. This can be done by the following formula:
P_W = T_WC · P_C + T_WF · P_F;
wherein P_W denotes the fused point cloud data, P_C denotes the point cloud data of the fixture, and P_F denotes the point cloud data of the material frame. Through data fusion, the relative positional relationships of the robot, the fixture and the material frame can be described in a unified coordinate system, which provides basic data for subsequent workpiece identification and grabbing planning and avoids interference when planning the robot's grabbing path.
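A minimal numpy sketch of this fusion step, assuming the transforms and point clouds are already available (the random arrays below are stand-ins for the exported model clouds):

```python
import numpy as np

def transform_points(T, points):
    """Apply a 4x4 homogeneous transform T to an (N, 3) array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ T.T)[:, :3]

# Placeholder transforms and clouds; in practice T_WC and T_WF come from the
# modeling/measurement step and P_C, P_F from the exported three-dimensional models.
T_WC = np.eye(4)
T_WF = np.eye(4)
P_C = np.random.rand(500, 3)   # fixture cloud in the fixture frame
P_F = np.random.rand(500, 3)   # material-frame cloud in its own frame

# Fused cloud P_W: both clouds expressed in the world frame W.
P_W = np.vstack([transform_points(T_WC, P_C), transform_points(T_WF, P_F)])
```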
In the above technical solution, the step of comparing the three-dimensional model of the introduced workpiece with the 3D image shot by the camera to identify the pose of the workpiece specifically includes:
preprocessing a three-dimensional model of a workpiece to obtain a point cloud representation;
extracting the features of the 3D image shot by the camera and the imported workpiece three-dimensional model by adopting a PFH or FPFH algorithm;
performing feature matching, and adopting nearest neighbor searching and a random sample consistency algorithm;
estimating a pose transformation matrix between the feature matching point pairs;
and applying the estimated pose transformation matrix to the imported workpiece three-dimensional model to obtain the pose of the workpiece three-dimensional model in the 3D image shot by the camera.
In step S20, we need to compare the three-dimensional model of the introduced workpiece with the 3D image captured by the camera, and identify the pose of the workpiece. The key of the step is to realize quick recognition and pose estimation of the workpiece in a disordered environment. To achieve this object, we can employ the following embodiments.
First, we need to pre-process the three-dimensional model of the workpiece. When importing a three-dimensional model of a workpiece, we can use a point cloud representation, i.e. discretizing the surface of the workpiece into a set of points. These points may be calculated by scanning the physical workpiece with a three-dimensional scanner or using a three-dimensional model derived using CAD software. The workpiece model represented by the point cloud can be conveniently compared and matched with the 3D image shot by the camera.
Next, we need to extract features from the 3D image captured by the camera. A feature is a mathematical quantity describing attributes such as the surface shape and texture of an object; it can be used to distinguish different objects and estimate their pose. To extract features from the 3D image, the following algorithms can be employed:
A Point Feature Histogram (PFH) algorithm is used to extract the local feature of each point in the point cloud. The PFH feature is derived by computing the geometric relationships between the k nearest neighbors of each point. Specifically, for each point p_i in the point cloud, its k nearest neighbors p_i1, p_i2, ..., p_ik are found, and then the normal-vector angle differences α and β and the curvature γ between these points are calculated. These values are binned into a multi-dimensional histogram to obtain the PFH feature of the point p_i.
A Fast Point Feature Histogram (FPFH) algorithm is used to extract the local feature of each point in the point cloud. The FPFH feature is an improvement on the PFH feature and is obtained by computing the relative positional relationships between the k nearest neighbors of each point. Specifically, for each point p_i in the point cloud, its k nearest neighbors p_i1, p_i2, ..., p_ik are found, and then the distance d, angle θ and curvature φ between these points are calculated. These values are binned into a multi-dimensional histogram to obtain the FPFH feature of the point p_i.
After extracting the features of the 3D image captured by the camera and the imported three-dimensional model of the workpiece, we need to perform feature matching. Feature matching is to find out the corresponding point pairs by calculating the similarity between two sets of features. To achieve feature matching, we can employ the following algorithm:
A nearest neighbor search (NNS) algorithm is used to calculate the distance between the two sets of features. Given a feature f_q to be matched, the closest feature f_i is found in the target feature set {f_1, f_2, ..., f_n}. This process may be accelerated by data structures such as a kd-tree or a ball tree.
A random sample consensus (RANSAC) algorithm is used to estimate the pose transformation between the feature matching point pairs. Given a set of matched point pairs (p_q1, p_t1), (p_q2, p_t2), ..., (p_qm, p_tm), the RANSAC algorithm finds an optimal pose transformation matrix T that minimizes the distance between the transformed point pairs. Specifically, RANSAC randomly selects a subset of point pairs, computes the corresponding pose transformation matrix T_i, applies T_i to all point pairs, and computes the distances between the transformed point pairs. This process is repeated several times, and the pose transformation matrix T that minimizes the distance is selected as the optimal solution.
Finally, the estimated pose transformation matrix T can be applied to the imported workpiece three-dimensional model to obtain the pose of the workpiece three-dimensional model in the 3D image shot by the camera. Thus, the workpiece pose recognition is completed.
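A minimal sketch of this FPFH-plus-RANSAC pipeline using the Open3D library; the file names, voxel size and distance thresholds are assumptions, and keyword names may differ slightly between Open3D releases:

```python
import open3d as o3d

def preprocess(pcd, voxel_size=0.005):
    """Downsample one cloud, estimate normals, and compute FPFH descriptors."""
    down = pcd.voxel_down_sample(voxel_size)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 2, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel_size * 5, max_nn=100))
    return down, fpfh

# "model.ply" (imported workpiece model) and "scene.ply" (camera 3D image) are placeholders.
model, model_fpfh = preprocess(o3d.io.read_point_cloud("model.ply"))
scene, scene_fpfh = preprocess(o3d.io.read_point_cloud("scene.ply"))

# RANSAC over FPFH correspondences estimates the pose transformation matrix T.
dist = 0.01
result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    model, scene, model_fpfh, scene_fpfh,
    mutual_filter=True,
    max_correspondence_distance=dist,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
    ransac_n=4,
    checkers=[o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(dist)],
    criteria=o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
T = result.transformation  # estimated pose of the workpiece model in the camera frame
```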
In another embodiment of this step, the following embodiments may be employed:
first, a 3D image photographed by a camera needs to be preprocessed. Due to the problems of illumination, shielding and the like in an actual application scene, noise and incomplete conditions may exist in the acquired 3D image. Therefore, before workpiece recognition, the 3D image needs to be subjected to noise reduction, filtering, and other processes to improve the accuracy of recognition. Specifically, the following method may be used:
Noise points in the 3D image are excluded using an outlier removal algorithm, such as a random sample consensus (RANSAC) algorithm.
The 3D image is subjected to smoothing processing, for example, using a gaussian filter or a bilateral filter, etc., to eliminate high-frequency noise in the image.
Morphological operations, such as opening and closing, are used to remove isolated points and fill holes in the image.
After the pretreatment is completed, the treated 3D image is required to be registered with a three-dimensional model of the workpiece so as to obtain the pose of the workpiece. In order to achieve efficient registration, the following method may be employed:
feature descriptors such as a Point Feature Histogram (PFH) or a Fast Point Feature Histogram (FPFH) are used to extract feature points in the 3D image and the workpiece three-dimensional model. The feature descriptors can effectively describe local geometric information of points in the point cloud, and are beneficial to realizing efficient point cloud registration.
And estimating a rigid body transformation matrix between the 3D image and the workpiece three-dimensional model by using a random sampling consistency (RANSAC) algorithm or a least square method and other methods, so as to realize coarse registration of the point cloud. Specifically, the following steps may be employed:
(1) Randomly selecting a point pair from the characteristic points of the 3D image and the workpiece three-dimensional model, and calculating the distance between the two point pairs;
(2) Setting a threshold value, taking the point pairs with the distance smaller than the threshold value as inner points and taking other point pairs as outer points;
(3) Solving a rigid body transformation matrix between inner points by a least square method and other methods;
(4) Repeating the above process for a plurality of times, and selecting the rigid body transformation matrix with the largest number of inner points as a rough registration result.
Based on coarse registration, fine registration of the point cloud is achieved using an Iterative Closest Point (ICP) algorithm or a variation thereof (e.g., point-to-plane ICP, nonlinear least squares, etc.). Specifically, the following steps may be employed:
(1) According to the rough registration result, initially aligning the 3D image with the workpiece three-dimensional model;
(2) Finding the nearest point in the three-dimensional model of the workpiece for each point in the 3D image;
(3) Calculating the distance between the point pairs, and solving the minimum distance and the corresponding rigid body transformation matrix;
(4) And updating the pose of the 3D image, and repeating the process until the preset convergence condition (such as iteration times, error thresholds and the like) is met.
By the method, registration of the 3D image and the workpiece three-dimensional model can be achieved, and therefore the pose of the workpiece is obtained. It should be noted that in practical applications, the above method may need to be appropriately adjusted and optimized for specific scenarios and requirements. For example, a deep learning method (e.g., pointNet, 3DMatch, etc.) may be used to extract point cloud features to improve accuracy and robustness of registration; meanwhile, strategies such as multi-view fusion, layering optimization and the like can be adopted, so that the registering effect is further improved.
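A compact sketch of the fine-registration step described above, using Open3D's point-to-plane ICP; the file names and thresholds are illustrative, and T_coarse would normally come from the coarse registration rather than the identity used here to keep the sketch runnable:

```python
import numpy as np
import open3d as o3d

def refine_pose(model, scene, T_coarse, threshold=0.005):
    """Refine a coarse rigid transform with point-to-plane ICP (Open3D)."""
    scene.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        model, scene, threshold, T_coarse,
        o3d.pipelines.registration.TransformationEstimationPointToPlane(),
        o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration=50))
    return result.transformation

# "model.ply" / "scene.ply" are placeholder file names.
model = o3d.io.read_point_cloud("model.ply")
scene = o3d.io.read_point_cloud("scene.ply")
T_workpiece = refine_pose(model, scene, np.eye(4))
```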
In the above technical solution, the step of obtaining the pose conversion relationship between the robot and the 3D camera and completing the hand-eye calibration between the robot and the 3D camera specifically includes:
setting parameters of a 3D camera and a calibration plate;
controlling the 3D camera to shoot and collect calibration plate data of different poses;
calculating the calibration result according to the added list of calibration points;
and optimizing and reducing errors of the calculation result according to the calibration precision, and finally obtaining the pose conversion relation between the robot and the 3D camera to finish the hand-eye calibration between the robot and the 3D camera.
In step S30, a pose conversion relationship between the robot and the 3D camera is obtained, and hand-eye calibration between the robot and the 3D camera is completed. The hand-eye calibration is an important link in a robot vision system, and mainly solves the coordinate system relation between the tail end of a robot arm and a camera. The purpose of hand-eye calibration is to solve a transformation matrix from the tail end of the robot to a camera coordinate system, so that the robot can position and grasp a target object according to image information shot by the camera.
In order to realize the hand-eye calibration, the following method can be adopted:
(1) Calibration plate-based method
In this method, a calibration plate with specific marks is required. The calibration plate may be a checkerboard, circular mark, or other identifiable feature point set. First, it is necessary to place a calibration plate at the end of the robot and collect images photographed by the camera at different positions and attitudes. By identifying feature points in the image, feature point coordinates in the camera coordinate system can be calculated. And simultaneously, recording pose information of the tail end of the robot at each position and each pose. With the data, a transformation matrix from the tail end of the robot to a camera coordinate system can be solved by adopting a hand-eye calibration algorithm such as a Tsai-Lenz algorithm, a DLT algorithm and the like.
Specifically, a Tsai-Lenz algorithm can be adopted for hand-eye calibration. The basic principle of the Tsai-Lenz algorithm is to convert the hand-eye calibration problem into a linear least squares problem. First, the coordinates of each feature point in the camera coordinate system need to be calculated, and the following formula may be used:
X_c = R·X_w + T;
wherein X_c denotes the coordinates of a feature point in the camera coordinate system, X_w denotes its coordinates in the world coordinate system, R denotes the rotation matrix, and T denotes the translation vector. Expanding the formula gives:
[X_c, Y_c, Z_c]^T = [[r_11, r_12, r_13], [r_21, r_22, r_23], [r_31, r_32, r_33]] · [X_w, Y_w, Z_w]^T + [t_x, t_y, t_z]^T;
Since the coordinates of the feature points in both the world coordinate system and the camera coordinate system are known, the above formula can be converted into a system of linear equations, from which the rotation matrix R and the translation vector T are solved.
In practical use, a calibration plate can be installed at a suitable position and the robot's 3D calibration program run. The communication parameters between the Mech-Hub software and the industrial robot system are set, the robot is operated manually through the Mech-Viz software to obtain control of the industrial robot, and the parameters of the 3D camera and the calibration plate are set in Mech-Viz. The 3D camera is then controlled to capture calibration plate data at different poses, the list of calibration points is added, and the calibration result is calculated. The result is optimized and its error analyzed according to the calibration accuracy, finally giving the pose conversion relationship between the robot and the 3D camera and completing the hand-eye calibration between the industrial robot and the 3D camera.
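A hedged sketch of the Tsai-Lenz solution using OpenCV's calibrateHandEye (available in OpenCV >= 4.1); the pose lists below are random stand-ins that only keep the sketch runnable, and real data would come from the capture step described above:

```python
import cv2
import numpy as np

# For each of N calibration shots:
#   R_gripper2base, t_gripper2base - robot flange pose read back from the controller
#   R_target2cam,  t_target2cam    - calibration-board pose estimated by the 3D camera
N = 10
rand_R = lambda: cv2.Rodrigues(np.random.uniform(-0.5, 0.5, 3))[0]
R_gripper2base = [rand_R() for _ in range(N)]
t_gripper2base = [np.random.rand(3, 1) for _ in range(N)]
R_target2cam = [rand_R() for _ in range(N)]
t_target2cam = [np.random.rand(3, 1) for _ in range(N)]

# Tsai-Lenz solution of the hand-eye calibration equation.
R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
    R_gripper2base, t_gripper2base, R_target2cam, t_target2cam,
    method=cv2.CALIB_HAND_EYE_TSAI)
```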
(2) Method based on non-calibration plate
Besides the calibration plate, the method of non-calibration plate can also be used for hand-eye calibration. The method mainly utilizes the motion information of the tail end of the robot at different positions and postures and the image information shot by the camera to solve the hand-eye calibration problem. Typical non-calibrated plate methods are self-alignment based methods, motion relative relationship based methods, and the like.
Taking the method based on automatic alignment as an example, it is first necessary to fix the robot tip in a position and let the camera take a still scene. The robot tip may then be rotated along an axis while the images taken by the camera are recorded. By analyzing the feature point motion in the image, feature point coordinates in the camera coordinate system can be calculated. And simultaneously, recording pose information of the tail end of the robot at each position and each pose. With the data, a transformation matrix from the tail end of the robot to a camera coordinate system can be solved by adopting a hand-eye calibration algorithm such as a method based on a motion relative relation.
In the above technical solution, the step of acquiring the 2D map, the depth map and the point cloud view of the target workpiece stack obtained by the 3D camera and forming the workpiece stack fusion image specifically includes:
Acquiring a 2D image, a depth image and a point cloud view of a target workpiece stack obtained by a 3D camera;
registering the 2D image and the depth image to obtain a depth image corresponding to the 2D image;
registering the point cloud data with the 2D image, and endowing color information to the point cloud data;
and carrying out downsampling and filtering processing on the fused point cloud data so as to reduce data quantity and noise.
In step S40, we need to acquire a 2D map, a depth map and a point cloud view of the target workcell stack obtained by the 3D camera and form a workcell stack fusion image. Key technologies involved in this process include: 2D image acquisition, depth image acquisition, point cloud data acquisition and generation of a fusion image.
2D image acquisition
2D image acquisition captures the target workpiece stack with an RGB camera to obtain its color information. In this process, the camera needs to be calibrated to eliminate distortion and obtain its intrinsic and extrinsic parameters; a common method is Zhang Zhengyou's calibration method. After calibration, points in the pixel coordinate system can be converted into points in the camera coordinate system through the camera intrinsics.
Depth image acquisition
The depth image refers to an image in which the distance from each pixel point to the camera is recorded. Common depth image capture devices are ToF cameras, structured light cameras, etc. Similar to the 2D image, we also need to calibrate the depth camera. In the process of depth image acquisition, the problems of noise, invalid pixel points and the like of the depth image are required to be paid attention to, and corresponding filtering and complementing methods are adopted for processing.
Point cloud data acquisition
The point cloud data is a set of three-dimensional coordinate data representing discrete points of the surface of the target object. We can convert the pixels of the image into three-dimensional point cloud data by registration (alignment) of the 2D image and the depth image, and camera parameters. Common point cloud registration methods include ICP (Iterative Closest Point) algorithm and NDT (Normal Distribution Transform) algorithm.
Generation of fused images
After the acquisition of the 2D image, the depth image and the point cloud data is completed, the information needs to be fused into a workpiece pile fusion image. The fusion image contains color, depth and geometric information of the target workpiece stack, and provides rich input data for a subsequent unordered grabbing model. The specific fusion method is as follows:
first, the 2D image and the depth image are registered, and a depth image corresponding to the 2D image is obtained. Interpolation may be performed by bilinear interpolation or the like to obtain a depth image of the same size as the 2D image.
Secondly, registering the point cloud data with the 2D image, and endowing color information to the point cloud data. Here, the pixels of the 2D image may be mapped into the point cloud data by camera internal parameters, thereby assigning each point cloud point a corresponding color value.
And finally, carrying out downsampling and filtering processing on the fused point cloud data so as to reduce the data quantity and noise. The usual downsampling method is voxel grid filtering (Voxel Grid Filter), and the filtering method is statistical outlier filtering (Statistical Outlier Removal).
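For example, with the Open3D library (the file name and parameter values are assumptions):

```python
import open3d as o3d

# "fused.ply" is a placeholder for the colored, fused workpiece-stack cloud.
pcd = o3d.io.read_point_cloud("fused.ply")

# Voxel-grid downsampling reduces the point count.
down = pcd.voxel_down_sample(voxel_size=0.003)

# Statistical outlier removal suppresses isolated noise points.
clean, kept_indices = down.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
```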
In addition, the workpiece pile fusion image can be obtained by the following method:
First, for each pixel point (u, v) in the 2D image, the depth value z of that point can be read from the depth image. The pixel coordinates (u, v) can then be converted to coordinates (x, y) in the normalized camera coordinate system using the camera's intrinsic matrix K and distortion parameter d:
(x, y) = K^{-1} · (u, v, 1)^T - d;
next, we can calculate the coordinates (X, Y, Z) of the point in the world coordinate system:
[X, Y, Z]^T = z · [x, y, 1]^T;
in this way, we can map each pixel point in the 2D image to the corresponding three-dimensional coordinates in the point cloud data. Meanwhile, color information (such as RGB values) in the 2D image is also endowed to the corresponding three-dimensional points to form point cloud data with the color information.
And finally, taking the point cloud data with the color information as a workpiece pile fusion image. The fusion image contains three-dimensional structure, color and texture information of the target workpiece stack, and can provide richer information for the subsequent unordered grabbing model.
Through the steps, the fusion image of the target workpiece stack can be obtained, and rich input data is provided for a subsequent unordered grabbing model.
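A minimal numpy sketch of this pixel-to-point mapping, assuming the depth image is already registered to the 2D image and lens distortion has been corrected (units and array shapes are assumptions):

```python
import numpy as np

def depth_to_colored_cloud(depth, rgb, K):
    """Back-project a registered depth image into a colored point cloud.

    depth: (H, W) array of depth values; rgb: (H, W, 3) uint8 image; K: 3x3 intrinsic matrix.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)
    valid = points[:, 2] > 0            # drop invalid zero-depth pixels
    return points[valid], colors[valid]
```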
In the above technical solution, the steps of establishing and training the unordered grabbing model specifically include:
constructing a training sample comprising a plurality of pairs of workpiece pile images; the workpiece pile image pair comprises a workpiece pile fusion image, which is marked as a first image, and a workpiece pile fusion image after one workpiece is randomly taken out in the first image, which is marked as a second image;
and establishing an unordered grabbing model prototype by using the convolutional neural network, and training by using a training sample to obtain an unordered grabbing model.
In step S50, we will operate on the workpiece stack fusion image according to the pre-trained unordered grabbing model to obtain the optimal grabbing workpiece and its coordinates and pose. The specific implementation mode is as follows:
feature extraction
First, we need to extract features from the workpiece stack fusion image; these features will be analyzed by the trained unordered grabbing model. Feature extraction may use common computer vision algorithms such as SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features) or ORB (Oriented FAST and Rotated BRIEF). The workpiece stack image pairs may be collected as follows:
(a) The workpieces are randomly placed in a material frame to form a disordered stacking state. The variety, shape and size of the workpieces in the workpiece stack are ensured to be diversified to ensure that the training sample has better generalization capability;
(b) Shooting a fusion image of the workpiece stack by using a 3D laser camera, wherein the fusion image comprises a 2D image, a depth image and a point cloud view; fusing the views together to form a first image;
(c) Randomly selecting a workpiece, and recording the position and the pose of the workpiece. And then removed from the stack of workpieces in a real or virtual environment;
(d) Shooting the fusion image of the workpiece stack again to obtain a second image;
(e) The first image, the second image and the position and pose information of the removed workpiece are taken as a training sample. Repeating the steps and collecting enough training samples.
Taking SIFT as an example, the key points in the workpiece stack fusion image and their descriptors can be extracted for subsequent grabbing model analysis (a short code sketch follows the list below); the main steps of the SIFT algorithm comprise:
and (3) detecting a scale space extremum: detecting key points in images of different scales;
positioning key points: precisely determining the position and the scale of the key points;
key point direction distribution: assigning one or more directions to each keypoint;
Generating key point descriptors: generating a descriptor of the key point.
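For example, with OpenCV (the image file name is a placeholder for the 2D channel of the fusion image):

```python
import cv2

# "fused_rgb.png" is a placeholder for the 2D channel of the workpiece-stack fusion image.
gray = cv2.imread("fused_rgb.png", cv2.IMREAD_GRAYSCALE)

# Detect scale-space keypoints and compute their 128-dimensional SIFT descriptors.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)
```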
Grabbing model
After feature extraction of the workpiece pile fusion image, we need to analyze these features with a pre-trained unordered grabbing model to find the optimal grabbing workpiece and its coordinates and pose. The grasping model may be a deep learning-based method such as Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN), etc.
Taking a convolutional neural network as an example, we can input the extracted features into a trained CNN model and calculate the grabbing probability of each workpiece through forward propagation. The structure of a CNN model typically includes multiple convolution layers, pooling layers and fully connected layers, in which local and global features of the image can be learned automatically. To fit our problem, we can design a CNN model with the following structure:
(a) Input layer: the fused image is received as input.
(b) Convolution layer: The input image is convolved with a plurality of convolution kernels to extract local features.
(c) Pooling layer: and the output of the convolution layer is downsampled, so that the characteristic dimension is reduced, and the calculated amount is reduced.
(d) Full tie layer: the output of the pooling layer is connected to a fully connected layer to achieve nonlinear combination of features.
(e) Output layer: and outputting the predicted workpiece position and pose.
To train the CNN model, we need to define a loss function to measure the difference between the workpiece position and pose predicted by the model and the actual values. Common loss functions are the mean square error (MSE) and the cross-entropy loss. We can select an appropriate loss function and use stochastic gradient descent (SGD) or another optimization algorithm for model training.
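A minimal PyTorch sketch of such a CNN with an MSE loss and one SGD training step; the network shape, the 4-channel fused input, the 6-value pose output and the random data are illustrative assumptions, not the patented model:

```python
import torch
import torch.nn as nn

class GraspNet(nn.Module):
    """Minimal CNN: fused RGB-D input (4 channels) -> predicted grasp pose (x, y, z, rx, ry, rz)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 6))

    def forward(self, x):
        return self.head(self.features(x))

model = GraspNet()
criterion = nn.MSELoss()                                  # mean-square-error loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # stochastic gradient descent

# One illustrative training step on random stand-in data (batch of 8 fused images).
images = torch.randn(8, 4, 128, 128)
targets = torch.randn(8, 6)
loss = criterion(model(images), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```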
Optimal gripping workpiece screening
After the grabbing probability of each workpiece is obtained, the optimal grabbing workpiece needs to be screened out, i.e. the workpiece whose grabbing has the least influence on the positions and poses of the other workpieces. This can be achieved by setting a grabbing threshold: only a workpiece whose grabbing probability is higher than the threshold is considered as the optimal grabbing workpiece.
The specific screening process is as follows:
sequencing the grabbing probabilities of all the workpieces;
starting from the workpiece with the highest grabbing probability, checking whether the workpiece meets the grabbing threshold condition;
if the grabbing threshold condition is met, taking the workpiece as an optimal grabbing workpiece; otherwise, continuing to check the next workpiece with higher grabbing probability;
the above process is repeated until an optimal gripping workpiece is found or all workpieces are inspected.
The above grabbing threshold may be expressed with the similarity between the first image and the second image: the higher the similarity, the smaller the pose change of the remaining workpieces after the first workpiece is grabbed. In actual use, after the optimal workpiece is grabbed, the next optimal grabbing workpiece can be computed directly from the first image, without computing it from a newly captured workpiece stack image; this saves photographing steps, reduces the image analysis workload and improves computational efficiency. In general, the grabbing threshold is calculated as 1 - K_0/P_0, where K_0 is a coefficient, generally 2-10, and P_0 is the number of workpieces observable on the surface in the workpiece stack image; preferably, K_0 = 2.
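A small sketch of this threshold-based screening, assuming the grabbing probabilities have already been predicted (function and variable names are illustrative):

```python
def select_best_grasp(grasp_probs, visible_count, k0=2):
    """Pick the first workpiece whose grasp probability clears the threshold 1 - K0/P0.

    grasp_probs: dict of workpiece id -> predicted grabbing probability;
    visible_count: number of workpieces observable on the surface of the stack (P0).
    """
    threshold = 1.0 - k0 / visible_count
    for wid, prob in sorted(grasp_probs.items(), key=lambda kv: kv[1], reverse=True):
        if prob >= threshold:
            return wid, prob
    return None, None  # no workpiece meets the threshold

best_id, best_prob = select_best_grasp({"a": 0.95, "b": 0.7, "c": 0.4}, visible_count=10)
```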
Coordinate and pose calculation
After determining the optimal gripping of the workpiece, we need to calculate its coordinates and pose in the stack of workpieces. This may be accomplished by matching the extracted keypoints with a three-dimensional model of the workpiece. Common matching algorithms are RANSAC (random sample consensus) and ICP (iterative closest point), etc.
Taking RANSAC as an example, we can calculate the coordinates and pose of the best gripping workpiece by:
Randomly selecting a group of key point pairs, and calculating a transformation matrix between the key point pairs;
applying the transformation matrix to other key points, and calculating the distance between the transformed points and the target point;
counting the number of key point pairs meeting a distance threshold;
repeating the above process for a plurality of times, and selecting a transformation matrix with the most number of key points meeting the distance threshold as a final result; the distance threshold is the size of the optimal grabbing workpiece;
and calculating the coordinates and the pose of the optimally grabbed workpiece in the workpiece stack according to the final transformation matrix.
Result output
And outputting the coordinates and the pose of the optimal grabbing workpiece for grabbing by a robot in the following step S60.
Summarizing, in step S50, we need to calculate the workpiece pile fusion image according to the pre-trained unordered grabbing model, so as to obtain the optimal grabbing workpiece, its coordinates and pose. The specific implementation mode comprises feature extraction, grabbing model analysis, optimal grabbing workpiece screening, coordinate and pose calculation and the like.
In another embodiment of the present invention, step S50 may also obtain the optimal grabbing workpiece and its coordinates directly by means of computer simulation.
In the above technical solution, the step of controlling the robot to grab the workpiece according to the optimal grabbing workpiece and its pose specifically comprises the following steps (a code sketch of this loop follows step 5 below):
Step 1: recording the workpiece stack fusion image as a first fusion image;
step 2: controlling the robot to grasp the optimal grasping workpiece in the first fusion image;
step 3: after the optimal grabbing workpiece has been grabbed, acquiring a second fusion image of the workpiece stack at that moment;
step 4: calculating the similarity between the second fusion image and the first fusion image; if the similarity is greater than or equal to the grabbing threshold, deleting the optimally grabbed workpiece from the first fusion image, taking the result as the new first fusion image, and calculating the optimal grabbing workpiece of that first fusion image;
step 5: iteratively executing steps 2-4 until the similarity is smaller than the grabbing threshold, then taking the current image as the first fusion image, calculating the optimal grabbing workpiece of the first fusion image, and iteratively executing steps 2-5.
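An illustrative sketch of this control loop; every callable below is an assumed hook for the corresponding step rather than a real API, and the iteration cap follows the 10-20 range mentioned later:

```python
def grasp_loop(capture_fused_image, compute_best_grasp, execute_grasp,
               image_similarity, remove_from_image, grasp_threshold, max_reuse=15):
    """Sketch of the iterative grabbing loop in steps 1-5.

    capture_fused_image(): take a new fused image of the workpiece stack;
    compute_best_grasp(img): run the trained model, return (workpiece, pose) or None;
    execute_grasp(workpiece, pose): drive the robot to grab the workpiece;
    image_similarity(a, b): similarity of two fused images in [0, 1];
    remove_from_image(img, workpiece): delete the grabbed workpiece from the image.
    """
    first = capture_fused_image()                      # step 1
    reuse_count = 0
    while True:
        best = compute_best_grasp(first)
        if best is None:                               # stack is empty
            break
        workpiece, pose = best
        execute_grasp(workpiece, pose)                 # step 2
        second = capture_fused_image()                 # step 3
        similar = image_similarity(second, first) >= grasp_threshold
        if similar and reuse_count < max_reuse:        # step 4: keep reusing the old image
            first = remove_from_image(first, workpiece)
            reuse_count += 1
        else:                                          # step 5: fall back to the fresh image
            first = second
            reuse_count = 0
```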
The step of deleting the optimally grabbed workpiece from the first fusion image and using the result as the new first fusion image specifically comprises: marking the optimal grabbing workpiece in the first fusion image; and deleting the marked optimal grabbing workpiece from the first fusion image and taking the resulting image as the first fusion image. Specifically, this may be done in the following manner:
Optimal gripping workpiece marking
After obtaining the position and pose information of the optimal gripping workpiece, we can use an image processing algorithm, such as edge detection, contour extraction, etc., to mark the optimal gripping workpiece in the first fused image. Specifically, the fusion image is firstly converted into a gray level image, and then the outline of the optimal grabbing workpiece is extracted from the gray level image according to the position and pose information of the optimal grabbing workpiece. Next, we can use an edge detection algorithm, such as the Canny edge detection algorithm, to further process the extracted contour to obtain the edge information of the best gripping workpiece. Finally, we can mark the edges of the optimally gripped workpiece with a specific color (e.g., red) in the first fused image for subsequent processing.
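For instance, the marking described above could be sketched with OpenCV as follows; the region-of-interest mask derived from the workpiece pose is an assumed input, and red is used as the marking color:

```python
import cv2

def mark_best_workpiece(fused_rgb, roi_mask):
    """Draw the contour of the optimal grabbing workpiece in red on the fused image.
    roi_mask: uint8 mask (255 inside the workpiece region derived from its pose)."""
    gray = cv2.cvtColor(fused_rgb, cv2.COLOR_BGR2GRAY)
    gray = cv2.bitwise_and(gray, gray, mask=roi_mask)          # keep only the workpiece region
    edges = cv2.Canny(gray, 50, 150)                           # edge information of the workpiece
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    marked = fused_rgb.copy()
    cv2.drawContours(marked, contours, -1, (0, 0, 255), 2)     # red in BGR
    return marked, edges
```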
Deleting marked optimal grabbing workpiece
After the marking of the best-grip workpiece is completed, the marked best-grip workpiece needs to be deleted from the first fused image. This can be achieved by an image processing algorithm such as image dilation and erosion.
Firstly, the edges of the marked optimal grabbing workpiece in the first fused image can be dilated. Specifically, a structuring element (e.g., a rectangular or circular structuring element) may be used to dilate the marked edges so that they expand toward the interior of the optimally grabbed workpiece. In this way, we can obtain a region covering the optimal grabbing workpiece.
The dilated region may then be subjected to an erosion operation to eliminate noise that may be introduced during dilation. Specifically, the same structuring element as used for the dilation can be used to erode the dilated region, thereby obtaining a noise-removed region that covers the optimal grabbing workpiece.
Finally, we can delete the area in the first fused image that covers the best gripping workpiece using image processing techniques, such as image subtraction, etc. Specifically, the image subtraction operation can be performed on the first fused image and the corroded area covering the optimal grabbing workpiece, so that the first fused image with the optimal grabbing workpiece removed is obtained. In this way, we can continue to analyze and process the first fused image, which removes the optimally grabbed workpiece, in a subsequent step.
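A hedged sketch of the dilation, erosion and subtraction steps above, again using OpenCV; the kernel size and the edge_mask produced by the marking step are assumptions of this example:

```python
import cv2

def remove_marked_workpiece(fused_gray, edge_mask, kernel_size=7):
    """Dilate the marked edges into a filled region, erode to suppress noise,
    then subtract that region from the first fused image."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
    region = cv2.dilate(edge_mask, kernel)        # expand edges toward the workpiece interior
    region = cv2.erode(region, kernel)            # same structuring element removes dilation noise
    # image subtraction: pixels inside the workpiece region are cleared
    masked = cv2.bitwise_and(fused_gray, fused_gray, mask=region)
    return cv2.subtract(fused_gray, masked)
```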
Further, in the above technical solution, in step 5, if the loop of step 2-4 is executed iteratively for more than 10-20 times, the current image is used as the first fused image, the optimal workpiece to be gripped of the first fused image is calculated, and step 2-step 5 is executed iteratively.
Further, in the above technical solution, step 2 controls the robot to grasp the optimal grasping workpiece in the first fused image, and further includes a step of accurately determining a pose of the optimal grasping workpiece, and specifically includes:
Deleting the part except the optimal grabbing workpiece in the first fusion image, and only reserving the optimal grabbing workpiece as an image to be grabbed;
and carrying out pose recognition on the optimal grabbing workpiece to obtain the pose of the optimal grabbing workpiece.
Wherein, the step of deleting the part other than the optimal grabbing workpiece in the first fused image and retaining only the optimal grabbing workpiece can be realized by setting the pixel points in the first fused image that do not fall within the bounding box B to transparent or to the background color. To delete the portions other than the workpiece in the image, the following algorithm may be used (a vectorized sketch follows the steps):
a. Traverse all pixel points in the first fused image, each denoted (x_i, y_i, z_i).
b. Check whether the pixel point (x_i, y_i, z_i) lies within the bounding box B. The pixel point is within the bounding box if all of the following conditions are satisfied:
x_min ≤ x_i ≤ x_max, y_min ≤ y_i ≤ y_max, z_min ≤ z_i ≤ z_max.
c. If the pixel point (x_i, y_i, z_i) lies outside the bounding box, set it to transparent or to the background color. Specifically, the color value of the pixel may be set to (0, 0, 0, 0) (RGBA format, indicating full transparency) or (255, 255, 255) (RGB format, indicating a white background).
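The a-c traversal can be vectorized with NumPy; a minimal sketch, assuming the fused data is stored as an N×3 coordinate array with an RGBA color array in the same pixel order:

```python
import numpy as np

def keep_only_bounding_box(points_xyz, rgba, bbox_min, bbox_max):
    """points_xyz: (N, 3) coordinates of the fused-image pixels;
    rgba: (N, 4) color values; bbox_min/bbox_max: (3,) corners of bounding box B."""
    inside = np.all((points_xyz >= bbox_min) & (points_xyz <= bbox_max), axis=1)
    out = rgba.copy()
    out[~inside] = (0, 0, 0, 0)     # fully transparent outside the bounding box
    return out
```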
Alternatively, the method is realized in the following way:
determination of bounding box for optimal gripping of workpiece
First, we need to determine the bounding box of the best gripping workpiece in the first fused image. To achieve this, we can employ a segmentation algorithm for 3D point clouds, such as a RANSAC-based planar segmentation algorithm. Specifically, we can find the corresponding point set in the 3D point cloud by using the coordinate and pose information of the optimally grasped workpiece. We can then calculate the smallest bounding rectangle of these points as the bounding box for the best gripping workpiece in the first fused image.
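One way to sketch the bounding-box determination is shown below, using a plain axis-aligned 3D box from NumPy and a minimum bounding rectangle from OpenCV in the image plane; the point set of the grasped workpiece and its pixel projections are assumed inputs, and the RANSAC plane segmentation mentioned above is not reproduced here:

```python
import cv2
import numpy as np

def workpiece_bounding_box(workpiece_points, pixel_xy):
    """workpiece_points: (N, 3) 3D points of the optimal grabbing workpiece;
    pixel_xy: (N, 2) their projections into the first fused image."""
    bbox_min = workpiece_points.min(axis=0)                    # axis-aligned 3D bounding box
    bbox_max = workpiece_points.max(axis=0)
    rect = cv2.minAreaRect(pixel_xy.astype(np.float32))        # minimum bounding rectangle
    corners = cv2.boxPoints(rect)                              # its four corner points
    return bbox_min, bbox_max, corners
```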
Creating a mask
Next, we need to create a binary mask of the same size as the first fused image. In this mask, the pixel values in the boundary box region of the optimally gripped workpiece are set to 1, and the pixel values of the remaining regions are set to 0. In this way, we can achieve the goal of retaining only the optimal gripping workpiece by applying a mask to the first fused image.
Specifically, we can create the mask as follows:
M(x, y) = 1 if (x, y) lies within the bounding box of the optimal grabbing workpiece, and M(x, y) = 0 otherwise;
where M(x, y) represents the pixel value of the mask at the (x, y) position.
Using masks
Finally, we can achieve the goal of retaining only the optimal grabbing workpiece by applying the mask to the first fused image. Specifically, the masked image can be calculated using the following formula: I'(x, y) = I(x, y) · M(x, y), where I(x, y) represents the pixel value of the first fused image at the (x, y) position, and I'(x, y) represents the pixel value of the masked image at the (x, y) position.
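A minimal NumPy sketch of the mask construction and the formula I'(x, y) = I(x, y) · M(x, y); the rectangular bounding-box corners are assumed to be given in pixel coordinates:

```python
import numpy as np

def apply_bbox_mask(image, x0, y0, x1, y1):
    """image: (H, W) or (H, W, C) first fused image; (x0, y0)-(x1, y1): bounding box."""
    mask = np.zeros(image.shape[:2], dtype=image.dtype)
    mask[y0:y1, x0:x1] = 1                         # M(x, y) = 1 inside the bounding box
    if image.ndim == 3:
        mask = mask[..., None]                     # broadcast over color channels
    return image * mask                            # I'(x, y) = I(x, y) * M(x, y)
```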
The first method of performing pose recognition on the optimal grabbing workpiece to obtain its pose is as follows:
First, the keypoint set K_{j*} corresponding to the optimal grabbing workpiece j* is found. For each keypoint in K_{j*}, its spatial coordinates (x_i, y_i, z_i) can be obtained. Then, principal component analysis (PCA) is performed on these coordinates to obtain the principal axis directions of the workpiece. Let u_1, u_2 and u_3 be the three principal axis directions obtained by the principal component analysis; the pose matrix R of the workpiece can then be calculated as:
R = [u_1 u_2 u_3];
Next, the centroid coordinates (x_c, y_c, z_c) of the workpiece are calculated as:
(x_c, y_c, z_c) = (1/N) · Σ_{i=1}^{N} (x_i, y_i, z_i),
where N is the size of the keypoint set K_{j*}.
Finally, the centroid coordinates and the pose matrix of the workpiece are combined into a homogeneous pose matrix T ∈ R^{4×4}, with R as its rotation block and (x_c, y_c, z_c) as its translation column.
The pose matrix represents the pose of the optimally grasped workpiece. The method is quick and efficient, and has small calculated amount.
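A sketch of this PCA-based pose computation, using NumPy only; the keypoint array is an assumed input:

```python
import numpy as np

def pca_pose(keypoints_xyz):
    """keypoints_xyz: (N, 3) keypoints of the optimal grabbing workpiece.
    Returns the 4x4 homogeneous pose matrix T built from the PCA axes and the centroid."""
    centroid = keypoints_xyz.mean(axis=0)                      # (x_c, y_c, z_c)
    centered = keypoints_xyz - centroid
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)    # rows of Vt are principal axes
    R = Vt.T                                                   # R = [u1 u2 u3]
    if np.linalg.det(R) < 0:                                   # keep a right-handed frame
        R[:, -1] *= -1
    T = np.eye(4)
    T[:3, :3] = R                                              # rotation block
    T[:3, 3] = centroid                                        # translation column
    return T
```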
The second method of performing pose recognition on the optimal grabbing workpiece to obtain its pose is as follows:
identifying an image to be grabbed by using a trained pose identification model to obtain the pose of the optimal grabbed workpiece in the image to be grabbed, wherein the steps of establishing and training the pose identification model specifically comprise the following steps:
data preparation
In order to train the pose recognition model, a large amount of workpiece image data including 2D images, depth maps and point cloud data under different angles, illumination conditions and occlusion conditions needs to be collected first. Meanwhile, the data are required to be marked, and the real pose of each workpiece is recorded;
Let the workpiece image dataset be D = {(I_i, D_i, P_i, T_i) | i = 1, ..., N}, wherein I_i represents the 2D image of the i-th sample, D_i represents the corresponding depth map, P_i represents the point cloud data, and T_i represents the real pose of the workpiece. The dataset D contains N samples;
data preprocessing
To enhance the generalization ability of the model, data enhancement, such as translation, rotation, scaling, flipping, etc., may be performed on the data set. Meanwhile, carrying out normalization processing on the 2D image, the depth map and the point cloud data to ensure that the numerical range is between 0 and 1;
model construction
And constructing a deep learning model for extracting features from the input 2D image, depth map and point cloud data and predicting the pose of the workpiece. The multi-mode fusion method can be adopted to fuse the characteristics of the 2D image, the depth map and the point cloud data so as to improve the prediction performance of the model;
Let the model be a function M(I_m, D_m, P_m; θ), wherein I_m, D_m and P_m respectively represent the input 2D image, depth map and point cloud data, and θ represents the parameters of the model. The model output is the predicted pose, denoted T̂_m;
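A hedged PyTorch sketch of one possible multi-modal fusion network M(I_m, D_m, P_m; θ); the layer sizes, the PointNet-style point branch and the flattened 4×4 pose output are illustrative assumptions, not the disclosed architecture:

```python
import torch
import torch.nn as nn

class PoseNet(nn.Module):
    """Fuses 2D image, depth map and point cloud features and regresses a 4x4 pose."""
    def __init__(self):
        super().__init__()
        self.img_branch = nn.Sequential(               # 2D image branch
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.depth_branch = nn.Sequential(             # depth map branch
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.point_branch = nn.Sequential(             # shared MLP over points, then max-pool
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten())
        self.head = nn.Sequential(                     # fused features -> flattened 4x4 pose
            nn.Linear(32 + 16 + 128, 128), nn.ReLU(),
            nn.Linear(128, 16))

    def forward(self, image, depth, points):
        # image: (B,3,H,W), depth: (B,1,H,W), points: (B,3,N)
        feats = torch.cat([self.img_branch(image),
                           self.depth_branch(depth),
                           self.point_branch(points)], dim=1)
        return self.head(feats).view(-1, 4, 4)         # predicted pose
```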
Loss function
To evaluate the predictive performance of the model, a loss function L(T̂, T) needs to be defined for measuring the difference between the predicted pose T̂ and the true pose T. Here the mean square error (MSE) can be used as the loss function: L(θ) = (1/N) · Σ_{i=1}^{N} ||M(I_i, D_i, P_i; θ) − T_i||²;
Model training
A stochastic gradient descent (SGD) or other optimization algorithm is used to minimize the loss function and thereby update the model parameters θ. During each iteration, a small batch of samples is randomly drawn from the dataset D. Then the gradient of the loss function L with respect to the model parameters θ is calculated, and the parameters are updated as: θ ← θ − ζ · ∇_θ L;
where ζ represents the learning rate, the step size used to control the parameter update.
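A short training-loop sketch of the update θ ← θ − ζ·∇_θ L with the MSE loss, assuming the PoseNet sketch above and a DataLoader that yields (image, depth, points, true_pose) mini-batches; both assumptions are illustrative:

```python
import torch

def train(model, loader, epochs=10, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)    # lr plays the role of ζ
    loss_fn = torch.nn.MSELoss()                              # mean square error on the pose
    for _ in range(epochs):
        for image, depth, points, true_pose in loader:        # random mini-batches from D
            pred_pose = model(image, depth, points)
            loss = loss_fn(pred_pose, true_pose)
            optimizer.zero_grad()
            loss.backward()                                   # gradient of L w.r.t. θ
            optimizer.step()                                  # θ ← θ - ζ * ∇θ L
```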
Model verification and tuning
During training, a validation set is required to evaluate the generalization performance of the model. Training may be stopped when the performance of the model on the validation set reaches expectations. In addition, the prediction performance of the model can be improved by adjusting super parameters such as a model structure, a loss function, an optimization algorithm and the like.
Model testing
After training is completed, a separate test set is used to evaluate the predictive performance of the model. The samples in the test set should not be repeated with the samples in the training set and the validation set to ensure reliability of the evaluation results.
This approach provides higher accuracy in identifying the pose of the optimal grabbing workpiece.
A second aspect of the present invention provides a computer readable storage medium, where the computer readable storage medium stores program instructions, where the program instructions are executed to perform a method for controlling robot 3D laser vision disorder grabbing as described above.
A third aspect of the present invention provides a robotic 3D laser vision chaotic grasping control system, comprising the computer readable storage medium described above.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. The robot 3D laser vision disordered grabbing control method is characterized by comprising the following steps of:
s10, acquiring a three-dimensional diagram of a robot, a clamp and a material frame;
s20, comparing the three-dimensional model of the imported workpiece with a 3D image shot by a camera, and identifying the pose of the workpiece;
s30, acquiring a pose conversion relation between the robot and the 3D camera, and completing hand-eye calibration between the robot and the 3D camera;
s40, acquiring a 2D image, a depth image and a point cloud view of a target workpiece stack obtained by a 3D camera, and forming a workpiece stack fusion image;
s50, calculating a workpiece pile fusion image according to a pre-trained unordered grabbing model to obtain an optimal grabbing workpiece and a pose thereof;
S60, controlling a robot to grasp the workpiece according to the optimal grasping workpiece and the pose thereof;
wherein the optimal grabbing workpiece is the workpiece whose grabbing has the least influence on the pose of the other workpieces.
2. The method for controlling disordered grabbing of 3D laser vision of a robot according to claim 1, wherein the step of comparing the three-dimensional model of the introduced workpiece with the 3D image shot by the camera to identify the pose of the workpiece specifically comprises the steps of:
preprocessing a three-dimensional model of a workpiece to obtain a point cloud representation;
extracting the features of the 3D image shot by the camera and the imported workpiece three-dimensional model by adopting a PFH or FPFH algorithm;
performing feature matching, and adopting nearest neighbor searching and a random sample consistency algorithm;
estimating a pose transformation matrix between the feature matching point pairs;
and applying the estimated pose transformation matrix to the imported workpiece three-dimensional model to obtain the pose of the workpiece three-dimensional model in the 3D image shot by the camera.
3. The method for controlling disordered grabbing of 3D laser vision of a robot according to claim 1, wherein the step of obtaining the pose conversion relationship between the robot and the 3D camera and completing the hand-eye calibration between the robot and the 3D camera comprises the following steps:
Setting parameters of a 3D camera and a calibration plate;
controlling the 3D camera to shoot and collect calibration plate data of different poses;
calculating a calibration result according to the added calibration point column;
and optimizing and reducing errors of the calculation result according to the calibration precision, and finally obtaining the pose conversion relation between the robot and the 3D camera to finish the hand-eye calibration between the robot and the 3D camera.
4. The method for controlling disordered grabbing of 3D laser vision of a robot according to claim 1, wherein the step of acquiring the 2D map, the depth map and the point cloud view of the target workpiece stack obtained by the 3D camera and forming the workpiece stack fusion image specifically comprises the following steps:
acquiring a 2D image, a depth image and a point cloud view of a target workpiece stack obtained by a 3D camera;
registering the 2D image and the depth image to obtain a depth image corresponding to the 2D image;
registering the point cloud data with the 2D image, and endowing color information to the point cloud data;
and carrying out downsampling and filtering processing on the fused point cloud data so as to reduce data quantity and noise.
5. The method for controlling the disordered grabbing of the 3D laser vision of the robot according to claim 1, wherein the steps of establishing and training the disordered grabbing model specifically comprise the following steps:
Constructing a training sample comprising a plurality of pairs of workpiece pile images; the workpiece pile image pair comprises a workpiece pile fusion image, which is marked as a first image, and a workpiece pile fusion image after one workpiece is randomly taken out in the first image, which is marked as a second image;
and establishing an unordered grabbing model prototype by using the convolutional neural network, and training by using a training sample to obtain an unordered grabbing model.
6. The method for controlling disordered grabbing of 3D laser vision of a robot according to claim 1, wherein the step of controlling the robot to grab the workpiece according to the optimal grabbing of the workpiece and the pose thereof specifically comprises the following steps:
step 1: recording the workpiece stack fusion image as a first fusion image;
step 2: controlling the robot to grasp the optimal grasping workpiece in the first fusion image;
step 3: after the optimal grabbing of the workpieces is completed, acquiring a second fusion image of the workpiece stack at the moment;
step 4: calculating the similarity of the second fusion image and the first fusion image, if the similarity is greater than or equal to a grabbing threshold value, deleting the image obtained by optimally grabbing the workpiece from the first fusion image as the first fusion image, and calculating the optimally grabbing workpiece of the first fusion image;
Step 5: and (3) iteratively executing the steps 2-4 until the similarity is smaller than the grabbing threshold, taking the current image as a first fusion image, calculating the optimal grabbing workpiece of the first fusion image, and iteratively executing the steps 2-5.
7. The method for controlling disordered grabbing of 3D laser vision of a robot according to claim 6, wherein in step 5, if the loop of step 2-4 is performed iteratively more than 10-20 times, the current image is used as the first fused image, the optimal grabbing workpiece of the first fused image is calculated, and steps 2-5 are performed iteratively.
8. The method for controlling disordered grabbing of the 3D laser vision of the robot according to claim 6, wherein the step 2 controls the robot to grab the optimal grabbing workpiece in the first fused image, and the method further comprises the step of accurately determining the pose of the optimal grabbing workpiece, specifically comprising:
deleting the part except the optimal grabbing workpiece in the first fusion image, and only reserving the optimal grabbing workpiece as an image to be grabbed;
and carrying out pose recognition on the optimal grabbing workpiece to obtain the pose of the optimal grabbing workpiece.
9. A computer readable storage medium, wherein program instructions are stored in the computer readable storage medium, and the program instructions are used for executing a robot 3D laser vision disorder grabbing control method according to any one of claims 1-8 when running.
10. A robotic 3D laser vision disorder capture control system comprising the computer-readable storage medium of claim 9.