CN111906782B - Intelligent robot grabbing method based on three-dimensional vision - Google Patents


Info

Publication number
CN111906782B
Authority
CN
China
Prior art keywords
grabbing
point
network
grasping
input
Prior art date
Legal status
Active
Application number
CN202010652696.1A
Other languages
Chinese (zh)
Other versions
CN111906782A (en)
Inventor
Lan Xuguang (兰旭光)
Zhao Binglei (赵冰蕾)
Zhang Hanbo (张翰博)
Zheng Nanning (郑南宁)
Current Assignee
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202010652696.1A
Publication of CN111906782A
Application granted
Publication of CN111906782B
Legal status: Active
Anticipated expiration

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1694 Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697 Vision controlled systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30164 Workpiece; Machine component

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent robot grasping method based on three-dimensional vision. Taking an observation point cloud containing a target object as input, a point grasp confidence evaluation network built on a deep convolutional network evaluates the grasping confidence of each point in the observation point cloud to obtain points suitable for serving as grasp centers. Taking the features of the grasp regions as input, a regional grasp detection network based on a grasp anchor mechanism detects object grasps. Taking the fused features of the gripper closing region and the grasp region as input, a grasp refinement network refines the detected grasps. The grasp with the highest grasp quality score is selected, and its position and orientation in the robot coordinate system are obtained through a coordinate transformation so as to plan the gripper pose of the robot. The invention enables the robot to accurately grasp different kinds of objects in unstructured environments, improving the safety and reliability of intelligent robot manipulation and interaction with the outside world.

Description

Intelligent robot grabbing method based on three-dimensional vision
Technical Field
The invention belongs to the field of computer vision and intelligent robots, and particularly relates to an intelligent robot grabbing method based on three-dimensional vision.
Background
Intelligent robot grasping plays a vital role in robot manipulation and interaction with the outside world. Current machine-vision-based robot grasping algorithms fall mainly into model-based and data-driven approaches. Compared with traditional model-based grasping methods, data-driven methods can, in principle, grasp objects in unstructured environments. However, owing to the uncertainty caused by the varied shapes and structures of different objects and by sensor noise, reliably grasping different kinds of objects in an unstructured environment remains a difficult task. Most existing data-driven grasping methods generate a rectangular grasp region from RGB, depth, or RGB-D images, which oversimplifies the pose of a parallel-jaw gripper in three-dimensional space; such methods neglect the geometric information of the object surface and grasp quality metrics, and therefore have difficulty finding the optimal grasp. A three-dimensional-vision-based grasping method can learn a more robust representation of the robot grasp from the point cloud; compared with the rectangular grasp representation, a grasp representation in three-dimensional space can more accurately describe gripper poses with high grasp quality. More accurate grasp detection in three-dimensional space in turn enables more reliable object grasping. Therefore, how to accurately detect grasps of high quality in three-dimensional space, and thus obtain a robot grasping algorithm that can reliably grasp different kinds of objects in unstructured environments, is a prominent open problem.
Disclosure of Invention
The invention aims to overcome the above defects by providing an intelligent robot grasping method based on three-dimensional vision that can accurately detect grasps of high quality in three-dimensional space, so that the robot can reliably grasp different kinds of objects in an unstructured environment, thereby ensuring the reliability and safety of robot manipulation and interaction with the outside world.
To achieve this purpose, the invention adopts the following technical scheme:
An intelligent robot grasping method based on three-dimensional vision comprises the following steps:
taking an observation point cloud containing a target object as input, and evaluating the grasping confidence of each point in the observation point cloud with a point grasp confidence evaluation network built on a deep convolutional network, to obtain points suitable for serving as grasp centers;
taking the features of grasp regions as input, and detecting object grasps with a regional grasp detection network based on a grasp anchor mechanism;
taking the fused features of the gripper closing region and the grasp region as input, and refining the detected grasps with a grasp refinement network to obtain refined grasps;
and selecting the grasp with the highest grasp quality score, and obtaining its position and orientation in the robot coordinate system through a coordinate transformation, so as to plan the gripper pose of the robot and complete the robot operation.
As a further improvement of the invention, the observation point cloud of the current scene is acquired by a Kinect sensor.
As a further improvement of the invention, the grasp confidence evaluation network is established as follows:
the input observation point cloud is fed into a trained point cloud feature extraction network; the training loss is the classification loss of the grasping confidence of each point, and the network parameters are optimized by minimizing this loss function to obtain the grasp confidence evaluation network model.
As a further improvement of the invention, the specific process for obtaining the points suitable for serving as grasp centers is as follows:
the trained point cloud feature extraction network encodes the entire input point cloud into group features, and distance-based interpolation decodes the group features back into point-wise features, completing the feature extraction of the input point cloud; the extracted per-point features are then segmented by a trained binary segmentation network, and every point predicted as a positive class is suitable for serving as a grasp center.
As a further improvement of the invention, the object grasp detection predicts the corresponding grasps through the regional grasp detection network, using the points predicted as suitable grasp centers as regression centers.
As a further improvement of the invention, the specific process of predicting the corresponding grasps is as follows:
selecting, by farthest point sampling, k1 regression points that cover as many different structures as possible, to obtain k1 grasp regions; introducing grasp references with preset grasp directions in each grasp region; taking the features of each grasp region as input, regressing the deviations of the grasp position, direction, and angle relative to the preset grasp references through a max pooling layer and a multilayer perceptron; and combining the regressed deviations with the classification results of the preset references to obtain k1 detected grasp proposals.
As a further improvement of the invention, the classification and regression of the grasp references first require matching the labeled grasps with the preset grasp references; when the difference between the grasp directions of a preset reference and a labeled grasp is smaller than a specified threshold, the match succeeds, the matched preset reference is assigned a positive label, and the residual to be regressed is the difference between the labeled grasp and the preset reference corresponding to the positive label. The training loss comprises the classification loss of the preset grasp references and the regression loss of the deviations of the grasp position, direction, and angle relative to the preset references. The network is trained by minimizing this loss function to obtain the model of the regional grasp detection network.
As a further improvement of the invention, the refinement yields more accurate grasps; the specific process is as follows:
from the obtained grasp proposals, the k2 proposals whose corresponding gripper closing regions contain more than a certain number of points are selected for refinement; the points in each selected closing region are converted from the world coordinate system to the grasp coordinate system by a coordinate transformation, and the features of the k2 closing regions are obtained through a multilayer perceptron; the closing-region features are fused with the grasp-region features to obtain the corresponding fused features; taking the fused features as input, the deviations of the grasp position, direction, and angle relative to the grasp proposals are regressed through a max pooling layer and a multilayer perceptron, finally yielding k2 refined predicted grasps.
As a further improvement of the invention, preferably, the fused features of the grasp-region features and the gripper closing-region features are taken as input and fed into a max pooling layer and a multilayer perceptron; during training, the k2 predicted grasp proposals are first classified, and a predicted proposal is assigned a positive label if it is close to the corresponding labeled grasp; when regressing the grasp deviations, only the position, direction, and angle of the proposals assigned positive labels are optimized.
As a further improvement of the invention, a proposal is considered close to a labeled grasp when the differences in grasp direction and grasp angle are less than 2π/9 and π/3, respectively.
Compared with the prior art, the invention has the following advantages:
the intelligent robot grabbing method based on the three-dimensional vision is based on observation point cloud, grabbing parts with high grabbing quality indexes can be detected by providing a grabbing area, a grabbing anchor point mechanism and a grabbing part optimization network, so that the precision of three-dimensional grabbing part detection is greatly improved, meanwhile, due to the arrangement of point cloud feature extraction network parameter sharing, the forward propagation time of the whole network is not increased, and the real-time performance of the algorithm is improved.
Based on a deep learning algorithm, the point capture confidence evaluation network, the area capture part detection network and the capture part optimization network extract network parameters by sharing point cloud characteristics, so that the real-time performance of the algorithm is improved; the grabbing position with high grabbing quality index can be detected by using the grabbing area, the grabbing anchor point mechanism and the grabbing position optimization network, and the detection precision of the three-dimensional grabbing position is improved. Based on the three-dimensional point cloud observed by the depth camera, the invention can ensure that the robot can accurately grab different types of objects in an unstructured environment so as to improve the safety and reliability of the intelligent robot operation and the interaction with the outside.
Drawings
The conception, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its objects, features, and effects can be fully understood.
FIG. 1 is a process framework diagram of the present invention;
FIG. 2 is a schematic diagram of the point grasp confidence evaluation network of the present invention;
FIG. 3 is a schematic diagram of the regional grasp detection network of the present invention;
FIG. 4 is a schematic diagram of the grasp refinement network of the present invention;
FIG. 5 is a schematic diagram of visualized grasp detection results.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without inventive effort shall fall within the protection scope of the present invention.
The invention discloses an intelligent robot grasping method based on three-dimensional vision, which comprises the following steps:
Taking an observation point cloud containing the target object as input, the grasping confidence of each point in the observation point cloud is evaluated with a point grasp confidence evaluation network built on a deep convolutional network, to obtain points suitable for serving as grasp centers. Taking the features of the grasp regions as input, object grasps are detected with a regional grasp detection network based on a grasp anchor mechanism.
Taking the fused features of the gripper closing region and the grasp region as input, the detected grasps are refined with a grasp refinement network to obtain more accurate grasps.
The grasp with the highest grasp quality score is selected, and its position and orientation in the robot coordinate system are obtained through a coordinate transformation, so as to plan the gripper pose of the robot and complete the robot operation.
As shown in FIG. 1, the intelligent robot grasping method based on three-dimensional vision comprises the following steps:
Step one: acquire an observation point cloud P of the current scene with a Kinect sensor;
Step two: evaluate the grasping confidence of each point in the observation point cloud P of the current scene with the point grasp confidence evaluation network, so as to obtain points suitable for serving as grasp centers.
As shown in FIG. 2, the specific process is as follows:
The trained point cloud feature extraction network encodes the entire input point cloud into group features, and distance-based interpolation decodes the group features back into point-wise features, completing the feature extraction of the input point cloud. The extracted per-point features are then segmented by a trained binary segmentation network, and every point predicted as a positive class is suitable for serving as a grasp center;
Preferably, the input observation point cloud P is fed into the trained point cloud feature extraction network. To train the point grasp confidence evaluation network, an index of point grasp confidence is added to the generated grasp dataset; intuitively, it represents the density of grasps with high quality scores near each point of the observation point cloud. The training loss is the classification loss of the grasping confidence of each point, and the network parameters are optimized by minimizing this loss function to obtain the grasp confidence evaluation network model.
Step three: predict the corresponding grasps through the regional grasp detection network, using the points predicted in step two as suitable grasp centers as the regression centers.
As shown in FIG. 3, the specific process is as follows: since an object admits multiple grasps, not every positive point from step two needs to serve as a regression point; instead, k1 regression points covering as many different structures as possible are selected by farthest point sampling, giving k1 grasp regions (spherical regions centered on these points); grasp references with preset grasp directions are introduced in each grasp region; taking the features of each grasp region as input, the deviations of the grasp position, direction, and angle relative to the preset grasp references are regressed through a max pooling layer and a multilayer perceptron; and the regressed deviations are combined with the classification results of the preset references to obtain k1 detected grasp proposals.
Preferably, k1 points are selected as regression points from the positive points obtained by the binary classification in step two, and the grasp-region features are the point cloud features extracted from the grasp regions centered on these k1 points. The grasp-region features are taken as input, fed into a max pooling layer and a multilayer perceptron, and k1 grasp proposals are output.
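A sketch of how the k1 regression centers, their spherical grasp regions, and the region head might be built is given below; the farthest point sampling loop, the region radius, the per-region sample count, the number of preset references per region, and the head sizes are illustrative assumptions, not values taken from the patent.

```python
# Minimal sketch (illustrative): pick k1 regression centers with farthest
# point sampling among the predicted positive points, group a spherical
# grasp region around each center, and score/regress preset references
# with a max-pool + MLP head.
import torch

def farthest_point_sampling(points, k):          # points: (M, 3)
    idx = torch.zeros(k, dtype=torch.long)
    dist = torch.full((points.shape[0],), float('inf'))
    idx[0] = torch.randint(points.shape[0], (1,)).item()
    for i in range(1, k):
        d = ((points - points[idx[i - 1]]) ** 2).sum(dim=1)
        dist = torch.minimum(dist, d)
        idx[i] = torch.argmax(dist)              # farthest from the chosen set
    return idx

def group_grasp_region(points, centers, radius=0.04, nsample=64):
    # returns (k, nsample, 3) center-relative region points
    regions = []
    for c in centers:
        d = ((points - c) ** 2).sum(dim=1).sqrt()
        inside = torch.nonzero(d < radius).squeeze(1)
        if inside.numel() == 0:
            inside = torch.zeros(1, dtype=torch.long)
        pick = inside[torch.randint(inside.numel(), (nsample,))]
        regions.append(points[pick] - c)
    return torch.stack(regions)

class RegionGraspHead(torch.nn.Module):
    """Max-pool + MLP head scoring A preset grasp references per region and
    regressing their position/direction/angle residuals (sizes assumed)."""
    def __init__(self, num_anchors=8, feat_dim=128):
        super().__init__()
        self.point_mlp = torch.nn.Sequential(
            torch.nn.Conv1d(3, feat_dim, 1), torch.nn.ReLU(),
            torch.nn.Conv1d(feat_dim, feat_dim, 1), torch.nn.ReLU())
        # per anchor: 1 classification logit + 3 pos + 3 dir + 1 angle residual
        self.fc = torch.nn.Linear(feat_dim, num_anchors * 8)

    def forward(self, region_pts):               # (k, nsample, 3)
        f = self.point_mlp(region_pts.transpose(1, 2))   # (k, C, nsample)
        g = f.max(dim=2).values                  # region feature by max pooling
        return self.fc(g).view(region_pts.shape[0], -1, 8)

positive_pts = torch.rand(500, 3)                # points classified positive
centers = positive_pts[farthest_point_sampling(positive_pts, k=16)]
regions = group_grasp_region(positive_pts, centers)   # (16, 64, 3)
proposals = RegionGraspHead()(regions)                 # (16, 8, 8) raw outputs
```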
As a preferred embodiment, to achieve higher grasp localization accuracy, classification and regression based on grasp references are adopted in the present application instead of regressing the grasps directly. Classification and regression based on the grasp references first require matching the labeled grasps with the preset grasp references. When the difference between the grasp directions of a preset reference and a labeled grasp is smaller than a specified threshold, the match succeeds, the matched preset reference is assigned a positive label, and the residual to be regressed is the difference between the labeled grasp and the preset reference corresponding to the positive label. The training loss comprises the classification loss of the preset grasp references and the regression loss of the deviations of the grasp position, direction, and angle relative to the preset references. The network is trained by minimizing this loss function to obtain the model of the regional grasp detection network. In this step, the detection precision is improved by extracting grasp-region features and by the proposed grasp anchor mechanism; compared with direct regression from single-point features, the precision is improved by 5.79% on the grasp dataset constructed in this application.
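The anchor-matching step can be illustrated as follows. The number of preset references per region, their directions, the exact form of the residuals, and the angular threshold are assumptions, since the patent only states that a match succeeds when the direction difference is below a specified threshold.

```python
# Minimal sketch of grasp-anchor matching for training the regional grasp
# detection network: a labeled grasp is assigned to preset references whose
# approach direction is close enough, and the regression target is the
# residual to that reference.  Threshold and residual layout are assumed.
import math
import torch

def angle_between(a, b):                        # a, b: (..., 3) unit vectors
    cos = (a * b).sum(dim=-1).clamp(-1.0, 1.0)
    return torch.acos(cos)

def match_anchors(anchor_dirs, gt_pos, gt_dir, gt_angle, center,
                  dir_threshold=math.pi / 6):
    """anchor_dirs: (A, 3) preset approach directions of one grasp region.
    Returns per-anchor positive labels and residual regression targets."""
    ang = angle_between(anchor_dirs, gt_dir.unsqueeze(0))     # (A,)
    labels = (ang < dir_threshold).float()                    # positive anchors
    # residuals: position offset w.r.t. the region center, direction offset
    # w.r.t. the anchor direction, and the in-plane rotation angle itself
    pos_res = (gt_pos - center).unsqueeze(0).expand(len(anchor_dirs), 3)
    dir_res = gt_dir.unsqueeze(0) - anchor_dirs
    ang_res = torch.full((len(anchor_dirs),), gt_angle)
    return labels, pos_res, dir_res, ang_res

anchors = torch.nn.functional.normalize(torch.randn(8, 3), dim=1)
labels, pos_res, dir_res, ang_res = match_anchors(
    anchors, gt_pos=torch.rand(3), gt_dir=torch.tensor([0.0, 0.0, 1.0]),
    gt_angle=0.3, center=torch.rand(3))
# training would combine a classification loss over `labels` with a
# regression loss on the residuals of the positively labeled anchors only
```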
Step four: refine the grasps predicted in step three through the grasp refinement network to obtain more accurate grasps.
As shown in FIG. 4, the specific process is as follows:
Compared with the grasp region, the information contained in the gripper closing region of a grasp proposal obtained in step three is closer to the real grasp; from the obtained grasp proposals, the k2 proposals whose corresponding gripper closing regions contain more than 50 points are selected for refinement; the points in each selected closing region are converted from the world coordinate system to the grasp coordinate system by a coordinate transformation, and the features of the k2 closing regions are obtained through a multilayer perceptron; the closing-region features are fused with the grasp-region features to obtain the corresponding fused features; taking the fused features as input, the deviations of the grasp position, direction, and angle relative to the proposals obtained in step three are regressed through a max pooling layer and a multilayer perceptron, finally yielding k2 refined predicted grasps.
Preferably, the fused features of the grasp-region features and the gripper closing-region features are taken as input and fed into a max pooling layer and a multilayer perceptron. During training, the application first classifies the k2 predicted grasp proposals; a predicted proposal is assigned a positive label if it is close to the corresponding labeled grasp (the differences in grasp direction and angle are less than 2π/9 and π/3, respectively). When regressing the grasp deviations, only the position, direction, and angle of the proposals assigned positive labels are optimized.
The final loss function thus consists of the classification loss of the grasp proposals and the regression loss of the positively labeled proposals. Training of the network is completed by minimizing this loss function with stochastic gradient descent. The grasp refinement network improves performance by 1.11% on the grasp dataset constructed in this application.
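A sketch of the positive-label test used here, with the 2π/9 and π/3 thresholds stated above; the function name and tensor shapes are illustrative.

```python
# Minimal sketch of the positive-label test for training the grasp
# refinement network: a predicted proposal counts as "close" to a labeled
# grasp when its direction difference is below 2*pi/9 and its angle
# difference is below pi/3 (thresholds taken from the text).
import math
import torch

def is_close_grasp(pred_dir, pred_angle, gt_dir, gt_angle):
    cos = torch.dot(pred_dir, gt_dir) / (pred_dir.norm() * gt_dir.norm())
    dir_diff = torch.acos(cos.clamp(-1.0, 1.0))
    ang_diff = torch.abs(pred_angle - gt_angle)
    return bool(dir_diff < 2 * math.pi / 9) and bool(ang_diff < math.pi / 3)

print(is_close_grasp(torch.tensor([0.0, 0.0, 1.0]), torch.tensor(0.2),
                     torch.tensor([0.0, 0.1, 1.0]), torch.tensor(0.5)))  # True
# refinement loss = proposal classification loss + regression loss on the
# position/direction/angle residuals of positively labeled proposals only
```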
Step five: from the predicted grasps output by the grasp refinement network, select the grasp with the highest grasp quality score, and obtain its position and orientation in the robot coordinate system through a coordinate transformation, so as to plan the gripper pose for the robot operation and complete the manipulation task.
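As a sketch, this final coordinate transformation can be carried out with a homogeneous transform obtained from hand-eye calibration; the calibration matrix T_robot_camera below is assumed to be known and is not part of the patent's disclosure.

```python
# Minimal sketch (hand-eye calibration matrix assumed known): map the
# selected grasp's position and orientation from the camera/world frame
# into the robot base frame before planning the gripper pose.
import numpy as np

def grasp_to_robot_frame(T_robot_camera, grasp_pos, grasp_rot):
    """T_robot_camera: 4x4 homogeneous transform, camera frame -> robot base.
    grasp_pos: (3,) position, grasp_rot: (3, 3) orientation in camera frame."""
    T_grasp = np.eye(4)
    T_grasp[:3, :3] = grasp_rot
    T_grasp[:3, 3] = grasp_pos
    T_out = T_robot_camera @ T_grasp
    return T_out[:3, 3], T_out[:3, :3]          # position, orientation in robot frame

T = np.eye(4)                                    # identity stands in for a real calibration
pos, rot = grasp_to_robot_frame(T, np.array([0.1, 0.0, 0.5]), np.eye(3))
```

Any equivalent pose representation (for example quaternions) would serve equally well; the method only requires that the chosen grasp end up expressed in the robot coordinate system before gripper motion planning.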
Supplemental simulation example
Step six: the performance of the method on the three-dimensional grasp detection task is evaluated with the effective grasp ratio as the metric. The effective grasp ratio is computed as follows: first, all positively labeled grasps output by the grasp refinement network are collected, and their number is denoted k; these grasps are then converted to the object coordinate system according to the mapping between the object coordinate system in the dataset and the world coordinate system, and the grasp quality score of each converted grasp on the corresponding object is computed; the number kT of positive grasps is counted according to a set grasp quality score threshold; the effective grasp ratio is the ratio of kT to k.
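The effective grasp ratio itself reduces to a simple count, sketched below; the quality threshold value is a placeholder, since the text only refers to "a set grasp quality score threshold".

```python
# Minimal sketch of the effective grasp ratio: among the k positively
# labeled grasps output by the refinement network, count those whose quality
# score on the object (after mapping into the object frame) exceeds a chosen
# threshold, and report kT / k.
def effective_grasp_ratio(quality_scores, threshold=0.5):
    """quality_scores: grasp-quality scores of the k output grasps, already
    computed in the object coordinate system; the threshold is an assumed value."""
    k = len(quality_scores)
    if k == 0:
        return 0.0
    k_t = sum(1 for q in quality_scores if q >= threshold)
    return k_t / k

print(effective_grasp_ratio([0.9, 0.7, 0.3, 0.8], threshold=0.5))  # -> 0.75
```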
The effective grasp ratio of the invention on the test set reaches 92.47%, far exceeding previous grasp detection algorithms in three-dimensional space. The grasp detection performance is reported separately for objects that appear in the training set and objects that do not; the results are shown in Table 1.
TABLE 1
Objects in the training set      Mustard bottle   Gelatin box   Banana      Peach
Effective grasp ratio            94.41%           99.30%        99.92%      87.28%
Objects not in the training set  Candy box        Pudding box   Golf ball   Hammer
Effective grasp ratio            87.06%           97.45%        85.76%      72.44%
FIG. 5 visualizes the grasp detection results for these objects: the objects in the first row appear in the training set, the objects in the second row do not, blue marks positive grasps, and red marks negative grasps.
In conclusion, based on a deep learning algorithm, the point grasp confidence evaluation network, the regional grasp detection network, and the grasp refinement network share the parameters of the point cloud feature extraction network, which improves the real-time performance of the algorithm; the grasp regions, the grasp anchor mechanism, and the grasp refinement network allow grasps with high quality scores to be detected, which improves the precision of three-dimensional grasp detection. Based on the three-dimensional point cloud observed by a depth camera, the invention enables the robot to accurately grasp different kinds of objects in an unstructured environment, improving the safety and reliability of intelligent robot manipulation and interaction with the outside world.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in such a combination, it should be considered within the scope of this specification.
The above examples only show some embodiments of the present invention, and their description is relatively specific and detailed, but this should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the present teachings should, therefore, be determined not with reference to the above description, but with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. The disclosures of all articles and references, including patent applications and publications, are incorporated herein by reference for all purposes. The omission from the foregoing claims of any aspect of subject matter disclosed herein is not a disclaimer of such subject matter, nor should it be regarded that the applicant did not consider such subject matter to be part of the disclosed subject matter.

Claims (8)

1. An intelligent robot grasping method based on three-dimensional vision, characterized by comprising the following steps:
taking an observation point cloud containing a target object as input, and evaluating the grasping confidence of each point in the observation point cloud with a point grasp confidence evaluation network built on a deep convolutional network, to obtain points suitable for serving as grasp centers;
wherein the grasp confidence evaluation network is established as follows:
feeding the input observation point cloud into a trained point cloud feature extraction network, wherein the training loss is the classification loss of the grasping confidence of each point, and the network parameters are optimized by minimizing this loss function to obtain the grasp confidence evaluation network model;
and the points suitable for serving as grasp centers are obtained by the following specific process:
encoding the entire input point cloud into group features through the trained point cloud feature extraction network, and decoding the group features into point-wise features by distance-based interpolation, thereby completing the feature extraction of the input point cloud; then segmenting the extracted per-point features through a trained binary segmentation network, each point predicted as a positive class being suitable for serving as a grasp center;
taking the features of grasp regions as input, and detecting object grasps with a regional grasp detection network based on a grasp anchor mechanism;
taking the fused features of the gripper closing region and the grasp region as input, and refining the detected grasps with a grasp refinement network to obtain refined grasps;
and selecting the grasp with the highest grasp quality score, and obtaining its position and orientation in the robot coordinate system through a coordinate transformation, so as to plan the gripper pose of the robot and complete the robot operation.
2. The method of claim 1,
the observation point cloud of the current scene is acquired by a Kinect sensor.
3. The method of claim 1,
the object grasp detection predicts the corresponding grasps through the regional grasp detection network, using the points predicted as suitable grasp centers as regression centers.
4. The method of claim 3,
the specific process of predicting the corresponding grasps is as follows:
selecting, by farthest point sampling, k1 regression points that cover as many different structures as possible, to obtain k1 grasp regions; introducing grasp references with preset grasp directions in each grasp region; taking the features of each grasp region as input, regressing the deviations of the grasp position, direction, and angle relative to the preset grasp references through a max pooling layer and a multilayer perceptron; and combining the regressed deviations with the classification results of the preset references to obtain k1 detected grasp proposals.
5. The method of claim 4,
the classification and regression of the grasp references first require matching the labeled grasps with the preset grasp references; when the difference between the grasp directions of a preset reference and a labeled grasp is smaller than a specified threshold, the match succeeds, the matched preset reference is assigned a positive label, and the residual to be regressed is the difference between the labeled grasp and the preset reference corresponding to the positive label; the training loss comprises the classification loss of the preset grasp references and the regression loss of the deviations of the grasp position, direction, and angle relative to the preset references; and the network is trained by minimizing this loss function to obtain the model of the regional grasp detection network.
6. The method of claim 1,
the refinement yields more accurate grasps by the following specific process:
selecting, from the obtained grasp proposals, the k2 proposals whose corresponding gripper closing regions contain more than a certain number of points for refinement; converting the points in each selected closing region from the world coordinate system to the grasp coordinate system by a coordinate transformation, and obtaining the features of the k2 closing regions through a multilayer perceptron; fusing the closing-region features with the grasp-region features to obtain the corresponding fused features; and taking the fused features as input, regressing the deviations of the grasp position, direction, and angle relative to the grasp proposals through a max pooling layer and a multilayer perceptron, to finally obtain k2 refined predicted grasps.
7. The method of claim 6,
the fused features of the grasp-region features and the gripper closing-region features are taken as input and fed into a max pooling layer and a multilayer perceptron; during training, the k2 predicted grasp proposals are first classified, and a predicted proposal is assigned a positive label if it is close to the corresponding labeled grasp; and when regressing the grasp deviations, only the position, direction, and angle of the proposals assigned positive labels are optimized.
8. The method of claim 7,
a proposal is close to a labeled grasp when the differences in grasp direction and grasp angle are less than 2π/9 and π/3, respectively.
CN202010652696.1A 2020-07-08 2020-07-08 Intelligent robot grabbing method based on three-dimensional vision Active CN111906782B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010652696.1A CN111906782B (en) 2020-07-08 2020-07-08 Intelligent robot grabbing method based on three-dimensional vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010652696.1A CN111906782B (en) 2020-07-08 2020-07-08 Intelligent robot grabbing method based on three-dimensional vision

Publications (2)

Publication Number Publication Date
CN111906782A CN111906782A (en) 2020-11-10
CN111906782B true CN111906782B (en) 2021-07-13

Family

ID=73227691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010652696.1A Active CN111906782B (en) 2020-07-08 2020-07-08 Intelligent robot grabbing method based on three-dimensional vision

Country Status (1)

Country Link
CN (1) CN111906782B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112297013B (en) * 2020-11-11 2022-02-18 浙江大学 Robot intelligent grabbing method based on digital twin and deep neural network
CN113065392A (en) * 2021-02-24 2021-07-02 苏州盈科电子有限公司 Robot tracking method and device
CN113674348B (en) * 2021-05-28 2024-03-15 中国科学院自动化研究所 Object grabbing method, device and system
CN115249333B (en) * 2021-06-29 2023-07-11 达闼科技(北京)有限公司 Grabbing network training method, grabbing network training system, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108247635A (en) * 2018-01-15 2018-07-06 北京化工大学 A kind of method of the robot crawl object of deep vision
CN109079786A (en) * 2018-08-17 2018-12-25 上海非夕机器人科技有限公司 Mechanical arm grabs self-learning method and equipment
CN109159113A (en) * 2018-08-14 2019-01-08 西安交通大学 A kind of robot manipulating task method of view-based access control model reasoning
CN109919151A (en) * 2019-01-30 2019-06-21 西安交通大学 A kind of robot vision reasoning grasping means based on ad-hoc network
CN110211180A (en) * 2019-05-16 2019-09-06 西安理工大学 A kind of autonomous grasping means of mechanical arm based on deep learning
CN108021891B (en) * 2017-12-05 2020-04-14 广州大学 Vehicle environment identification method and system based on combination of deep learning and traditional algorithm

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6983524B2 (en) * 2017-03-24 2021-12-17 キヤノン株式会社 Information processing equipment, information processing methods and programs
JP6937995B2 (en) * 2018-04-05 2021-09-22 オムロン株式会社 Object recognition processing device and method, and object picking device and method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021891B (en) * 2017-12-05 2020-04-14 广州大学 Vehicle environment identification method and system based on combination of deep learning and traditional algorithm
CN108247635A (en) * 2018-01-15 2018-07-06 北京化工大学 A kind of method of the robot crawl object of deep vision
CN109159113A (en) * 2018-08-14 2019-01-08 西安交通大学 A kind of robot manipulating task method of view-based access control model reasoning
CN109079786A (en) * 2018-08-17 2018-12-25 上海非夕机器人科技有限公司 Mechanical arm grabs self-learning method and equipment
CN109919151A (en) * 2019-01-30 2019-06-21 西安交通大学 A kind of robot vision reasoning grasping means based on ad-hoc network
CN110211180A (en) * 2019-05-16 2019-09-06 西安理工大学 A kind of autonomous grasping means of mechanical arm based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Robot grasping method for multi-object stacking scenes based on visual reasoning (基于视觉推理的机器人多物体堆叠场景抓取方法); Lan Xuguang et al.; Scientia Sinica Technologica (《中国科学:技术科学》); 2018-11-23; Vol. 48, No. 12, pp. 1341-1356 *

Also Published As

Publication number Publication date
CN111906782A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111906782B (en) Intelligent robot grabbing method based on three-dimensional vision
CN113450408B (en) Irregular object pose estimation method and device based on depth camera
CN112297013B (en) Robot intelligent grabbing method based on digital twin and deep neural network
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
CN109159113B (en) Robot operation method based on visual reasoning
CN115816460B (en) Mechanical arm grabbing method based on deep learning target detection and image segmentation
CN110929795B (en) Method for quickly identifying and positioning welding spot of high-speed wire welding machine
CN109284779A (en) Object detection method based on deep full convolution network
CN114299150A (en) Depth 6D pose estimation network model and workpiece pose estimation method
CN114693661A (en) Rapid sorting method based on deep learning
WO2023124734A1 (en) Object grabbing point estimation method, apparatus and system, model training method, apparatus and system, and data generation method, apparatus and system
CN110969660A (en) Robot feeding system based on three-dimensional stereoscopic vision and point cloud depth learning
CN115330734A (en) Automatic robot repair welding system based on three-dimensional target detection and point cloud defect completion
CN111598172A (en) Dynamic target grabbing posture rapid detection method based on heterogeneous deep network fusion
Laili et al. Custom grasping: A region-based robotic grasping detection method in industrial cyber-physical systems
Kayhan et al. Hallucination in object detection—a study in visual part verification
Yu et al. A novel robotic pushing and grasping method based on vision transformer and convolution
Cheng et al. Anchor-based multi-scale deep grasp pose detector with encoded angle regression
CN117058476A (en) Target detection method based on random uncertainty
CN113420839B (en) Semi-automatic labeling method and segmentation positioning system for stacking planar target objects
Shi et al. A fast workpiece detection method based on multi-feature fused SSD
Li et al. Robot vision model based on multi-neural network fusion
Yu et al. A cascaded deep learning framework for real-time and robust grasp planning
CN113359738A (en) Mobile robot path planning method based on deep learning
Suzui et al. Toward 6 dof object pose estimation with minimum dataset

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant