CN115631401A - Robot autonomous grabbing skill learning system and method based on visual perception - Google Patents
- Publication number: CN115631401A
- Application number: CN202211652001.5A
- Authority: CN (China)
- Prior art keywords: robot, grabbing, neural network, network architecture, point
- Prior art date: 2022-12-22
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V 20/00 — Image or video recognition: scenes; scene-specific elements
- G06N 3/02, G06N 3/08 — Computing arrangements based on biological models: neural networks; learning methods
- G06V 10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
- G06V 10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V 10/82 — Image or video recognition or understanding using neural networks
Abstract
The invention discloses a vision-based robot autonomous grabbing skill learning system and method. The system comprises a data processing module, a model training module, a model deployment module and a motion planning module. The data processing module acquires images and marks, in each acquired image, the positions that can and cannot be grabbed by the gripper, producing annotated images. The model training module builds a lightweight generative convolutional neural network architecture, performs supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and stores the optimal network parameters. The model deployment module loads the stored optimal network parameters, reads in images acquired by the camera, and runs inference on them with the built neural network architecture to obtain the robot control quantities. The motion planning module plans a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities. The invention effectively improves the grabbing efficiency of the robot.
Description
Technical Field
The invention relates to the field of robotics, and in particular to a system and method for robot autonomous grabbing skill learning based on visual perception.
Background
In recent years, advances in robotics have been easing China's problems of labor-intensive manual work, an aging population and recruitment difficulties for enterprises. Robot grabbing, in which target objects are taken one by one out of a pile of unordered objects, is a key link in automation scenarios such as logistics sorting, machine-tool loading and unloading, and palletizing; it reduces workers' workload, improves working efficiency, and enables continuous 24-hour operation. Robot grabbing mainly comprises three key subtasks: object identification and localization, grabbing pose generation, and motion planning. These subtasks build on one another layer by layer and form the essential pipeline of an autonomous robot grabbing task. The object identification and localization task takes a picture with a camera and obtains the target's position information from it. The grabbing pose generation task determines the target's orientation and posture in three-dimensional space and then selects the optimal grabbing point, avoiding grabbing failures caused by grabbing points the robot cannot execute. The motion planning part moves the robot or end effector to the corresponding position while avoiding collisions and singular joint configurations, completing the grabbing task.
Because autonomous robot grabbing comprises these three highly challenging subtasks, good results can only be obtained through the cooperation of engineers from different fields. This is inefficient, raises labor costs for the enterprises concerned, and hinders the automation of the related products.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a vision-based robot autonomous grabbing skill learning system and method that derive the robot's grabbing control quantities directly from three-dimensional vision pictures, effectively improving grabbing efficiency.
To this end, the technical scheme of the invention is as follows:
In a first aspect, the invention provides a vision-based robot autonomous grabbing skill learning system, comprising:
a data processing module for acquiring images and marking, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images;
a model training module for building a lightweight generative convolutional neural network architecture, performing supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and storing the optimal network parameters;
a model deployment module for loading the stored optimal network parameters, reading in images acquired by the camera, and running inference on the read images with the built neural network architecture to obtain the robot control quantities; and
a motion planning module for planning a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities.
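Purely for illustration, the following minimal Python sketch shows how the four modules could be wired together; every class, method and file name is a hypothetical placeholder, not something defined by the invention.

```python
# Hypothetical glue code for the four-module pipeline; all interfaces
# (data_processor, model_trainer, etc.) are assumed placeholders.

def run_offline(data_processor, model_trainer):
    """Offline phase: collect and label images, train, save best weights."""
    images = data_processor.collect_images()        # internet + 3D camera
    labeled = data_processor.label(images)          # graspable / not graspable
    model_trainer.train(labeled)                    # supervised learning
    model_trainer.save_best("grasp_net.pt")         # store optimal parameters

def run_online(model_deployer, motion_planner, camera, robot):
    """Online phase: infer a grasp from one camera frame and execute it."""
    model_deployer.load("grasp_net.pt")             # load optimal parameters
    frame = camera.capture()
    control = model_deployer.infer(frame)           # point, pose, angle, width
    path = motion_planner.plan(robot.start, control, robot.end)
    robot.execute(path)                             # collision-free trajectory
```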
Further, in the data processing module, the image acquisition comprises collecting pictures published on the internet and pictures taken with a local three-dimensional camera;
the pictures taken with the local three-dimensional camera are acquired as follows:
a color image, a depth image and a point cloud of each object are captured locally against a simple background, and a background image without the target object is retained; with the camera viewing angle fixed, the object placement is varied so that each group of objects is captured in three different postures, the data are enhanced and checked, and pictures unsuitable for processing are promptly re-shot or enhanced.
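As one possible concrete form of this capture-and-enhancement step, the sketch below uses OpenCV for simple geometric and photometric augmentation and for background subtraction against the retained empty-scene image; the specific transforms and the threshold value are assumptions, not prescribed by the invention.

```python
# Sketch of data enhancement and background subtraction (assumed transforms).
import cv2
import numpy as np

def augment(color, depth):
    """Return augmented (color, depth) variants of one captured posture."""
    out = [(color, depth)]
    out.append((cv2.flip(color, 1), cv2.flip(depth, 1)))          # mirror
    for k in (1, 2, 3):                                           # 90-degree rotations
        out.append((np.rot90(color, k).copy(), np.rot90(depth, k).copy()))
    bright = cv2.convertScaleAbs(color, alpha=1.0, beta=25)       # brightness shift
    out.append((bright, depth))
    return out

def subtract_background(color, background, thresh=30):
    """Coarse target mask from the retained empty-background image."""
    diff = cv2.absdiff(color, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    return mask
```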
Further, the lightweight generative convolutional neural network architecture comprises:
three convolutional layers: 9×9, 32 filters, stride 3; 5×5, 32 filters, stride 2; and 3×3, 8 filters, stride 2;
three transposed convolutional layers: 3×3, 8 filters, stride 2; 3×3, 16 filters, stride 2; and 9×9, 32 filters, stride 3.
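A minimal PyTorch sketch of this architecture follows. The kernel sizes, filter counts and strides are taken from the lists above; the 300×300 single-channel depth input, the paddings, the ReLU activations and the four 1×1-convolution output heads (grasp quality, cos 2φ, sin 2φ, jaw width) are assumptions added so the sketch runs end to end.

```python
import torch
import torch.nn as nn

class LightweightGraspNet(nn.Module):
    """Encoder-decoder per the patent's layer list; paddings and the four
    per-pixel output heads are illustrative assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=9, stride=3, padding=4), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 32, 9, stride=3, padding=3), nn.ReLU(),
        )
        self.q_head = nn.Conv2d(32, 1, kernel_size=1)    # grasp quality map
        self.cos_head = nn.Conv2d(32, 1, kernel_size=1)  # cos(2*phi) map
        self.sin_head = nn.Conv2d(32, 1, kernel_size=1)  # sin(2*phi) map
        self.w_head = nn.Conv2d(32, 1, kernel_size=1)    # jaw width map

    def forward(self, depth):                            # depth: (B, 1, 300, 300)
        x = self.decoder(self.encoder(depth))
        return self.q_head(x), self.cos_head(x), self.sin_head(x), self.w_head(x)

# usage: q, c, s, w = LightweightGraspNet()(torch.zeros(1, 1, 300, 300))
```

With a 300×300 input these paddings restore a 300×300 output, so each head predicts one value per pixel of the depth map.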
Further, the lightweight generative convolutional neural network architecture performs a mapping from picture to grabbing target pose:
$M: I \rightarrow g$, where $I \in \mathbb{R}^{H \times W}$ is the picture matrix with $H$ rows and $W$ columns; $g = (p, \phi, w, q)$ is the grabbing target pose, in which the position $p = (u, v) \in \mathbb{Z}^2$ lies in the integer set of pixel coordinates, $\phi$ is the target attitude angle, $w$ is the opened jaw width, and $q$ is the expected probability of success of the current grab.
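Assuming the network emits per-pixel maps for quality, angle (encoded as cos 2φ and sin 2φ) and width, as in the architecture sketch above, the grabbing pose $g = (p, \phi, w, q)$ can be read off at the highest-quality pixel. A hypothetical decoder:

```python
import torch

def decode_grasp(q_map, cos_map, sin_map, w_map):
    """Pick the pixel with the highest predicted quality and read off the
    grasp g = (p, phi, w, q); pixel coordinates form the integer set Z^2."""
    q = q_map.squeeze()                               # (H, W)
    idx = torch.argmax(q)                             # flattened best index
    v, u = divmod(int(idx), q.shape[1])               # row, column of best pixel
    phi = 0.5 * torch.atan2(sin_map.squeeze()[v, u],
                            cos_map.squeeze()[v, u])  # angle from 2*phi encoding
    w = w_map.squeeze()[v, u]
    return (u, v), float(phi), float(w), float(q[v, u])
```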
Further, the motion planning module plans a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities, using a collision detection algorithm based on a model that describes the robot and the obstacles with hierarchical envelope (bounding) boxes.
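The leaf-level operation in such a hierarchical envelope-box model is an overlap test between two oriented boxes. The sketch below is the standard separating-axis test from the collision-detection literature, not code taken from the invention; `c` are box centers, `R` column-wise axis matrices, `e` half-extents (numpy arrays).

```python
import numpy as np

def obb_overlap(c1, R1, e1, c2, R2, e2):
    """Separating-axis overlap test for two oriented bounding boxes."""
    t = np.asarray(c2, float) - np.asarray(c1, float)
    axes = [R1[:, i] for i in range(3)] + [R2[:, i] for i in range(3)]
    axes += [np.cross(R1[:, i], R2[:, j]) for i in range(3) for j in range(3)]
    for axis in axes:
        n = np.linalg.norm(axis)
        if n < 1e-9:                       # parallel edges: degenerate axis
            continue
        a = axis / n
        r1 = sum(e1[i] * abs(np.dot(a, R1[:, i])) for i in range(3))
        r2 = sum(e2[i] * abs(np.dot(a, R2[:, i])) for i in range(3))
        if abs(np.dot(t, a)) > r1 + r2:    # separating axis found
            return False
    return True                            # no separating axis: boxes overlap
```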
In a second aspect, the invention provides a vision-based robot autonomous grabbing skill learning method, comprising:
a data processing step: acquiring images and marking, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images;
a model training step: building a lightweight generative convolutional neural network architecture, performing supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and storing the optimal network parameters;
a model deployment step: loading the stored optimal network parameters, reading in images acquired by a camera, and running inference on the read images with the built neural network architecture to obtain the robot control quantities;
a motion planning step: planning a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities.
Compared with the prior art, the invention has the following beneficial effects:
drawing on advanced technologies such as artificial intelligence, machine vision and robotics, the invention provides a vision-based robot autonomous grabbing skill learning system and method and designs a lightweight neural network architecture dedicated to complex object grabbing tasks, which obtains the multiple control quantities required for robot grabbing directly from three-dimensional vision pictures, greatly improving grabbing efficiency.
Drawings
FIG. 1 is the overall framework of the vision-based robot autonomous grabbing skill learning system provided in Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the specific working principle of the vision-based robot autonomous grabbing skill learning system of Embodiment 1 of the present invention;
FIG. 3 shows the labeling of positive and negative examples in the data processing module;
FIG. 4 is a schematic diagram of the lightweight convolutional neural network architecture;
FIG. 5 shows recognition results of the deployed network;
FIG. 6 shows the constructed hierarchical envelope boxes and the planned collision-free trajectory.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Embodiment 1:
object pose estimation is a key problem for a robot grabbing task. Compared with the traditional plane vision, the three-dimensional vision integrates an active structured light or grating emitter, is insensitive to the change of illumination, has the same three-dimensional information as the real world due to the imaging characteristic not being a plane any more, has richer characteristics, and does not have the phenomenon of 'big or small in size'. The traditional object identification and pose estimation method mainly extracts artificially designed points, lines, edges and other features, such as algorithms of two-dimensional feature description operators SIFT, SURF, ORB, hough and three-dimensional feature operators FPFH, SHOT and the like. However, the artificial features are easily interfered by dynamic targets and illumination, and still need to be debugged in real scenes, which results in extremely low efficiency and poor universality, and is easily influenced by uncertainties such as external interference, task change, complex object structure, environmental noise, robot errors, sensing errors and the like in an unstructured dynamic environment, and thus, the requirements of practical application are difficult to meet. In recent years, with the deep fusion of deep learning and machine vision, the deep neural network has been used for visual feature extraction and learning, which has better adaptability and generalization capability, and has gained wide attention.
Grabbing pose generation mainly follows either analytical methods or empirical (data-driven sampling) methods. Analytical methods model the robot's jaws and the grabbing target with a real physical model or in a simulation environment, analyze the force-closure grabbing behavior on the target, and convert it into an optimization problem to be solved. They require a grabbing model, an optimization objective and an optimization function to be established in advance, involve heavy computation and a large up-front workload, and are ill-suited to industrial scenes with many target types and rapid changeovers. Empirical methods, by contrast, classify and rank grasp candidates sampled from an image or point cloud according to a specific index and select the best one. They require data to be processed in advance and their search is inefficient; in most cases target recognition and grasp-candidate extraction are separate operations, so computation takes anywhere from several seconds to tens of seconds, and the speed depends on the search algorithm. For these reasons the prior art is rarely used for closed-loop grabbing execution; even in a static environment, successful grabbing depends on accurate camera calibration and precise robot control, which makes it difficult to popularize and apply.
Robot autonomous grabbing skill learning is an important research direction spanning artificial intelligence, machine vision, robotics and related fields. Through deep neural networks and visual perception, the robot acquires autonomous grabbing skills: it derives control quantities such as the grabbing point, grabbing pose and grabbing angle for objects it has or has not seen before, standardizes and automates robot grabbing tasks, and enables teach-free, rapid deployment of robot systems.
Based on advanced technologies such as artificial intelligence, machine vision and robotics, the invention provides a vision-based robot autonomous grabbing skill learning system and designs a lightweight neural network architecture dedicated to complex object grabbing tasks, which obtains the multiple control quantities required for robot grabbing directly from three-dimensional vision pictures, greatly improving grabbing efficiency.
Specifically, referring to FIG. 1, the vision-based robot autonomous grabbing skill learning system provided in this embodiment mainly comprises a data processing module, a model training module, a model deployment module and a motion planning module.
The specific working principle of each module is described in detail below with reference to FIG. 2.
The data processing module acquires images and marks, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images.
Because training the deep learning network in the model training step described below requires a large amount of grabbing image data, this embodiment combines pictures published on the internet with pictures taken by a local three-dimensional camera. Color images, depth images and point clouds of objects are captured locally against a simple background, and a background image without the target object is retained to support subsequent background subtraction for target extraction. In this embodiment the viewing angle is fixed and the object placement is varied, with each group of objects captured in three different postures; data enhancement and manual checking are then carried out with image processing techniques, and pictures unsuitable for processing are promptly re-shot or enhanced. As shown in FIG. 3, labeling software is used to mark the grabbing positions in the color and depth images: rectangular boxes mark the end-gripper positions, which fall into two categories, positions that can be gripped (positive examples) and positions that cannot (negative examples), and both categories are annotated in each image. The pixel coordinates of the four vertices of every rectangle are stored and written to a text file for subsequent model training. Finally, the annotated data are split into training and test sets at 80% and 20%, respectively.
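A small sketch of the storage and split just described: the four vertex pixel values of each rectangle go to a text file, and the annotated samples are divided 80/20. The line format and the fixed shuffle seed are assumptions for illustration.

```python
import random

def save_labels(path, rects):
    """Write four vertex pixel coordinates per labeled rectangle, one line
    each: x1 y1 x2 y2 x3 y3 x4 y4 label (1 = positive, 0 = negative)."""
    with open(path, "w") as f:
        for verts, label in rects:          # verts: four (x, y) tuples
            flat = " ".join(f"{x} {y}" for x, y in verts)
            f.write(f"{flat} {label}\n")

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle and split annotated samples into 80% train / 20% test."""
    rng = random.Random(seed)
    samples = samples[:]
    rng.shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```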
The model training module builds a lightweight generative convolutional neural network architecture, performs supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and stores the optimal network parameters.
Specifically, as shown in FIG. 4, the lightweight convolutional neural network architecture comprises three convolutional layers (9×9, 32 filters, stride 3; 5×5, 32 filters, stride 2; 3×3, 8 filters, stride 2) and three transposed convolutional layers (3×3, 8 filters, stride 2; 3×3, 16 filters, stride 2; 9×9, 32 filters, stride 3). With this generative convolutional neural network architecture, the robot grabbing control quantities can be obtained directly from the depth map, effectively improving grabbing efficiency. The principle is as follows. A grab in the plane is defined as $g = (p, \phi, w, q)$, where $p$ is the grabbing point on the object, $\phi$ is the rotation angle of the end gripper about the z-axis in the plane, and $w$ is the opened jaw width; because a three-dimensional camera provides the object's size, this width can be read directly from the depth map. Finally, $q$ is the expected probability of success of the current grab. The proposed model uses a deep neural network to realize a mapping from picture to grabbing target pose, $M: I \rightarrow g$, where $I \in \mathbb{R}^{H \times W}$ is the picture matrix with $H$ rows and $W$ columns; $g = (p, \phi, w, q)$ is the grabbing pose, with the position $p = (u, v) \in \mathbb{Z}^2$ in the integer set of pixel coordinates, $\phi$ the target attitude angle, $w$ the opened jaw width, and $q$ the expected probability of success (grab quality). The relation formed by the four grabbing-pose quantities can also be viewed as a multidimensional matrix, so what the network actually learns is the optimal mapping between the two matrices. The model training module first initializes the model parameters, feeds in the annotated picture data, trains the network over many iterations while computing the loss function, adjusts the network weights and learning rate with the backpropagation (BP) algorithm over multiple rounds, and finally stores the optimal network parameters according to the training results.
At deployment time, the model deployment module loads the stored optimal network parameters, reads in the pictures acquired by the camera, and runs inference with the built lightweight generative convolutional neural network to obtain control quantities such as the robot's grabbing point, grabbing pose, grabbing angle and grab quality; it then transforms the pixel coordinates into the robot coordinate system according to the hand-eye calibration result and outputs the joint angles the robot must turn through to reach the target. FIG. 5 shows the recognition results of the deployed network.
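The coordinate transfer can be sketched as back-projecting the grasp pixel with the camera intrinsics and applying the hand-eye calibration matrix; `K` and `T_base_cam` are assumed to come from prior calibration, and the final joint angles would be produced by the robot's inverse kinematics, which is not shown.

```python
import numpy as np

def pixel_to_robot(u, v, depth_m, K, T_base_cam):
    """Back-project grasp pixel (u, v) at depth depth_m (meters) with the
    3x3 intrinsic matrix K, then map the point into the robot base frame
    with the 4x4 hand-eye calibration matrix T_base_cam."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    p_cam = np.array([(u - cx) * depth_m / fx,
                      (v - cy) * depth_m / fy,
                      depth_m, 1.0])
    return (T_base_cam @ p_cam)[:3]        # grasp point in the base frame
```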
Once the object has been recognized and the optimal grabbing point determined, the robot's motions between the grabbing point, the placing point and so on must be planned to guarantee safe operation. To this end, the motion planning module builds envelopes of the main target objects in the scene according to the robot control quantities, using a collision detection algorithm that models the robot and obstacles with hierarchical oriented bounding boxes (OBB), and plans collision-free trajectories through the start point, grabbing point and end point, as shown in FIG. 6. This mainly involves collision detection, path search, path smoothing and execution of the robot's actions.
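A deliberately simplified sketch of this planning step: joint-space interpolation between the start, grasp and end configurations, rejecting any segment whose samples collide according to the hierarchical-OBB checker. A real system would substitute a sampling-based path search plus smoothing for the straight-line interpolation.

```python
import numpy as np

def plan_segment(q_from, q_to, collides, steps=50):
    """Linearly interpolate in joint space; reject the segment if any
    sampled configuration collides (collides wraps the OBB hierarchy)."""
    for s in np.linspace(0.0, 1.0, steps):
        q = (1.0 - s) * q_from + s * q_to
        if collides(q):
            return None
    return (q_from, q_to)

def plan_trajectory(start, grasp, end, collides):
    """Chain collision-free segments through start -> grasp -> end."""
    segs = [plan_segment(a, b, collides)
            for a, b in ((start, grasp), (grasp, end))]
    return None if None in segs else segs
```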
In summary, compared with the prior art, the invention offers the following technical advantages:
(1) A lightweight generative convolutional neural network architecture dedicated to autonomous learning of grabbing control quantities for complex objects is proposed; taking only a depth image as input, it automatically obtains control quantities such as the robot's grabbing point, grabbing pose, grabbing angle and grab quality, improving robot programming and deployment efficiency;
(2) A vision-based robot autonomous grabbing skill learning system is built that integrates the four key modules of data processing, model training, model deployment and motion planning by means of machine vision, robot skill learning and other advanced technologies, meeting the application requirements of grabbing complex objects in unstructured environments.
Embodiment 2:
This embodiment provides a robot autonomous grabbing skill learning method based on visual perception, comprising the following steps:
a data processing step: acquiring images and marking, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images;
a model training step: building a lightweight generative convolutional neural network architecture, performing supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and storing the optimal network parameters;
a model deployment step: loading the stored optimal network parameters, reading in images acquired by a camera, and running inference on the read images with the built neural network architecture to obtain the robot control quantities;
a motion planning step: planning a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities.
The specific principles and flow of these steps are the same as the working principles of the corresponding modules in Embodiment 1 and are not repeated here.
The above embodiments merely illustrate the technical concept and features of the present invention; their purpose is to enable those skilled in the art to understand and implement the invention, not to limit its scope of protection. All equivalent changes or modifications made according to the spirit of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A vision-based robot autonomous grabbing skill learning system, characterized by comprising:
a data processing module for acquiring images and marking, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images;
a model training module for building a lightweight generative convolutional neural network architecture, performing supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and storing the optimal network parameters;
a model deployment module for loading the stored optimal network parameters, reading in images acquired by the camera, and running inference on the read images with the built neural network architecture to obtain the robot control quantities; and
a motion planning module for planning a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities.
2. The vision-based robot autonomous grabbing skill learning system according to claim 1, wherein in the data processing module, the image acquisition comprises collecting pictures published on the internet and pictures taken with a local three-dimensional camera;
the pictures taken with the local three-dimensional camera are acquired as follows:
a color image, a depth image and a point cloud of each object are captured locally against a simple background, and a background image without the target object is retained; with the camera viewing angle fixed, the object placement is varied so that each group of objects is captured in three different postures, the data are enhanced and checked, and pictures unsuitable for processing are promptly re-shot or enhanced.
3. The vision-based robot autonomous grabbing skill learning system according to claim 1, wherein the lightweight generative convolutional neural network architecture comprises:
three convolutional layers: 9×9, 32 filters, stride 3; 5×5, 32 filters, stride 2; and 3×3, 8 filters, stride 2;
three transposed convolutional layers: 3×3, 8 filters, stride 2; 3×3, 16 filters, stride 2; and 9×9, 32 filters, stride 3.
4. The vision-based robot autonomous grabbing skill learning system according to claim 1 or 3, wherein the lightweight generative convolutional neural network architecture performs a mapping from picture to grabbing target pose:
$M: I \rightarrow g$, where $I \in \mathbb{R}^{H \times W}$ is the picture matrix with $H$ rows and $W$ columns; $g = (p, \phi, w, q)$ is the grabbing target pose, in which the position $p = (u, v) \in \mathbb{Z}^2$ lies in the integer set of pixel coordinates, $\phi$ is the target attitude angle, $w$ is the opened jaw width, and $q$ is the expected probability of success of the current grab.
5. The vision-based robot autonomous grabbing skill learning system according to claim 1, wherein the motion planning module plans a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities, using a collision detection algorithm based on a model that describes the robot and the obstacles with hierarchical envelope boxes.
6. A robot autonomous grabbing skill learning method based on visual perception, characterized by comprising:
a data processing step: acquiring images and marking, in each acquired image, the positions that can and cannot be grabbed by the gripper, to obtain annotated images;
a model training step: building a lightweight generative convolutional neural network architecture, performing supervised learning on the annotated images with the built architecture to obtain the optimal target grabbing position and posture, and storing the optimal network parameters;
a model deployment step: loading the stored optimal network parameters, reading in images acquired by a camera, and running inference on the read images with the built neural network architecture to obtain the robot control quantities;
a motion planning step: planning a collision-free trajectory through the robot's start point, grabbing point and end point according to the robot control quantities.
7. The robot autonomous grabbing skill learning method based on visual perception according to claim 6, wherein in the data processing step, the image acquisition comprises collecting pictures published on the internet and pictures taken with a local three-dimensional camera;
the pictures taken with the local three-dimensional camera are acquired as follows:
a color image, a depth image and a point cloud of each object are captured locally against a simple background, and a background image without the target object is retained; with the camera viewing angle fixed, the object placement is varied so that each group of objects is captured in three different postures, the data are enhanced and checked, and pictures unsuitable for processing are promptly re-shot or enhanced.
8. The robot autonomous grabbing skill learning method based on visual perception according to claim 6, wherein the lightweight generative convolutional neural network architecture comprises:
three convolutional layers: 9×9, 32 filters, stride 3; 5×5, 32 filters, stride 2; and 3×3, 8 filters, stride 2;
three transposed convolutional layers: 3×3, 8 filters, stride 2; 3×3, 16 filters, stride 2; and 9×9, 32 filters, stride 3.
9. The robot autonomous grabbing skill learning method based on visual perception according to claim 6 or 8, wherein the lightweight generative convolutional neural network architecture performs a mapping from picture to grabbing target pose:
$M: I \rightarrow g$, where $I \in \mathbb{R}^{H \times W}$ is the picture matrix with $H$ rows and $W$ columns; $g = (p, \phi, w, q)$ is the grabbing target pose, in which the position $p = (u, v) \in \mathbb{Z}^2$ lies in the integer set of pixel coordinates, $\phi$ is the target attitude angle, $w$ is the opened jaw width, and $q$ is the expected probability of success of the current grab.
10. The robot autonomous grabbing skill learning method based on visual perception according to claim 6, wherein in the motion planning step, a collision-free trajectory through the robot's start point, grabbing point and end point is planned according to the robot control quantities, using a collision detection algorithm based on a model that describes the robot and the obstacles with hierarchical envelope boxes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211652001.5A | 2022-12-22 | 2022-12-22 | Robot autonomous grabbing skill learning system and method based on visual perception
Publications (1)
Publication Number | Publication Date |
---|---|
CN115631401A (en) | 2023-01-20
Family
ID=84910690
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202211652001.5A | Robot autonomous grabbing skill learning system and method based on visual perception | 2022-12-22 | 2022-12-22
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115631401A (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120294509A1 (en) * | 2011-05-16 | 2012-11-22 | Seiko Epson Corporation | Robot control system, robot system and program |
CN102922521A (en) * | 2012-08-07 | 2013-02-13 | 中国科学技术大学 | Mechanical arm system based on stereo visual serving and real-time calibrating method thereof |
CN108805977A (en) * | 2018-06-06 | 2018-11-13 | 浙江大学 | A kind of face three-dimensional rebuilding method based on end-to-end convolutional neural networks |
CN110334701A (en) * | 2019-07-11 | 2019-10-15 | 郑州轻工业学院 | Collecting method based on deep learning and multi-vision visual under the twin environment of number |
CN111360862A (en) * | 2020-02-29 | 2020-07-03 | 华南理工大学 | Method for generating optimal grabbing pose based on convolutional neural network |
US20210312629A1 (en) * | 2020-04-07 | 2021-10-07 | Shanghai United Imaging Intelligence Co., Ltd. | Methods, systems and apparatus for processing medical chest images |
CN112365004A (en) * | 2020-11-27 | 2021-02-12 | 广东省科学院智能制造研究所 | Robot autonomous anomaly restoration skill learning method and system |
CN114723775A (en) * | 2021-01-04 | 2022-07-08 | 广州中国科学院先进技术研究所 | Robot grabbing system and method based on small sample learning |
CN114332209A (en) * | 2021-12-30 | 2022-04-12 | 华中科技大学 | Grabbing pose detection method and device based on lightweight convolutional neural network |
Non-Patent Citations (1)
Title |
---|
Ma Qianqian et al., "Research on robot grasping detection with lightweight convolutional neural networks" (轻量级卷积神经网络的机器人抓取检测研究) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117549317A (en) * | 2024-01-12 | 2024-02-13 | 深圳威洛博机器人有限公司 | Robot grabbing and positioning method and system |
CN117549317B (en) * | 2024-01-12 | 2024-04-02 | 深圳威洛博机器人有限公司 | Robot grabbing and positioning method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
2023-01-20 | RJ01 | Rejection of invention patent application after publication | Application publication date: 2023-01-20