CN114782347A - Mechanical arm grabbing parameter estimation method based on attention mechanism generation type network
- Publication number
- CN114782347A (application CN202210387024.1A, also published as CN202210387024A)
- Authority
- CN
- China
- Prior art keywords
- grabbing
- image
- dimensional
- mechanical arm
- attention mechanism
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1656—Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664—Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1679—Programme controls characterised by the tasks executed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Abstract
The invention discloses a mechanical arm grabbing parameter estimation method based on an attention-mechanism generative network. In the method, an RGB-D camera captures images of the work scene, and these images are input into a trained attention-based generative network to obtain mechanical arm grabbing parameters such as grabbing quality, grabbing angle, grabbing width, and grabbing priority; the grabbing priority is then used to screen the other grabbing parameters, yielding better mechanical arm grabbing parameters in complex multi-object environments. The method not only enables autonomous grabbing by the mechanical arm in complex stacking environments, but the estimation of grabbing priority also deepens the vision system's perception of useful information in such environments and strengthens the whole system's processing of multi-dimensional information, thereby improving the grabbing accuracy of the mechanical arm in complex stacking environments.
Description
Technical Field
The invention belongs to the field of mechanical arm grabbing control, and particularly relates to a mechanical arm grabbing parameter estimation method based on an attention-mechanism generative network.
Background
Currently, in research on autonomous grabbing by mechanical arms, grabbing a single object in a simple scene is a mature technology; in practice, however, multiple objects are often stacked in disorder in complex environments, which poses a greater challenge to autonomous grabbing. The invention proposes a mechanical arm grabbing parameter estimation method based on an attention-mechanism generative network, which effectively estimates the grabbing parameters, deepens the vision system's perception of useful information in complex environments, improves the fusion of multi-channel information, supports grabbing tasks for various objects in complex environments, and addresses the problem of autonomous grabbing in multi-object stacking environments.
Disclosure of Invention
Aiming at the problem of autonomous grabbing by a mechanical arm in complex stacking scenes, the invention provides a mechanical arm grabbing parameter estimation method based on an attention-mechanism generative network, improving the grabbing accuracy of autonomous grabbing in complex stacking environments.
To achieve this purpose, the invention adopts the following main technical scheme:
S1, acquire the work scene image of the mechanical arm in its current state using an RGB-D camera, comprising an RGB image I_rgb, a depth image I_depth, and a work scene reference coordinate system;
S2, input each work scene image I into a trained attention-based generative neural network to generate a predicted two-dimensional image group containing motion instruction vectors; the predicted image group contains at least one two-dimensional grabbing quality image G_θ, one two-dimensional grabbing angle image A_θ, one two-dimensional grabbing width image W_θ, and one two-dimensional grabbing priority image O_θ, which respectively contain the grabbing success rate information, grabbing angle information, jaw opening width information, and grabbing order information for the mechanical arm grabbing an object;
S3, sort the pixel values of the two-dimensional grabbing quality image G_θ in the predicted image group by magnitude and select the n pixel points with the largest values; these are the predictions with the highest grabbing success rate, G_θ(p_n). According to the coordinates of these pixels, read the corresponding values in the two-dimensional angle image A_θ, the two-dimensional grabbing width image W_θ, and the two-dimensional grabbing priority image O_θ to obtain the grabbing angle prediction A_θ(p_n), the grabbing width prediction W_θ(p_n), and the grabbing order prediction O_θ(p_n), where p_n denotes the coordinate of the pixel ranked n-th when the pixel values of G_θ are arranged from largest to smallest;
S4, sort the grabbing order predictions O_θ(p_n) and select the prediction with the highest grabbing priority, at coordinate p*; the grabbing information corresponding to the pixel coordinate p* is the optimal motion instruction vector, i.e. g* = {p*, A_θ(p*), W_θ(p*)};
S5, parse the obtained optimal motion instruction vector to obtain the grabbing coordinates, grabbing angle, and grabbing width of the target object in the mechanical arm base coordinate system, i.e. the mechanical arm grabbing parameters (a sketch of the selection in steps S3 and S4 follows).
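The following is a minimal sketch of the candidate selection in steps S3 and S4, assuming the four network outputs arrive as 300×300 NumPy arrays; the function name and the choice of n are illustrative, not part of the patent.

```python
import numpy as np

def select_grasp(G, A, W, O, n=10):
    """Pick the optimal grasp from quality G, angle A, width W, priority O maps."""
    # S3: coordinates p_1..p_n of the n pixels with the highest grabbing quality
    flat = np.argsort(G, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(flat, G.shape)

    # Read the angle, width and priority predictions at those coordinates
    angles, widths, prios = A[rows, cols], W[rows, cols], O[rows, cols]

    # S4: among the n candidates, keep the one with the highest grabbing priority
    best = int(np.argmax(prios))
    return rows[best], cols[best], angles[best], widths[best]
```

The returned tuple (row, column, angle, width) corresponds to the optimal motion instruction vector g* before the coordinate transformations of step S5.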
Preferably, step S2 includes:
S21, train the attention-based generative neural network on an existing data set;
S22, preprocess each work scene image to obtain 300×300-pixel work scene images;
S23, input the image feature vectors into the trained attention-based generative neural network;
S24, the attention-based generative neural network outputs the predicted two-dimensional image group containing the motion instruction vectors, i.e. the predicted parameters including the grabbing success probability.
Preferably, step S3 includes:
S31, sort the pixel values of the two-dimensional grabbing quality image in the predicted image group; the magnitude of each pixel value in the two-dimensional grabbing quality image represents the success rate of a grab centered on that point;
S32, select the coordinates of the n largest grabbing success rate predictions as grabbing center coordinates;
S33, from the predicted image group, obtain the grabbing angle pixel value, grabbing width pixel value, and grabbing priority pixel value corresponding to each grabbing center coordinate;
S34, parse the grabbing angle, grabbing width, and grabbing order information from the grabbing angle, grabbing width, and grabbing priority pixel values.
Preferably, step S5 includes:
S51, parse the obtained optimal motion instruction vector;
S52, convert the parsed data into the wrist camera coordinate system;
S53, transform the converted camera coordinates into the mechanical arm base coordinate system;
S54, input the coordinates obtained from the base coordinate transformation, together with the grabbing width and grabbing angle information, into the mechanical arm control system to execute the grab (a sketch of the coordinate chain in S52-S53 follows).
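A minimal sketch of the coordinate chain in steps S52-S53, assuming a pinhole camera model and known hand-eye calibration; the intrinsic matrix K and the homogeneous transforms T_wrist_cam and T_base_wrist are assumed inputs, not quantities specified by the patent.

```python
import numpy as np

def pixel_to_base(u, v, depth, K, T_wrist_cam, T_base_wrist):
    """Map a grasp pixel (u, v) with measured depth into the arm base frame.

    K is the 3x3 camera intrinsic matrix; T_wrist_cam and T_base_wrist are
    4x4 homogeneous transforms obtained from hand-eye calibration.
    """
    # Back-project the pixel into the wrist camera frame (pinhole model)
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    p_cam = np.array([x, y, depth, 1.0])

    # S52-S53: camera frame -> wrist frame -> base frame
    p_base = T_base_wrist @ T_wrist_cam @ p_cam
    return p_base[:3]
```

A full implementation would also rotate the grabbing angle into the base frame; the sketch covers only the position.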
Preferably, the training method of the attention-based generative neural network comprises the following steps:
S01, create a data set G_train for training the network based on an existing data set; the G_train data set comprises work scene images, effective grabbing frame information, and segmented images containing only the topmost objects, where the G_train segmented image containing only the topmost object serves as the two-dimensional grabbing priority image;
S02, map the effective grabbing frame information onto 300×300 two-dimensional images to obtain the two-dimensional grabbing quality image, the two-dimensional grabbing angle image, and the two-dimensional grabbing width image, and combine them with the G_train segmented image containing only the topmost object to construct the two-dimensional image group (see the label-mapping sketch below);
S03, construct the attention-based generative neural network by building the attention mechanism modules;
S04, train the attention-based generative neural network with the data set G_train and the two-dimensional image groups: the input is an RGB-D image without grabbing information and the output is a two-dimensional image group containing grabbing information, yielding the trained attention-based generative neural network.
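A minimal sketch of the label mapping in step S02, assuming each effective grabbing frame is reduced to a center, an angle, and a width in pixel units; rasterizing a small disc around each center is an illustrative choice, not the patent's exact scheme.

```python
import numpy as np

def build_label_maps(grasp_boxes, size=300, radius=5):
    """Rasterize grasp boxes [(cx, cy, angle, width), ...] into 2D label maps."""
    G = np.zeros((size, size), dtype=np.float32)  # two-dimensional grabbing quality
    A = np.zeros_like(G)                          # two-dimensional grabbing angle
    W = np.zeros_like(G)                          # two-dimensional grabbing width
    yy, xx = np.mgrid[0:size, 0:size]
    for cx, cy, angle, width in grasp_boxes:
        mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
        G[mask] = 1.0      # pixels near a valid grasp center count as graspable
        A[mask] = angle    # grabbing angle at those pixels
        W[mask] = width    # jaw opening width at those pixels
    return G, A, W
```

Together with the segmented topmost-object image used as the priority map O, these four maps form the two-dimensional image group used as the training target.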
Preferably, the attention-based generative neural network comprises:
a feature extraction part, an attention mechanism part, and a generation network part;
a feature extraction section:
At this stage, the feature extraction network consists of one convolution layer with a 9×9 kernel and two convolution layers with 4×4 kernels, each convolution layer followed by a Batch Normalization layer and a Rectified Linear Unit activation layer;
The cropped 300×300 RGB image I_rgb and depth image I_depth undergo feature fusion to obtain the fusion feature map I_fusion; I_fusion is input into the feature extraction network, and feature extraction yields the feature map I_output1;
Attention mechanism part:
The attention mechanism network consists of five attention modules, each composed of a residual part, a Squeeze part, and an Excitation part;
The residual part is divided into a direct mapping and a residual mapping. The direct mapping applies a 1×1 convolution kernel to I_output1, giving the direct mapping result h(I_output1); the residual mapping consists of two convolution layers with 3×3 kernels, each followed by a Batch Normalization layer, with a Rectified Linear Unit activation layer after the first Batch Normalization layer; I_output1 yields R(I_output1) after the residual mapping;
The Squeeze part is realized by introducing Global Average Pooling and serves to obtain the global information embedding, i.e. a feature vector, for each channel of the feature map. Suppose u_c is a W×H feature map of a C-channel input; the feature after Squeeze is then

z_c = F_sq(u_c) = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} u_c(i,j)   (1)

The Excitation part learns a weight for each channel from z_c and is composed of a gate mechanism of two fully connected layers. The gating unit s_c is a feature vector of size 1×1 with C channels; s_c is computed as:
s_c = F_ex(z_c, w) = σ(g(z_c, w)) = σ(w_2 δ(w_1 z_c))   (2)
where σ is the sigmoid activation function, δ is the ReLU activation function, w_1 ∈ R^{γ×C} and w_2 ∈ R^{C×γ} are the weights of the two fully connected layers, and γ is the number of nodes of the hidden layer;
R(I_output1) is passed through the Squeeze part and the Excitation part in turn to obtain the channel weights s_c; the channel-weighted residual feature is then fused with the direct mapping result h(I_output1) to obtain the attention module output I_output2;
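A minimal PyTorch sketch of one attention module as described above, assuming the channel weights s_c scale the residual branch R(x) before it is added to the 1×1 direct mapping h(x); the channel count and the hidden size γ are illustrative.

```python
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Residual block with Squeeze-and-Excitation gating, per eqs. (1) and (2)."""

    def __init__(self, channels, hidden):  # hidden = γ, the FC hidden-layer size
        super().__init__()
        self.direct = nn.Conv2d(channels, channels, kernel_size=1)  # h(x)
        self.residual = nn.Sequential(                              # R(x)
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.squeeze = nn.AdaptiveAvgPool2d(1)       # z_c, eq. (1)
        self.excite = nn.Sequential(                 # s_c, eq. (2)
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        r = self.residual(x)
        z = self.squeeze(r).flatten(1)                # B x C
        s = self.excite(z).view(x.size(0), -1, 1, 1)  # B x C x 1 x 1
        return self.direct(x) + s * r                 # h(x) + s_c * R(x)
```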
Feeding I_output1 through the five serially connected attention modules yields the output of the attention mechanism network part;
Generating a network part:
the generation network consists of two deconvolution layers with convolution kernel sizes of 4 multiplied by 4 and a deconvolution layer with convolution kernel sizes of 9 multiplied by 9, wherein the two deconvolution layers with convolution kernel sizes of 4 multiplied by 4 are both followed by a Batch Normalization layer and a Rectified Linear Unit active layer;
will be provided withInputting the generated network to obtain the predicted value of the two-dimensional image group containing the motion instruction vector.
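Putting the three parts together, the following PyTorch sketch assembles the described pipeline, reusing the AttentionModule from the sketch above; the channel counts, strides, the 4-channel RGB-D concatenation forming I_fusion, and the four output heads are illustrative assumptions, since the patent does not specify them.

```python
import torch
import torch.nn as nn

class GraspNet(nn.Module):
    """Feature extraction -> 5 attention modules -> generation network."""

    def __init__(self, ch=32, hidden=8):
        super().__init__()
        # Feature extraction: one 9x9 conv and two 4x4 convs, each with BN + ReLU
        self.features = nn.Sequential(
            nn.Conv2d(4, ch, 9, stride=1, padding=4), nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.Conv2d(ch, ch, 4, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.Conv2d(ch, ch, 4, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(True),
        )
        # Five serially connected attention modules
        self.attention = nn.Sequential(*[AttentionModule(ch, hidden) for _ in range(5)])
        # Generation network: two 4x4 deconvs with BN + ReLU, then one 9x9 deconv
        self.generate = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 4, 9, stride=1, padding=4),  # G, A, W, O heads
        )

    def forward(self, rgb, depth):
        fusion = torch.cat([rgb, depth], dim=1)  # I_fusion: 4-channel RGB-D input
        out = self.generate(self.attention(self.features(fusion)))
        G, A, W, O = out[:, 0], out[:, 1], out[:, 2], out[:, 3]
        return G, A, W, O
```

With 300×300 inputs, the two strided convolutions reduce the feature map to 75×75 and the two deconvolutions restore it to 300×300, matching the label maps.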
The invention has the following beneficial effects:
The invention provides a mechanical arm grabbing parameter estimation method based on an attention-mechanism generative network that can grab various unknown objects in complex unstructured environments. Fusing multiple information channels such as the color image and the depth image improves the mechanical arm's perception in complex environments; the lightweight attention generative network preserves real-time performance while the mechanical arm grabs objects; and the introduction of grabbing priority improves the effective grabbing accuracy of the mechanical arm.
Drawings
FIG. 1 is a structural diagram of the attention-based generative neural network according to an embodiment of the present invention;
fig. 2 is a frame diagram of a robot grasping parameter learning system based on an attention mechanism generating network according to an embodiment of the present invention.
Detailed Description
For a better understanding of the present invention, reference will now be made in detail to the present invention, by way of example, which is illustrated in the accompanying drawings of FIG. 1. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. The present invention will be described in further detail with reference to the following embodiments:
S1, acquire the work scene image of the mechanical arm in its current state using an RGB-D camera, comprising an RGB image I_rgb, a depth image I_depth, and a work scene reference coordinate system;
S2, input each work scene image I in sequence into a trained attention-based generative neural network to generate a predicted two-dimensional image group containing motion instruction vectors; the image group contains at least one two-dimensional grabbing quality image G_θ, one two-dimensional angle image A_θ, one two-dimensional grabbing width image W_θ, and one two-dimensional grabbing priority image O_θ, which respectively contain the grabbing success rate information, grabbing angle information, jaw opening width information, and grabbing order information for the mechanical arm grabbing an object;
The attention-based generative neural network comprises a feature extraction part, an attention mechanism part, and a generation network part;
Feature extraction part:
At this stage, the feature extraction network consists of one convolution layer with a 9×9 kernel and two convolution layers with 4×4 kernels, each convolution layer followed by a Batch Normalization layer and a Rectified Linear Unit activation layer;
The cropped 300×300 RGB image I_rgb and depth image I_depth undergo feature fusion to obtain the fusion feature map I_fusion; I_fusion is input into the feature extraction network, and feature extraction yields the feature map I_output1.
Attention mechanism part:
The attention mechanism network consists of five attention modules, each composed of a residual part, a Squeeze part, and an Excitation part;
The residual part can be divided into a direct mapping and a residual mapping. The direct mapping applies a 1×1 convolution kernel to I_output1, giving the direct mapping result h(I_output1); the residual mapping consists of two convolution layers with 3×3 kernels, each followed by a Batch Normalization layer, with a Rectified Linear Unit activation layer after the first Batch Normalization layer; I_output1 yields R(I_output1) after the residual mapping;
The Squeeze part is realized by introducing Global Average Pooling (GAP) and serves to obtain the global information embedding, i.e. a feature vector, for each channel of the feature map. Suppose u_c is a W×H feature map of a C-channel input; the feature after Squeeze is then

z_c = F_sq(u_c) = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} u_c(i,j)   (1)

The Excitation part learns a weight for each channel from z_c and is composed of a gate mechanism of two fully connected layers. The gating unit s_c is a feature vector of size 1×1 with C channels; s_c is computed as:
s_c = F_ex(z_c, w) = σ(g(z_c, w)) = σ(w_2 δ(w_1 z_c))   (2)
where σ is the sigmoid activation function, δ is the ReLU activation function, w_1 ∈ R^{γ×C} and w_2 ∈ R^{C×γ} are the weights of the two fully connected layers, and γ is the number of nodes of the hidden layer;
R(I_output1) is passed through the Squeeze part and the Excitation part in turn to obtain the channel weights s_c; the channel-weighted residual feature is then fused with the direct mapping result h(I_output1) to obtain the attention module output I_output2;
Feeding I_output1 through the five serially connected attention modules yields the output of the attention mechanism network part;
Generating a network part:
the generation network consists of two deconvolution layers with convolution kernel sizes of 4 multiplied by 4 and a deconvolution layer with convolution kernel sizes of 9 multiplied by 9, wherein the two deconvolution layers with convolution kernel sizes of 4 multiplied by 4 are both followed by a Batch Normalization layer and a Rectified Linear Unit active layer;
will be provided withAnd inputting the generation network to obtain the predicted value of the two-dimensional image group containing the motion instruction vector.
S3, sort the pixel values of the two-dimensional grabbing quality image G_θ in the predicted image group by magnitude and select the ten pixel points with the largest values; these are the predictions with the highest grabbing success rate, G_θ(p_n). According to the coordinates of these pixels, read the corresponding values in the two-dimensional angle image A_θ, the two-dimensional grabbing width image W_θ, and the two-dimensional grabbing priority image O_θ to obtain the grabbing angle prediction A_θ(p_n), the grabbing width prediction W_θ(p_n), and the grabbing order prediction O_θ(p_n);
S4, sort the grabbing order predictions O_θ(p_n) and select the prediction with the highest grabbing priority, at coordinate p*; the grabbing information corresponding to the pixel coordinate p* is the optimal motion instruction vector, i.e. g* = {p*, A_θ(p*), W_θ(p*)};
S5, parse the obtained optimal motion instruction vector; after parsing, through the coordinate transformation of the wrist camera and the coordinate transformation between the mechanical arm wrist and base, obtain the grabbing coordinates, grabbing angle, and grabbing width of the target object in the mechanical arm base coordinate system, i.e. the mechanical arm grabbing parameters.
S6, repeat steps S1-S5 until all objects have been grabbed (an end-to-end sketch of this loop follows).
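A minimal sketch of the online loop S1-S6, assuming hypothetical camera and arm interfaces (camera.read, arm.grasp, arm.workspace_empty) and reusing the GraspNet, select_grasp, and pixel_to_base helpers sketched above; all interface names are illustrative.

```python
import torch

def run_grasping(camera, arm, net, K, T_wrist_cam, T_base_wrist, n=10):
    """S1-S6: perceive, predict, select, transform, execute, repeat."""
    while not arm.workspace_empty():                      # hypothetical stop test
        rgb, depth = camera.read()                        # S1: RGB-D scene image
        with torch.no_grad():                             # S2: predicted image group
            maps = net(rgb.unsqueeze(0), depth.unsqueeze(0))
        G, A, W, O = [m.squeeze(0).cpu().numpy() for m in maps]
        r, c, angle, width = select_grasp(G, A, W, O, n)  # S3-S4: best candidate
        xyz = pixel_to_base(c, r, float(depth[0, r, c]), K,
                            T_wrist_cam, T_base_wrist)    # S5: base-frame coords
        arm.grasp(xyz, angle, width)                      # execute and repeat (S6)
```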
As a specific preference for implementing the technical solution of the present invention, before step S2, the method comprises:
s01 creating a data set G for training a network based on the existing data settrain(ii) a The G istrainThe data set comprises a working scene image, effective capture frame information and a segmentation image only containing a topmost object;
s02, mapping the effective grabbing frame information to a 300X 300 two-dimensional image to obtain a two-dimensional grabbing quality image, a two-dimensional grabbing angle image and a two-dimensional grabbing width image, and combining GtrainThe image group is divided into two-dimensional image groups, wherein the two-dimensional image groups only contain the top-layer object;
s03, constructing a generating neural network based on the attention mechanism by constructing an attention mechanism module;
s04 Using data set GtrainTraining the generated neural network based on the attention mechanism by the two-dimensional image group, inputting an RGB-D image without grabbing information, outputting a two-dimensional image group containing grabbing information, and obtaining the trained generated neural network based on the attention mechanism;
as a specific preferred implementation of the technical solution of the present invention, as shown in fig. 2, a robot grasping parameter learning system based on an attention mechanism generating network includes:
Offline learning: the attention-based generative neural network is trained continuously on the data set G_train, thereby obtaining the mechanical arm grabbing parameter prediction model;
Online learning: work scene images of the actual situation are acquired through on-site perception of the actual work scene and input into the mechanical arm grabbing parameter prediction model to obtain the grabbing parameters in the actual scene, thereby realizing grabbing by the mechanical arm in the actual scene.
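A minimal sketch of the offline learning stage, assuming the label maps come from build_label_maps above, a smooth L1 loss, and a standard dataloader over G_train; the loss choice and hyperparameters are illustrative assumptions, not specified by the patent.

```python
import torch
import torch.nn.functional as F

def train_offline(net, loader, epochs=50, lr=1e-3):
    """Fit the network on (rgb, depth, G, A, W, O) tuples drawn from G_train."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        for rgb, depth, G, A, W, O in loader:
            pG, pA, pW, pO = net(rgb, depth)  # predicted two-dimensional image group
            loss = (F.smooth_l1_loss(pG, G) + F.smooth_l1_loss(pA, A)
                    + F.smooth_l1_loss(pW, W) + F.smooth_l1_loss(pO, O))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net
```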
It should be understood that the above description of specific embodiments of the present invention is only intended to illustrate its technical route and features, and to enable those skilled in the art to understand and implement it; the present invention is not limited to the above specific embodiments. All changes and modifications falling within the scope of the appended claims are intended to be embraced therein.
Claims (6)
1. The mechanical arm grabbing parameter estimation method based on the attention mechanism generation type network is characterized by comprising the following steps:
S1, acquiring the work scene image of the mechanical arm in the current state using an RGB-D camera, the work scene image comprising an RGB image I_rgb, a depth image I_depth, and a work scene reference coordinate system;
S2, inputting each work scene image I into a trained attention-based generative neural network to generate a predicted two-dimensional image group containing motion instruction vectors, the predicted image group containing at least one two-dimensional grabbing quality image G_θ, one two-dimensional grabbing angle image A_θ, one two-dimensional grabbing width image W_θ, and one two-dimensional grabbing priority image O_θ, which respectively contain the grabbing success rate information, grabbing angle information, jaw opening width information, and grabbing order information for the mechanical arm grabbing an object;
S3, sorting the pixel values of the two-dimensional grabbing quality image G_θ in the predicted image group by magnitude and selecting the n pixel points with the largest values, these being the predictions with the highest grabbing success rate G_θ(p_n); according to the coordinates of these pixels, reading the corresponding values in the two-dimensional angle image A_θ, the two-dimensional grabbing width image W_θ, and the two-dimensional grabbing priority image O_θ to obtain the grabbing angle prediction A_θ(p_n), the grabbing width prediction W_θ(p_n), and the grabbing order prediction O_θ(p_n), where p_n denotes the coordinate of the pixel ranked n-th when the pixel values of the two-dimensional grabbing quality image G_θ are arranged from largest to smallest;
S4, sorting the grabbing order predictions O_θ(p_n) and selecting the prediction with the highest grabbing priority, at coordinate p*; the grabbing information corresponding to the pixel coordinate p* is the optimal motion instruction vector, i.e. g* = {p*, A_θ(p*), W_θ(p*)};
S5, parsing the obtained optimal motion instruction vector to obtain the grabbing coordinates, grabbing angle, and grabbing width of the target object in the mechanical arm base coordinate system, i.e. the mechanical arm grabbing parameters.
2. The method for estimating manipulator grasping parameters based on attention mechanism generating network according to claim 1, wherein step S2 comprises:
S21, training the attention-based generative neural network on an existing data set;
S22, preprocessing each work scene image to obtain 300×300-pixel work scene images;
S23, inputting the image feature vectors into the trained attention-based generative neural network;
S24, the attention-based generative neural network outputting the predicted two-dimensional image group containing the motion instruction vectors, i.e. the predicted parameters including the grabbing success probability.
3. The method for estimating manipulator grasping parameters based on an attention mechanism generating network according to claim 1 or 2, wherein step S3 includes:
S31, sorting the pixel values of the two-dimensional grabbing quality image in the predicted image group, wherein the magnitude of each pixel value in the two-dimensional grabbing quality image represents the success rate of a grab centered on that point;
S32, selecting the coordinates of the n largest grabbing success rate predictions as grabbing center coordinates;
S33, obtaining, from the predicted image group, the grabbing angle pixel value, grabbing width pixel value, and grabbing priority pixel value corresponding to each grabbing center coordinate;
S34, parsing the grabbing angle, grabbing width, and grabbing order information from the grabbing angle, grabbing width, and grabbing priority pixel values.
4. The method for estimating grabbing parameters of a mechanical arm based on an attention mechanism generating network as claimed in any one of claims 1 to 3, wherein step S5 includes:
S51, parsing the obtained optimal motion instruction vector;
S52, converting the parsed data into the wrist camera coordinate system;
S53, transforming the converted camera coordinates into the mechanical arm base coordinate system;
S54, inputting the coordinates obtained from the base coordinate transformation, together with the grabbing width and grabbing angle information, into the mechanical arm control system to execute the grab.
5. The mechanical arm grabbing parameter estimation method based on the attention mechanism generation network as claimed in claim 2, wherein: the attention mechanism-based training method of the generative neural network comprises the following steps:
S01, creating a data set G_train for training the network based on an existing data set; the G_train data set comprises work scene images, effective grabbing frame information, and segmented images containing only the topmost objects, where the G_train segmented image containing only the topmost object serves as the two-dimensional grabbing priority image;
S02, mapping the effective grabbing frame information onto 300×300 two-dimensional images to obtain the two-dimensional grabbing quality image, the two-dimensional grabbing angle image, and the two-dimensional grabbing width image, and combining them with the G_train segmented image containing only the topmost object to construct the two-dimensional image group;
S03, constructing the attention-based generative neural network by building the attention mechanism modules;
S04, training the attention-based generative neural network with the data set G_train and the two-dimensional image groups: inputting an RGB-D image without grabbing information and outputting a two-dimensional image group containing grabbing information, obtaining the trained attention-based generative neural network.
6. The method for estimating mechanical arm grabbing parameters based on an attention-mechanism generative network according to claim 5, wherein the attention-based generative neural network comprises:
a feature extraction part, an attention mechanism part, and a generation network part;
Feature extraction part:
at this stage, the feature extraction network consists of one convolution layer with a 9×9 kernel and two convolution layers with 4×4 kernels, each convolution layer followed by a Batch Normalization layer and a Rectified Linear Unit activation layer;
the cropped 300×300 RGB image I_rgb and depth image I_depth undergo feature fusion to obtain the fusion feature map I_fusion; I_fusion is input into the feature extraction network, and feature extraction yields the feature map I_output1;
Attention mechanism part:
the attention mechanism network consists of five attention modules, each composed of a residual part, a Squeeze part, and an Excitation part;
the residual part is divided into a direct mapping and a residual mapping, the direct mapping applying a 1×1 convolution kernel to I_output1 to give the direct mapping result h(I_output1), and the residual mapping consisting of two convolution layers with 3×3 kernels, each followed by a Batch Normalization layer, with a Rectified Linear Unit activation layer after the first Batch Normalization layer; I_output1 yields R(I_output1) after the residual mapping;
the Squeeze part is realized by introducing Global Average Pooling and serves to obtain the global information embedding, i.e. a feature vector, for each channel of the feature map; supposing u_c is a W×H feature map of a C-channel input, the feature after Squeeze is

z_c = F_sq(u_c) = (1/(W×H)) Σ_{i=1..W} Σ_{j=1..H} u_c(i,j)   (1)

the Excitation part learns a weight for each channel from z_c and is composed of a gate mechanism of two fully connected layers; the gating unit s_c is a feature vector of size 1×1 with C channels, and s_c is computed as:
s_c = F_ex(z_c, w) = σ(g(z_c, w)) = σ(w_2 δ(w_1 z_c))   (2)
where σ is the sigmoid activation function, δ is the ReLU activation function, w_1 ∈ R^{γ×C} and w_2 ∈ R^{C×γ} are the weights of the two fully connected layers, and γ is the number of nodes of the hidden layer;
R(I_output1) is passed through the Squeeze part and the Excitation part in turn to obtain the channel weights s_c; the channel-weighted residual feature is then fused with the direct mapping result h(I_output1) to obtain the attention module output I_output2;
feeding I_output1 through the five serially connected attention modules yields the output of the attention mechanism network part;
Generation network part:
the generation network consists of two deconvolution layers with 4×4 kernels and one deconvolution layer with a 9×9 kernel, both 4×4 deconvolution layers being followed by a Batch Normalization layer and a Rectified Linear Unit activation layer; the output of the attention mechanism part is input into the generation network to obtain the predicted two-dimensional image group containing the motion instruction vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210387024.1A CN114782347A (en) | 2022-04-13 | 2022-04-13 | Mechanical arm grabbing parameter estimation method based on attention mechanism generation type network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114782347A true CN114782347A (en) | 2022-07-22 |
Family
ID=82429469
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210387024.1A (Pending) | Mechanical arm grabbing parameter estimation method based on attention mechanism generation type network | 2022-04-13 | 2022-04-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114782347A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115972198A (en) * | 2022-12-05 | 2023-04-18 | 无锡宇辉信息技术有限公司 | Mechanical arm visual grabbing method and device under incomplete information condition |
CN115972198B (en) * | 2022-12-05 | 2023-10-10 | 无锡宇辉信息技术有限公司 | Mechanical arm visual grabbing method and device under incomplete information condition |
CN117549307A (en) * | 2023-12-15 | 2024-02-13 | 安徽大学 | Robot vision grabbing method and system in unstructured environment |
CN117549307B (en) * | 2023-12-15 | 2024-04-16 | 安徽大学 | Robot vision grabbing method and system in unstructured environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |