CN116128960A - Automatic workpiece grabbing method, system and device based on machine learning - Google Patents
Automatic workpiece grabbing method, system and device based on machine learning
- Publication number
- CN116128960A CN116128960A CN202310253672.2A CN202310253672A CN116128960A CN 116128960 A CN116128960 A CN 116128960A CN 202310253672 A CN202310253672 A CN 202310253672A CN 116128960 A CN116128960 A CN 116128960A
- Authority
- CN
- China
- Prior art keywords
- workpiece
- neural network
- machine learning
- convolutional neural
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- 238000000034 method Methods 0.000 title claims abstract description 32
- 238000010801 machine learning Methods 0.000 title claims abstract description 25
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 36
- 238000001514 detection method Methods 0.000 claims abstract description 27
- 238000013528 artificial neural network Methods 0.000 claims abstract description 18
- 238000012549 training Methods 0.000 claims abstract description 15
- 238000007781 pre-processing Methods 0.000 claims abstract description 14
- 238000012795 verification Methods 0.000 claims abstract description 5
- 230000006870 function Effects 0.000 claims description 15
- 230000036544 posture Effects 0.000 claims description 6
- 230000005284 excitation Effects 0.000 claims description 3
- 238000003709 image segmentation Methods 0.000 claims description 3
- 238000002372 labelling Methods 0.000 claims description 3
- 238000011176 pooling Methods 0.000 claims description 3
- 230000010354 integration Effects 0.000 claims description 2
- 238000012545 processing Methods 0.000 claims description 2
- 238000010200 validation analysis Methods 0.000 claims description 2
- 238000005516 engineering process Methods 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 238000002790 cross-validation Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000000691 measurement method Methods 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 238000011897 real-time detection Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1612—Programme controls characterised by the hand, wrist, grip control
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
- B25J9/1697—Vision controlled systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mechanical Engineering (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Robotics (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Orthopedic Medicine & Surgery (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical fields of machine learning, image recognition and binocular vision, and discloses a method, a system and a device for automatically grabbing workpieces based on machine learning, aiming at the problems of existing automatic workpiece grabbing in application: complex system composition, inaccurate workpiece category recognition, difficulty in expanding the types of grabbed targets, low efficiency and poor accuracy. The device comprises the system and a mechanical arm control assembly; the system comprises a binocular vision assembly, a data preprocessing framework, a deep convolutional neural network and an information flow. In the method, the target detection information output by the deep convolutional neural network is fused with the image information of the binocular vision assembly through the information flow, reducing the complexity of the system; the data preprocessing framework randomly extracts the data set to form a training set and a verification set, completing the management of the workpiece-library types; and the deep neural network recognizes multiple types of workpieces, so that the system is more efficient, easier to expand and more accurate in classification and positioning, and the automatic workpiece grabbing task is completed.
Description
This application is a divisional application of the patent application titled "Automatic workpiece grabbing system and method based on machine learning"; the filing date of the original application is 2021-09-17 and its application number is 202111095314.0.
Technical Field
The invention belongs to the technical field of machine learning, image recognition and binocular vision, and particularly relates to a workpiece automatic grabbing method, system and device based on machine learning.
Background
The task of automatic workpiece grabbing technology is to identify tools of the specified types, provide the mechanical grabbing arm with depth information and grabbing-point information of the target workpieces, effectively identify all workpiece types in a workpiece library, and manage the identified types as required.
At present, automatic workpiece grabbing technology mainly relies on target feature extraction and feature matching for classification and positioning, while the depth information of the workpiece is generally measured by laser ranging or multi-image triangulation, which has the following defects:
First, some automatic grabbing technologies use an active visual scene measurement method for the spatial positioning of the workpiece: the vision component constructs a three-dimensional space by emitting laser outward, and the spatial position of the workpiece is obtained from the point cloud image formed by the laser sensor, so the system is complex and the hardware requirements are high;
Second, target detection of the workpiece uses feature matching and lightweight neural networks, so the effect of target classification and positioning is poor, expanding the types of identifiable workpieces is difficult, and the multi-target detection capability is weak;
Third, workpieces are generally gripped by geometric guiding methods, so the gripping efficiency is low under dynamic conditions.
Disclosure of Invention
Aiming at the problems of existing automatic workpiece grabbing in application — complex system composition, inaccurate workpiece category identification, difficulty in expanding the types of grabbed targets, low efficiency and poor accuracy — the invention provides an automatic workpiece grabbing method, system and device based on machine learning.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the invention provides a machine learning-based automatic workpiece grabbing method, which comprises the following steps of:
step 1, image acquisition and preprocessing;
collecting images of the workpiece in multiple postures under different environments; obtaining a data set and labels from the posture images;
step 2, training a deep convolutional neural network Net 1;
training a deep convolutional neural network by adopting the data set and the label;
obtaining parameters of the deep convolutional neural network by taking the minimum value of the loss function of the deep convolutional neural network as a target;
updating the weight of the deep convolutional neural network by adopting an Adam optimizer to obtain an updated weight;
calculating a loss variation value of the loss function according to the updated weight and the parameter;
if the loss change value is smaller than a set threshold value, taking the deep convolutional neural network corresponding to the loss value as the converged deep convolutional neural network;
step 3, verifying the weights of the deep convolutional neural network Net1 obtained by training in step 2 using K-fold cross-validation;
step 4, detecting a workpiece target in real time;
inputting the real-time video image of the workpiece into the deep convolutional neural network Net1 verified in step 3, and outputting target detection information;
step 5, performing space positioning on the workpiece;
calculating spatial depth information according to the target detection information;
step 6, completing a grabbing task on the corresponding workpiece;
and grabbing corresponding workpieces in real time according to the space depth information.
Optionally, the specific process of image acquisition in the step 1 is: the binocular vision assembly is provided with a left camera and a right camera, and the left camera is used for collecting images of various postures of a workpiece in different environments.
Optionally, the preprocessing includes:
marking different targets in the acquired posture images by adopting an image segmentation method to obtain a data set and corresponding labels; the labels are workpiece category information; the data set consists of images acquired for the workpiece-library types; the data set includes a training set and a validation set.
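The random extraction of the labelled data into a training set and a validation set, as described above, can be sketched in Python. This is an illustrative helper, not the patent's actual preprocessing framework; the file and class names are invented for the example.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Randomly extract a training set and a verification set
    from the labelled samples (illustrative helper only)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Each sample pairs an image identifier with its workpiece-category label.
dataset = [("img_%03d.png" % i, "class_%d" % (i % 4)) for i in range(100)]
train_set, val_set = split_dataset(dataset)
```

Fixing the seed makes the split reproducible across training runs.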
Optionally, the input layer of the deep convolutional neural network Net1 takes a visible light image of size W×H; convolution operations are performed with several types of convolution kernels; the feature maps are downsampled by adjusting the convolution stride; the excitation function is the ReLU function; and target detection is completed at three scales through a feature pyramid network (FPN) and convolutional layers. This process maps the image to a one-dimensional vector of target detection information.
Optionally, the loss function is:
L_c(o, c, O, C, l, g) = λ1·L_conf(o, c) + λ2·L_cla(O, C) + λ3·L_loc(l, g)
wherein L_conf(o, c) is the confidence loss; L_cla(O, C) is the classification loss; L_loc(l, g) is the localization loss; λ1 is the balance coefficient of the confidence loss; λ2 is the balance coefficient of the classification loss; λ3 is the balance coefficient of the localization loss.
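The weighted loss above is a plain linear combination; a direct transcription into code (the component loss values and λ coefficients below are illustrative, not values from the patent):

```python
def total_loss(l_conf, l_cla, l_loc, lam1=1.0, lam2=1.0, lam3=1.0):
    """L_c = λ1·L_conf + λ2·L_cla + λ3·L_loc: weighted sum of the
    confidence, classification and localization losses."""
    return lam1 * l_conf + lam2 * l_cla + lam3 * l_loc

# Example with assumed balance coefficients:
loss = total_loss(0.5, 0.3, 0.2, lam1=1.0, lam2=0.5, lam3=2.0)  # 0.5 + 0.15 + 0.4 = 1.05
```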
Optionally, updating the weight of the deep convolutional neural network by adopting an Adam optimizer to obtain an updated weight, which specifically comprises:
m_t = β1·m_(t-1) + (1 − β1)·g(ω_t)
v_t = β2·v_(t-1) + (1 − β2)·g(ω_t)·g(ω_t)
m̂_t = m_t / (1 − β1^t), v̂_t = v_t / (1 − β2^t)
ω_(t+1) = ω_t − α·m̂_t / (√v̂_t + ε)
wherein m_t is the first-order momentum of the weights; m_(t-1) is the first-order momentum before the weight update; β1 (0.9) and β2 (0.999) are attenuation control coefficients; g(ω_t) is the gradient of the loss with respect to the parameter ω_t at time t; v_t is the second-order momentum and v_(t-1) the second-order momentum before the update; α is the learning rate; ε (10^-7) is a small value that prevents the denominator from being zero; ω_t is the weight parameter before the update and ω_(t+1) the updated weight parameter; m̂_t and v̂_t are computed intermediate values.
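These momentum updates are the standard Adam rules. A scalar sketch, including the usual bias correction and weight step that the standard Adam formulation uses (the toy objective and learning rate are assumptions for the example):

```python
import math

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update on a scalar weight: first/second-order momenta,
    bias correction, then the weight step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Toy example: minimize (w - 3)^2, whose gradient is 2*(w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 3001):
    w, m, v = adam_step(w, 2 * (w - 3), m, v, t, lr=0.05)
```

After enough iterations the weight settles near the minimizer w = 3.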
Optionally, calculating spatial depth information according to the target detection information specifically includes:
and processing by using a correlation matching algorithm and a triangulation principle, and calculating the spatial depth information of the workpiece.
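For a rectified stereo pair, the triangulation principle reduces to the standard relation Z = f·B/d, where f is the focal length in pixels, B the camera baseline and d the disparity between the matched left and right image columns. A minimal sketch (the calibration numbers below are illustrative, not the patent's):

```python
def stereo_depth(focal_px, baseline_m, x_left, x_right):
    """Depth of a matched point from a rectified stereo pair: Z = f * B / d."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity

# 700 px focal length, 12 cm baseline, 30 px disparity -> 2.8 m
depth_m = stereo_depth(700.0, 0.12, x_left=420.0, x_right=390.0)
```

A larger disparity means the point is closer; disparity shrinks toward zero as depth grows, which is why stereo depth accuracy degrades with distance.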
The invention also provides a machine learning-based workpiece automatic grabbing system, which comprises a binocular vision component, a data preprocessing frame, a deep convolutional neural network and an information flow;
the binocular vision assembly is provided with a left high-definition camera and a right high-definition camera and is used for collecting images of a workpiece in various postures under different environments;
the data preprocessing framework is used for:
image data integration is carried out on multiple types of workpieces, and an image data set is established for each type of workpiece by using a manual labeling mode;
randomly extracting the data set to form a training set and a verification set, and managing the class of the workpiece library;
the deep convolutional neural network detects a workpiece target;
the information flow fuses the target detection information obtained by the deep convolutional neural network with the image information of the left and right cameras of the binocular vision component, and performs spatial positioning on the workpiece to obtain spatial depth information; the spatial depth information is used to grasp the corresponding workpiece in real time.
The invention also provides a machine learning-based automatic workpiece grabbing device, which comprises a mechanical arm control assembly and a machine learning-based automatic workpiece grabbing system;
the mechanical arm control assembly is used for grabbing corresponding workpieces in real time according to the space depth information.
The information flow applies the target detection information obtained by the deep convolutional neural network to the binocular vision component's spatial positioning of the workpiece, solving the problems of inaccurate workpiece type identification, difficulty in expanding the grabbed target types, low efficiency and poor accuracy. By fusing the target detection information output by the deep convolutional neural network with the image information of the left and right cameras of the binocular vision component through the information flow, the complexity of the workpiece grabbing system is reduced and the system is better suited to complex conditions.
Compared with the prior art, the invention has the following advantages:
1. the invention provides a machine learning-based workpiece automatic grabbing system, which comprises a binocular vision component, a data preprocessing frame, a deep convolutional neural network and an information flow, wherein the binocular vision component is used for preprocessing data; according to the system, the target detection information output by the deep convolutional neural network is fused with the image information of the left camera and the right camera of the binocular vision component through the information flow, so that the complexity of the workpiece grabbing system is successfully reduced, and the system can be more suitable for complex conditions.
2. The invention provides a machine learning-based automatic workpiece grabbing method that combines a data preprocessing framework with a deep convolutional neural network: the data preprocessing framework completes the random extraction of the data set into a training set and a verification set and manages the workpiece-library classes, while the deep neural network identifies multiple classes of workpieces. The automatic workpiece grabbing task is thus completed with higher efficiency, a system that is easier to expand, and more accurate classification and positioning.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
A machine learning-based automatic workpiece grabbing system comprises a binocular vision component, a data preprocessing frame, a deep convolutional neural network and an information flow.
The binocular vision assembly is provided with left and right high-definition cameras and is used for collecting images of a workpiece in different environments and in various postures.
The data preprocessing framework first integrates image data for multiple types of workpieces and establishes an image data set for each workpiece type by manual labeling, which enables the deep neural network to accurately identify multiple types of workpieces. The framework completes the random extraction of the data set into training and verification sets, and also manages the workpiece-library types.
And the deep convolution neural network performs workpiece target detection.
The information flow applies the target detection information obtained by the deep convolutional neural network to the spatial positioning of the binocular vision component on the workpiece, and solves the problems that the workpiece type identification is inaccurate, the grabbed target type is difficult to expand, the efficiency is low, the accuracy is low and the like.
The invention is described with reference to FIG. 1. The process of spatially positioning a workpiece by the vision component in the automatic workpiece grabbing system is decomposed into two problems: detecting the workpiece target in real-time video, and acquiring the depth information of the workpiece with the binocular vision component. A deep convolutional neural network is constructed to perform real-time target detection on the workpiece in the real-time video of the left camera; the detection information is transmitted to the binocular vision component through the information flow; and the component combines the target detection information, performs correlation matching and triangulation on the real-time video images of the left and right cameras, and outputs the real-time spatial position of the workpiece under detection. The details cover the construction of the deep neural network, the optimization of network training, and the spatial positioning of workpieces.
(1) A deep neural network for multi-type workpiece recognition is trained, the network is denoted as Net1, and the network type is a convolutional neural network.
(1-1) Setting the identification network Net1: the network is a CNN whose input layer takes a visible light image of size W×H; convolution operations are performed with several types of convolution kernels; the feature maps are downsampled by adjusting the convolution stride; the excitation function is the ReLU function; and target detection is completed at three scales through a feature pyramid network (FPN) and convolutional layers. The physical meaning of this process is mapping the image to a one-dimensional vector of target detection information.
(1-2) The binocular vision component is provided with left and right cameras; the left camera collects images of the workpiece in various postures under different environments, and different targets are marked by an image segmentation method, resulting in a dataset D = {d_1, d_2, ..., d_n} and corresponding labels S = {[s_11, s_12, ...], [s_21, s_22, ...], ..., [s_n1, s_n2, ...]}.
(1-3) For real classification and localization of workpiece targets, the Net1 deep convolutional neural network loss function is defined as L_c(o, c, O, C, l, g) = λ1·L_conf(o, c) + λ2·L_cla(O, C) + λ3·L_loc(l, g), where the three components are the confidence, classification and localization losses respectively, and λ1, λ2, λ3 are the balance coefficients between the three partial losses. Training of the deep neural network converges toward minimizing L_c, computing the parameters that optimize the convolutional neural network (minimizing the loss via its gradient with respect to the convolutional neural network parameters).
(1-4) updating the weight of the convolutional neural network by adopting an Adam optimizer, wherein the Adam optimizer can adapt to the learning rate, and the parameter updating effect is further improved:
m_t = β1·m_(t-1) + (1 − β1)·g(ω_t)
v_t = β2·v_(t-1) + (1 − β2)·g(ω_t)·g(ω_t)
m̂_t = m_t / (1 − β1^t), v̂_t = v_t / (1 − β2^t)
ω_(t+1) = ω_t − α·m̂_t / (√v̂_t + ε)
wherein m_t is the first-order momentum of the weights; m_(t-1) is the first-order momentum before the weight update; β1 (0.9) and β2 (0.999) are attenuation control coefficients; g(ω_t) is the gradient of the loss with respect to the parameter ω_t at time t; v_t is the second-order momentum and v_(t-1) the second-order momentum before the update; α is the learning rate; ε (10^-7) is a small value that prevents the denominator from being zero; ω_t is the weight parameter before the update and ω_(t+1) the updated weight parameter; m̂_t and v̂_t are computed intermediate values.
(1-5) Repeat steps (1-3) to (1-4); when the change in the loss function remains smaller than the threshold η over N consecutive optimizations, the Net1 network has converged.
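The convergence criterion in step (1-5) — the loss change staying below a threshold η for N consecutive optimizations — can be sketched as follows (function and variable names are illustrative):

```python
def has_converged(loss_history, eta=1e-4, patience=5):
    """True when the change in loss stayed below eta for `patience`
    consecutive optimization steps (the patent's N and η)."""
    if len(loss_history) < patience + 1:
        return False
    recent = loss_history[-(patience + 1):]
    return all(abs(recent[i + 1] - recent[i]) < eta
               for i in range(patience))

still_improving = [1.0, 0.5, 0.3, 0.2, 0.15, 0.12, 0.1]   # loss still moving
flat = [0.1] * 10                                          # loss has plateaued
```

Here `has_converged(still_improving)` is false while `has_converged(flat)` is true.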
(2) The weights of the iterated deep neural network are optimized using a K-fold cross-validation strategy.
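K-fold cross-validation partitions the data into K folds, training on K−1 folds and validating on the remaining one in turn. A minimal index-level sketch (the patent does not specify K; 5 below is an assumption):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of the k folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n_samples
        val = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, val

folds = list(k_fold_indices(10, 5))  # five (train, val) splits of 8 + 2 indices
```

Each sample appears in exactly one validation fold, so every data point contributes to the weight verification.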
(3) Under dynamic conditions, the real-time video image acquired by the left camera of the binocular vision component is input into the deep convolutional neural network Net1, real-time target detection is performed on the environment where the workpiece is located, and target detection information is output.
(4) The binocular vision component acquires target detection information of a workpiece to be grabbed through an information transmission flow, and image information acquired by the left camera and the right camera of the binocular vision component is processed by utilizing a correlation matching algorithm and a triangulation principle to calculate the spatial depth information of the workpiece.
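Correlation matching along a rectified scanline finds, for a pixel in the left image, the best-matching column in the right image; the resulting disparity then feeds the triangulation relation Z = f·B/d. A toy sketch using a sum-of-absolute-differences cost — a deliberate simplification, since the patent does not specify its matching cost:

```python
def best_match(row_left, row_right, x_left, win=3):
    """Return the right-image column whose window best matches the
    left-image window centred at x_left (SAD cost, rectified rows)."""
    half = win // 2
    patch = row_left[x_left - half : x_left + half + 1]
    best_x, best_cost = None, float("inf")
    for x in range(half, len(row_right) - half):
        cand = row_right[x - half : x + half + 1]
        cost = sum(abs(a - b) for a, b in zip(patch, cand))
        if cost < best_cost:
            best_cost, best_x = cost, x
    return best_x

# A feature at column 4 in the left row appears at column 2 on the right:
row_l = [0, 0, 10, 20, 30, 20, 10, 0, 0, 0]
row_r = [10, 20, 30, 20, 10, 0, 0, 0, 0, 0]
disparity = 4 - best_match(row_l, row_r, x_left=4)  # 2
```

Production stereo pipelines use normalized cross-correlation or semi-global matching rather than raw SAD, but the window-search structure is the same.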
(5) And in a state of real-time detection, the space depth information of the workpiece is transmitted to the mechanical arm control assembly through an information transmission flow, so that real-time grabbing of the corresponding workpiece is completed.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.
Claims (9)
1. The automatic workpiece grabbing method based on machine learning is characterized by comprising the following steps of:
step 1, image acquisition and preprocessing;
collecting images of the workpiece in multiple postures under different environments; obtaining a data set and labels from the posture images;
step 2, training a deep convolutional neural network Net 1;
training a deep convolutional neural network by adopting the data set and the label;
obtaining parameters of the deep convolutional neural network by taking the minimum value of the loss function of the deep convolutional neural network as a target;
updating the weight of the deep convolutional neural network by adopting an Adam optimizer to obtain an updated weight;
calculating a loss variation value of the loss function according to the updated weight and the parameter;
if the loss change value is smaller than a set threshold value, taking the deep convolutional neural network corresponding to the loss value as the converged deep convolutional neural network;
step 3, verifying the weights of the deep convolutional neural network Net1 obtained by training in step 2 using K-fold cross-validation;
step 4, detecting a workpiece target in real time;
inputting the real-time video image of the workpiece into the deep convolutional neural network Net1 verified in step 3, and outputting target detection information;
step 5, performing space positioning on the workpiece;
calculating spatial depth information according to the target detection information;
step 6, completing a grabbing task on the corresponding workpiece;
and grabbing corresponding workpieces in real time according to the space depth information.
2. The automatic workpiece grabbing method based on machine learning according to claim 1, wherein the specific process of image acquisition in step 1 is as follows: the binocular vision assembly is provided with a left camera and a right camera, and the left camera is used for collecting images of various postures of a workpiece in different environments.
3. The machine learning based automatic workpiece gripping method of claim 1, wherein the preprocessing comprises:
marking different targets in the acquired posture images by adopting an image segmentation method to obtain a data set and corresponding labels; the labels are workpiece category information; the data set consists of images acquired for the workpiece-library types; the data set includes a training set and a validation set.
4. The automatic workpiece grabbing method based on machine learning according to claim 1, wherein the input layer of the deep convolutional neural network Net1 takes a visible light image of size W×H; convolution operations are performed with several types of convolution kernels; the feature maps are downsampled by adjusting the convolution stride; the excitation function is the ReLU function; and target detection is completed at three scales through a feature pyramid network (FPN) and convolutional layers.
5. The machine learning based automatic workpiece gripping method of claim 1, wherein the loss function is:
L_c(o, c, O, C, l, g) = λ1·L_conf(o, c) + λ2·L_cla(O, C) + λ3·L_loc(l, g)
wherein L_conf(o, c) is the confidence loss; L_cla(O, C) is the classification loss; L_loc(l, g) is the localization loss; λ1 is the balance coefficient of the confidence loss; λ2 is the balance coefficient of the classification loss; λ3 is the balance coefficient of the localization loss.
6. The machine learning based automatic workpiece grabbing method of claim 1, wherein the weight of the deep convolutional neural network is updated by an Adam optimizer to obtain an updated weight, and the method specifically comprises:
m_t = β1·m_(t-1) + (1 − β1)·g(ω_t)
v_t = β2·v_(t-1) + (1 − β2)·g(ω_t)·g(ω_t)
m̂_t = m_t / (1 − β1^t), v̂_t = v_t / (1 − β2^t)
ω_(t+1) = ω_t − α·m̂_t / (√v̂_t + ε)
wherein m_t is the first-order momentum of the weights; m_(t-1) is the first-order momentum before the weight update; β1 (0.9) and β2 (0.999) are attenuation control coefficients; g(ω_t) is the gradient of the loss with respect to the parameter ω_t at time t; v_t is the second-order momentum and v_(t-1) the second-order momentum before the update; α is the learning rate; ε (10^-7) is a small value that prevents the denominator from being zero; ω_t is the weight parameter before the update and ω_(t+1) the updated weight parameter; m̂_t and v̂_t are computed intermediate values.
7. The machine learning based automatic workpiece gripping method according to claim 1, wherein the calculating of the spatial depth information based on the target detection information specifically includes:
and processing by using a correlation matching algorithm and a triangulation principle, and calculating the spatial depth information of the workpiece.
8. The automatic workpiece grabbing system based on machine learning is characterized by comprising a binocular vision component, a data preprocessing frame, a deep convolutional neural network and an information flow;
the binocular vision assembly is provided with a left high-definition camera and a right high-definition camera and is used for collecting images of a workpiece in various postures under different environments;
the data preprocessing framework is used for:
integrating image data for multiple types of workpieces, and establishing an image data set for each type of workpiece by manual labeling;
randomly sampling the data sets to form a training set and a verification set, and managing the workpiece library by class;
the deep convolutional neural network detects workpiece targets;
the information flow fuses the target detection information output by the deep convolutional neural network with the image information of the left and right cameras of the binocular vision component, and spatially locates the workpiece to obtain spatial depth information; the spatial depth information is used to grab the corresponding workpiece in real time.
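The random training/verification split performed by the preprocessing framework above can be sketched as follows; the 80/20 ratio and fixed seed are assumptions for illustration, not taken from the claim:

```python
import random

def split_dataset(samples, train_frac=0.8, seed=0):
    """Randomly split labelled workpiece images into a training set and a
    verification set. train_frac and seed are illustrative assumptions."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, val = split_dataset(range(100))
print(len(train), len(val))
```

Seeding the shuffle makes the split reproducible across training runs, which matters when comparing network weights trained on the same workpiece library.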
9. An automatic workpiece grabbing device based on machine learning, characterized by comprising a robotic arm control assembly and the machine learning based automatic workpiece grabbing system of claim 8;
the robotic arm control assembly is used for grabbing the corresponding workpiece in real time according to the spatial depth information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310253672.2A CN116128960A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing method, system and device based on machine learning |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310253672.2A CN116128960A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing method, system and device based on machine learning |
CN202111095314.0A CN113808197A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing system and method based on machine learning |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111095314.0A Division CN113808197A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing system and method based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116128960A true CN116128960A (en) | 2023-05-16 |
Family
ID=78895844
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111095314.0A Pending CN113808197A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing system and method based on machine learning |
CN202310253672.2A Withdrawn CN116128960A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing method, system and device based on machine learning |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111095314.0A Pending CN113808197A (en) | 2021-09-17 | 2021-09-17 | Automatic workpiece grabbing system and method based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN113808197A (en) |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108161931A (en) * | 2016-12-07 | 2018-06-15 | 广州映博智能科技有限公司 | The workpiece automatic identification of view-based access control model and intelligent grabbing system |
CN106874914B (en) * | 2017-01-12 | 2019-05-14 | 华南理工大学 | A kind of industrial machinery arm visual spatial attention method based on depth convolutional neural networks |
CN108171748B (en) * | 2018-01-23 | 2021-12-07 | 哈工大机器人(合肥)国际创新研究院 | Visual identification and positioning method for intelligent robot grabbing application |
CN109344772B (en) * | 2018-09-30 | 2021-01-26 | 中国人民解放军战略支援部队信息工程大学 | Ultrashort wave specific signal reconnaissance method based on spectrogram and deep convolutional network |
CN111368852A (en) * | 2018-12-26 | 2020-07-03 | 沈阳新松机器人自动化股份有限公司 | Article identification and pre-sorting system and method based on deep learning and robot |
CN111768449B (en) * | 2019-03-30 | 2024-05-14 | 北京伟景智能科技有限公司 | Object grabbing method combining binocular vision with deep learning |
CN110322515A (en) * | 2019-07-02 | 2019-10-11 | 工极智能科技(苏州)有限公司 | Workpiece identification and grabbing point extraction method based on binocular stereo vision |
CN110363253A (en) * | 2019-07-25 | 2019-10-22 | 安徽工业大学 | A kind of Surfaces of Hot Rolled Strip defect classification method based on convolutional neural networks |
CN110728223A (en) * | 2019-10-08 | 2020-01-24 | 济南东朔微电子有限公司 | Helmet wearing identification method based on deep learning |
CN110969158B (en) * | 2019-11-06 | 2023-07-25 | 中国科学院自动化研究所 | Target detection method, system and device based on underwater operation robot vision |
CN110942142B (en) * | 2019-11-29 | 2021-09-17 | 广州市百果园信息技术有限公司 | Neural network training and face detection method, device, equipment and storage medium |
CN111340132B (en) * | 2020-03-10 | 2024-02-02 | 南京工业大学 | Machine olfaction mode identification method based on DA-SVM |
CN111723782A (en) * | 2020-07-28 | 2020-09-29 | 北京印刷学院 | Deep learning-based visual robot grabbing method and system |
CN113052005B (en) * | 2021-02-08 | 2024-02-02 | 湖南工业大学 | Garbage sorting method and garbage sorting device for household service |
CN113222940B (en) * | 2021-05-17 | 2022-07-12 | 哈尔滨工业大学 | Method for automatically grabbing workpiece by robot based on RGB-D image and CAD model |
- 2021-09-17 CN CN202111095314.0A patent/CN113808197A/en active Pending
- 2021-09-17 CN CN202310253672.2A patent/CN116128960A/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
CN113808197A (en) | 2021-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109685141B (en) | Robot article sorting visual detection method based on deep neural network | |
CN107833236B (en) | Visual positioning system and method combining semantics under dynamic environment | |
CN109559320B (en) | Method and system for realizing visual SLAM semantic mapping function based on hole convolution deep neural network | |
CN108520203B (en) | Multi-target feature extraction method based on fusion of self-adaptive multi-peripheral frame and cross pooling feature | |
CN110287907B (en) | Object detection method and device | |
CN114693661A (en) | Rapid sorting method based on deep learning | |
CN110310305B (en) | Target tracking method and device based on BSSD detection and Kalman filtering | |
CN111027555B (en) | License plate recognition method and device and electronic equipment | |
CN110827321B (en) | Multi-camera collaborative active target tracking method based on three-dimensional information | |
CN107025657A (en) | A kind of vehicle action trail detection method based on video image | |
CN114677323A (en) | Semantic vision SLAM positioning method based on target detection in indoor dynamic scene | |
CN110838145A (en) | Visual positioning and mapping method for indoor dynamic scene | |
JP2024528419A (en) | Method and apparatus for updating an object detection model | |
CN105976397A (en) | Target tracking method based on half nonnegative optimization integration learning | |
CN113657423A (en) | Target detection method suitable for small-volume parts and stacked parts and application thereof | |
CN110147724B (en) | Method, apparatus, device, and medium for detecting text region in video | |
CN109544632B (en) | Semantic SLAM object association method based on hierarchical topic model | |
CN114863201A (en) | Training method and device of three-dimensional detection model, computer equipment and storage medium | |
CN113139432A (en) | Industrial packaging behavior identification method based on human body skeleton and local image | |
CN116105721B (en) | Loop optimization method, device and equipment for map construction and storage medium | |
CN117237411A (en) | Pedestrian multi-target tracking method based on deep learning | |
CN116128960A (en) | Automatic workpiece grabbing method, system and device based on machine learning | |
CN115205806A (en) | Method and device for generating target detection model and automatic driving vehicle | |
CN115273219A (en) | Yoga action evaluation method and system, storage medium and electronic equipment | |
CN112200850A (en) | ORB extraction method based on mature characteristic points |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20230516 |