CN109829387B

CN109829387B - Method for determining grabbing priority of stacked serial fruits based on depth of main fruit stalks for parallel robot

Info

Publication number: CN109829387B
Application number: CN201910011311.0A
Authority: CN
Inventors: 高国琴; 张千
Original assignee: Jiangsu University
Current assignee: Jiangsu University
Priority date: 2019-01-07
Filing date: 2019-01-07
Publication date: 2023-06-16
Anticipated expiration: 2039-01-07
Also published as: CN109829387A

Abstract

The invention discloses a method for determining the grabbing priority of stacked string fruits based on main fruit stalk depth for a parallel robot. A stereoscopic vision detection system for stacked string fruits is built around a Kinect sensor within a parallel robot fruit sorting system to acquire three-dimensional visual information of the stacked string fruits. A pre-training data set is constructed by designing a depth reference object, a main fruit stalk depth data set of the stacked string fruits is constructed, and the data sets are expanded to widen their distribution range. A main fruit stalk depth set classification model is then constructed to increase the number of main fruit stalk depth features extracted. A multi-transfer learning training strategy is designed to train the network; visual analysis and accuracy tests are performed on the network, and parameters are adjusted and training repeated until the accuracy meets the requirement, realizing depth set classification for the stacked string fruits. Finally, the grabbing priority of the stacked string fruits is accurately determined, laying a foundation for accurate, rapid and nondestructive automatic sorting of stacked string fruits by the parallel robot.

Description

Method for determining grabbing priority of stacked serial fruits based on depth of main fruit stalks for parallel robot

Technical Field

The invention relates to the field of machine vision, and in particular to a method for determining the grabbing priority of stacked string fruits based on machine vision, image processing and convolutional neural networks.

Background

In recent years, fruit yield in China has increased rapidly, and traditional manual sorting can hardly meet the requirements of modern agricultural production. Robot-based automatic fruit sorting is therefore of great significance for the automated, large-scale and precise development of agricultural production and agricultural product processing. In robot-based automatic fruit sorting, accurate grabbing detection of the fruits is a precondition for accurate, rapid and nondestructive grabbing control. Machine vision is non-contact, widely applicable and cost-effective, and is well suited to the grabbing detection problem in robotic fruit sorting. Compared with individual fruits such as apples, pears and pineapples, stacked string fruits such as grapes, longans and litchis remain difficult to detect for grabbing with machine vision, because their fruit stalks and fruit grains are irregularly distributed, the main fruit stalks have no shape or position constraints, and the fruit strings vary widely in shape. Among these difficulties, determining the grabbing priority is one of the challenging problems in machine-vision-based grabbing detection of stacked string fruits.

Disclosure of Invention

The invention aims to provide an accurate and rapid method for determining the grabbing priority of stacked serial fruits based on the depth of main fruit stalks for a parallel robot.

The technical scheme of the invention is as follows:

Step 1, constructing a Kinect-sensor-based stereoscopic vision system for stacked string fruits under a parallel robot fruit sorting system: the invention builds a stereoscopic vision detection system for stacked string fruits from a Kinect sensor and a parallel robot fruit sorting system. The Kinect sensor is placed above the detection platform of the sorting system, on the central axis of the platform, and the region in which the Kinect depth image has the lowest detection error is selected as the detection field of view. The vision hardware parameters are designed from the detection plane size, the fruit stacking height range, the object distance range and similar constraints, yielding a stable and reliable stereoscopic vision detection system for stacked string fruits and laying a foundation for high-precision determination of their grabbing priority.

The method comprises the following steps: the Kinect sensor is arranged above the detection platform of the sorting system, on the central axis of the detection platform. To reduce measurement errors caused by hardware, the region of the Kinect sensor with an object distance between 500 and 2000 mm, close to the center of the optical axis, and with a depth detection error < 2 mm is used as the visual detection range of the invention. According to the parallel robot fruit sorting system, the size of the detection plane is 900 × 860 mm², the height range of the fruit string stack is 0-200 mm, and the distance from the detection platform to the Kinect sensor is 550-750 mm. The field-of-view plane sizes of the sensor at object distances of 550 mm and 750 mm, calculated by trigonometric formulas, are 770.2 × 635.1 mm² and 1050.3 × 866.1 mm² respectively. Therefore, the plane with an object distance of 750 mm is selected as the detection plane, and the stereoscopic vision detection system for stacked string fruits is constructed.
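The field-of-view sizes quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes depth-camera viewing angles of 70° horizontally and 60° vertically, which are not stated in the text but reproduce the quoted values to within about 0.1 mm. The names `field_of_view` and `covers` are hypothetical.

```python
import math

# Assumed viewing angles of the depth camera in degrees (not given in the text).
H_ANGLE_DEG = 70.0
V_ANGLE_DEG = 60.0

# Detection plane size required by the sorting system (from the text), in mm.
DETECTION_PLANE_MM = (900.0, 860.0)


def field_of_view(object_distance_mm, angle_deg):
    """Extent of the field of view at a given object distance: 2 * d * tan(angle / 2)."""
    return 2.0 * object_distance_mm * math.tan(math.radians(angle_deg) / 2.0)


def covers(plane_mm, fov_mm):
    """True if the field of view is at least as large as the plane in both directions."""
    return all(f >= p for f, p in zip(fov_mm, plane_mm))


for distance in (550.0, 750.0):
    fov = (field_of_view(distance, H_ANGLE_DEG), field_of_view(distance, V_ANGLE_DEG))
    print(distance, [round(v, 1) for v in fov], covers(DETECTION_PLANE_MM, fov))
```

Under these assumptions only the 750 mm plane covers the 900 × 860 mm² detection plane, which matches the choice made in the text.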

Step 2, constructing and expanding a depth data set with determined stacking string fruit grabbing priority: in order to solve the problems of insufficient samples, uneven depth level distribution, difficult depth acquisition and the like of the depth data set of the main fruit stalks of the stacked serial fruits, the invention constructs a pre-training data set with even depth level distribution by designing a depth reference object. A large dataset with various objects for learning the general features of edges, contours, textures, corner points and the like of the image is adopted, and a small dataset for learning the depth features of the main stalks of the stacked cluster fruits is constructed. Meanwhile, the data set is transformed and expanded by adopting an image processing method, and the specific method comprises rotation of images at various angles, horizontal mirror image of the images, vertical mirror image of the images, center mirror image of the images, various scale changes of the images, various noise adding processes of the images and the like. The method is used for increasing the distribution range of the data set and further improving the learning ability of the network on the depth characteristics of the main fruit stalks of the stacked cluster fruits. The method comprises the following steps:

Constructing a reference object depth data set with uniform depth level distribution: considering that the Kinect sensor has an average depth accuracy of 2 mm in the field of view of the invention, the depth interval of the reference object is set to 2 mm, and an object of size 150 × 150 × 2 mm³ is used as the depth reference. According to the height range of the fruit string stack and the depth range of the main fruit stalks, the distance from the detection plane to the depth camera of the Kinect sensor is discretized into a plurality of depth levels with a step of 20 mm. Within each 20 mm depth interval, several pictures of the depth reference are acquired at every 2 mm of depth, with the reference placed at a different position in each picture, so that a large number of reference depth images with uniform depth level distribution are obtained. The data set is then transformed and expanded by image processing, specifically rotation of the images by various angles, horizontal mirroring, vertical mirroring, center mirroring, various scale changes and various kinds of added noise. Finally, the region of interest is extracted from each reference depth image and the image region corresponding to the depth reference is cropped out, giving a reference depth data set that contains only reference depth information;

Constructing a main fruit stalk depth data set of the stacked string fruits: depth images of the stacked string fruits in different states are acquired with the Kinect sensor under the parallel robot fruit sorting system, the data set is transformed and expanded by image processing, and the main fruit stalk regions in the depth images are then extracted and cropped to obtain a depth data set consisting mainly of main fruit stalk depth information of the stacked string fruits.
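The mirroring and rotation transforms used to expand the data sets can be sketched as follows. This is a minimal illustration on images stored as nested lists. It assumes that "center mirror" means a point reflection about the image center (equivalent to a 180° rotation), it shows only a 90° rotation, and it omits the scale changes and added noise. All function names are hypothetical.

```python
def rotate90(img):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def mirror_horizontal(img):
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def mirror_vertical(img):
    """Flip the image top to bottom."""
    return [list(row) for row in img[::-1]]


def mirror_center(img):
    """Reflect the image through its center (assumed meaning of 'center mirror')."""
    return [row[::-1] for row in img[::-1]]


def augment(img):
    """Return the original image together with its transformed copies."""
    return [
        img,
        rotate90(img),
        mirror_horizontal(img),
        mirror_vertical(img),
        mirror_center(img),
    ]


depth_patch = [[1, 2], [3, 4]]  # toy depth values
for variant in augment(depth_patch):
    print(variant)
```

Each transform changes only the position of the pixels, not their depth values, so the depth level label of a sample is preserved.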

Step 3, constructing a main fruit stem depth set classification model of a convolutional neural network architecture based on a few-pooling multi-full-connection layer: the method comprises three parts of image input, feature extraction and classification.

Because the grabbing detection field of view of the parallel robot fruit sorting system must cover the whole working space, the main fruit stalks of the stacked fruit strings occupy only a small area of the RGB-D image, so an existing convolutional neural network extracts few main fruit stalk depth features. The invention therefore designs convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers, and constructs a main fruit stalk depth set classification model based on a less-pooling, multi-full-connection-layer convolutional neural network architecture, so as to limit the reduction of image scale, increase the number of main fruit stalk depth features extracted by the network, and lay a foundation for accurately determining the grabbing priority of stacked string fruits.

Step 3.1, the image input part: the input image size must be designed in the image input part. The network input layer is designed based on the mean size of the original regions of interest of actual main fruit stalk depth images: the sizes of the acquired regions after region-of-interest cropping and before scale transformation are collected, the mean height and mean width are calculated, and the integer values of these means are taken as the network input image size;

step 3.2, feature extraction part: mainly comprises a convolution structure and a full-connection structure,

the convolution structure comprises convolution layers without image scale change, a maximum pooling layer and an average pooling layer with small image scale change, and the ReLu (Rectified Linear Units) activation function; feature maps characterizing the main fruit stalk depth information of the original image are obtained through three convolutions without image scale change and pooling with small image scale change, so that the loss of original feature information is reduced;

for the convolutional layer, the initial parameters of the convolutional layer are calculated based on formula (1) so that the input and output feature maps of the convolutional layer are consistent in scale, where $os_c$ is the length or width of the convolutional layer output feature map, $is_c$ is the length or width of the convolutional layer input feature map, $fs$ is the convolutional layer filter size, $p_c$ is the padding size, and $s_c$ is the stride. Edge padding is applied to the image to reduce the loss of edge information during convolution;

for the pooling layer, the invention includes a maximum pooling layer and an average pooling layer; for each, the initial parameters of the pooling layer are calculated based on formula (2) so that the output feature map of the pooling layer is a 2-times downsampling of the input feature map, where $os_p$ is the length or width of the pooling layer output feature map, $is_p$ is the length or width of the pooling layer input feature map, $ps$ is the pooling size, $p_p$ is the padding size, and $s_p$ is the stride;

for the ReLu activation function, the invention performs threshold operation on each input element based on the formula (3), and all values smaller than 0 are set to 0 so as to reduce data redundancy and retain important characteristics;

for the fully connected structure, the ReLu activation function, fully connected layers and dropout (discard) layers are mainly included. To prevent overfitting, dropout layers are added to the network; during training, certain element weights are randomly set to 0, which can also improve the training speed of the network;

step 3.3, classification part: mainly comprises a Softmax layer and a classification output layer. The invention designs Softmax layer and classification output layer sizes based on the number of depth levels after discretization of the distance between the detection plane and the depth camera of the Kinect sensor.
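The number of depth levels fixes the size of the Softmax and classification output layers. As a sketch, the code below assumes the 550-750 mm sensor-to-object range from step 1 and the 20 mm level width from step 2, and assumes that level 0 is the level nearest the sensor; the function names are hypothetical.

```python
LEVEL_STEP_MM = 20  # width of one depth level, as given in step 2


def number_of_levels(near_mm, far_mm, step_mm=LEVEL_STEP_MM):
    """Number of depth levels obtained by discretizing [near_mm, far_mm]."""
    span = far_mm - near_mm
    if span <= 0 or span % step_mm != 0:
        raise ValueError("range must be a positive multiple of the level width")
    return span // step_mm


def depth_level_index(depth_mm, near_mm, far_mm, step_mm=LEVEL_STEP_MM):
    """Map a depth value to a 0-based level index (0 = nearest the sensor, assumed)."""
    n_levels = number_of_levels(near_mm, far_mm, step_mm)
    if not near_mm <= depth_mm <= far_mm:
        raise ValueError("depth outside the detection range")
    # The far boundary itself is assigned to the last level.
    return min(int((depth_mm - near_mm) // step_mm), n_levels - 1)


# Assumed detection range in mm, taken from the distances quoted in step 1.
NEAR_MM, FAR_MM = 550, 750
N_CLASSES = number_of_levels(NEAR_MM, FAR_MM)  # size of the Softmax / output layer
print(N_CLASSES, depth_level_index(612, NEAR_MM, FAR_MM))
```

With reference images taken every 2 mm, each 20 mm level receives the same number of reference depths, which is what makes the reference data set uniform across levels.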

Step 4, network training based on a multi-transfer learning strategy and network accuracy testing based on visual analysis: multi-transfer learning is performed on the constructed main fruit stalk depth set classification model using the large training data set, the constructed reference object depth training data set with uniform depth level distribution, and the main fruit stalk depth training data set of the stacked string fruits, realizing offline training of the network model. Visual analysis and accuracy tests are then performed on the trained network using the constructed test data set; if the accuracy does not meet the requirement, the parameters are adjusted and training is repeated until it does.

Step 5, on-line main fruit stem depth set grading and grabbing priority determination based on the constructed network model: the distance between the detection plane and the depth camera of the Kinect sensor is first discretized into a plurality of depth levels. And classifying a depth set formed by all depth values of each main fruit stalk into a depth level based on a trained main fruit stalk depth set classification model meeting the test precision requirement, so as to realize the depth set classification of the stacked fruit strings. And selecting the fruit string nearest to the Kinect sensor based on the grading result of all the main fruit stalk depth sets in the stacked string fruits, namely, stacking the fruit string at the uppermost part as the first grabbing string, so as to accurately determine the grabbing priority of the stacked string fruits based on the main fruit stalk depth for the parallel robot. The method comprises the following steps:

Step 5.1, obtaining RGB-D images of the stacked fruit strings: acquiring RGB-D images of stacked string fruits based on a Kinect sensor under a parallel robot fruit sorting system, and extracting depth images from the RGB-D images for detection;

step 5.2, extracting a main fruit stem depth set based on image enhancement and image morphology: preprocessing the obtained depth image of the main fruit stalks of the stacked cluster fruits based on image enhancement operations such as median filtering, histogram equalization and the like, identifying the region where the main fruit stalks are located in the depth image by morphological operation, and shearing and extracting the corresponding main fruit stalk depth region to realize extraction of a main fruit stalk depth set;

step 5.3, grading an online main fruit stem depth set based on the constructed network model: after performing scale transformation on the cut and extracted main fruit stalk depth set image, inputting a main fruit stalk depth set classification model, classifying the main fruit stalk depth set image based on a network model meeting the test precision requirement after training, and selecting a depth level with the highest classification score as the depth level of the main fruit stalk;

step 5.4, determining the grabbing priority based on the depth level of the main fruit stalks: the depth levels of different main fruit stalks on one image are ordered, the fruit string closest to the camera, namely the fruit string stacked at the uppermost part is used as the first grabbing string, and the accurate determination of the fruit grabbing priority of the stacking string based on the depth of the main fruit stalks for the parallel robot is realized.
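Steps 5.3 and 5.4 can be sketched as follows. The example assumes that the network outputs one raw score per depth level for each main fruit stalk and that level index 0 is the level nearest the sensor; neither convention is fixed by the text, and the names `depth_level` and `grab_order` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def depth_level(scores):
    """Index of the depth level with the highest classification score."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def grab_order(stalk_scores):
    """Order stalk ids from first to last grab (smallest level index first).

    Assumes level 0 is nearest the sensor, i.e. the topmost fruit string.
    """
    levels = {stalk_id: depth_level(s) for stalk_id, s in stalk_scores.items()}
    return sorted(levels, key=lambda stalk_id: levels[stalk_id])


# Toy scores for three main fruit stalks over three depth levels.
scores = {
    "stalk_a": [0.1, 2.0, 0.3],
    "stalk_b": [3.0, 0.2, 0.1],
    "stalk_c": [0.0, 0.1, 4.0],
}
print(grab_order(scores))
```

Stalks that fall into the same depth level keep their input order here; the text does not say how such ties should be broken.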

The invention has the following beneficial effects:

1. Because naturally placed stacked string fruits have varied shapes, complex edges and similar complicating factors, their grabbing priority cannot be determined directly from grayscale or multi-channel color images. For machine-vision-based grabbing priority determination in robot grabbing detection, the invention constructs a Kinect-sensor-based stereoscopic vision system for stacked string fruits under a parallel robot fruit sorting system, which acquires three-dimensional information of the stacked string fruits and works from RGB-D images, laying a foundation for stable and reliable grabbing of stacked string fruits.

2. All the depth values of each main fruit stalk can form a depth set, and intersections easily exist among a plurality of main fruit stalk depth sets in stacked fruit strings, so that a main fruit stalk depth data set with even depth level distribution is difficult to obtain. Therefore, aiming at the problems of insufficient samples, uneven depth level distribution, difficult depth acquisition and the like of the depth data set of the main fruit stalks of the stacked string fruits, the invention constructs a pre-training data set with even depth level distribution by designing a depth reference object, designs a training strategy of multi-migration learning and improves the learning capacity of a network on the depth characteristics of the main fruit stalks of the stacked string fruits.

3. Because the grabbing detection view field used for the parallel robot fruit sorting system needs to cover the whole working space, the imaging of the main fruit stalks on the stacked fruit strings in the RGB-D image is smaller, and the depth characteristics of the main fruit stalks of the stacked fruit strings extracted based on the existing convolutional neural network are smaller. Therefore, the invention designs a convolution layer without scale change, a pooling layer with smaller scale change and a multi-full-connection layer, and constructs a main fruit stem depth set classification model based on a multi-full-connection layer convolution neural network architecture with less pooling.

4. All depth values of each main fruit stalk form a depth set, which makes it difficult to judge the depth of a main fruit stalk from a single depth value and then determine the grabbing priority from it. To address this, the invention proposes grading the main fruit stalk depth sets in the stacked string fruits and using the grading result as the grabbing priority of the stacked string fruits.

Drawings

The invention is described in further detail below with reference to the drawings and the detailed description.

Fig. 1 is a flowchart of a method for determining the grabbing priority of stacked fruit strings based on the depth of main fruit stalks for a parallel robot.

FIG. 2 is a schematic diagram of the detection range of the Kinect sensor.

FIG. 3 is a flow chart of the depth dataset construction and expansion for the stacked string fruit grasp priority determination of the present invention.

Fig. 4 is a schematic diagram of a main fruit stem depth set classification model of a convolutional neural network architecture based on a few-pooling multi-full-connection layer.

Fig. 5 is a graph of convolutional layer weight parameters of a pre-trained network model.

FIG. 6 is a graph of output characteristics of a partial convolution layer and a full connection layer of a network model after retraining based on a reference depth training dataset.

FIG. 7 is a graph of output characteristics of a partial convolution layer and a full connection layer of a network model that completes the multi-transfer learning training.

Detailed Description

The following further describes embodiments of the present invention with reference to the drawings.

The invention discloses a method for determining the grabbing priority of stacked string fruits based on main fruit stalk depth for a parallel robot. First, a stereoscopic vision detection system for stacked string fruits is built based on a Kinect sensor under a parallel robot fruit sorting system, and three-dimensional visual information of the stacked string fruits is acquired. Next, a pre-training data set with uniformly distributed depth levels is constructed by designing a depth reference object, and a main fruit stalk depth data set of the stacked string fruits is constructed. The constructed data sets are expanded by image rotation, mirroring, scale change and noise addition to widen their distribution range. Convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers are then designed, and a main fruit stalk depth set classification model based on a less-pooling, multi-full-connection-layer convolutional neural network architecture is constructed, so as to limit the reduction of image scale and increase the number of main fruit stalk depth features extracted by the network under the large detection field of view required by the parallel robot fruit sorting system. A multi-transfer learning training strategy based on the pre-training data set with uniform depth level distribution and the main fruit stalk depth data set of the stacked string fruits is then designed to train the network, and visual analysis and accuracy tests are performed on the network using the test data set. If the accuracy does not meet the requirement, the parameters are adjusted and the multi-transfer learning training is repeated until it does. Finally, the distance between the detection plane and the depth camera of the Kinect sensor is discretized into a plurality of depth levels. Based on the trained main fruit stalk depth set classification model that meets the test accuracy requirement, the depth set formed by all depth values of each main fruit stalk is classified into one depth level, realizing depth set classification for the stacked string fruits. The grabbing priority is determined from the classification results of all main fruit stalk depth sets in the stacked string fruits, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot. This lays a foundation for accurate, rapid and nondestructive automatic sorting of stacked string fruits by stereoscopic-vision-based parallel robots.

The specific embodiment is described taking the parallel robot fruit sorting system developed by the research group as an example and white rossa grape clusters as the object.

Referring to fig. 1, the specific steps are as follows:

Step 1: constructing a Kinect-sensor-based stereoscopic vision system for stacked string fruits under the parallel robot fruit sorting system.

The invention builds a stereoscopic vision detection system for stacked string fruits from a Kinect sensor and a parallel robot fruit sorting system. The Kinect sensor is placed above the detection platform of the sorting system, on the central axis of the detection platform.

Referring to FIG. 2, in order to reduce measurement errors caused by hardware, the region of the Kinect sensor with an object distance between 500 and 2000 mm, close to the center of the optical axis, and with a depth detection error < 2 mm is taken as the visual detection range of the invention. According to the parallel robot fruit sorting system, the size of the detection plane is 900 × 860 mm², the stacking height range of the fruit strings is 0-200 mm, and the distance from the detection platform to the Kinect sensor is 550-750 mm. According to formulas (1) - (6), the field-of-view plane sizes of the sensor at object distances of 550 mm and 750 mm are 770.2 × 635.1 mm² and 1050.3 × 866.1 mm² respectively. Thus, the plane with an object distance of 750 mm is selected as the detection plane.

$$\mathrm{FOV}_{h1} = L_{CD} = 2L_{CM} \tag{3}$$

$$\mathrm{FOV}_{v1} = L_{AB} = 2L_{AM} \tag{4}$$

$$\mathrm{FOV}_{h2} = L_{GH} = 2L_{GN} \tag{5}$$

$$\mathrm{FOV}_{v2} = L_{EF} = 2L_{EN} \tag{6}$$

Wherein $\mathrm{FOV}_{h1}$ and $\mathrm{FOV}_{v1}$ are the length and width of the field of view at an object distance of 550 mm, respectively, and $\mathrm{FOV}_{h2}$ and $\mathrm{FOV}_{v2}$ are the length and width of the field of view at an object distance of 750 mm, respectively.

With the maximum field of view at an object distance of 750 mm, the main fruit stalks of the actual stacked fruit strings have diameters in the range of 5-20 mm; based on formula (7), the main fruit stalks then occupy 2.44-9.76 pixels in the acquired depth image, which meets the pixel requirement for detection.

Wherein, FOV is the length or width of the field of view, pa is the actual diameter of the main fruit stem, and the units are mm. DR is the length or width of the depth image resolution, pi is the diameter of the main fruit stem mapped on the depth image, and the units are pixels.
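Formula (7) itself is not reproduced in this text. A proportional mapping, Pi = Pa × DR / FOV, is consistent with the symbol definitions above and reproduces the quoted pixel range to within rounding. The sketch below rests on that assumption and on an assumed depth image width of 512 pixels, which the text does not state; `stalk_pixels` is a hypothetical name.

```python
def stalk_pixels(pa_mm, fov_mm, dr_px):
    """Diameter of the main fruit stalk on the depth image, in pixels.

    Assumed form of formula (7): Pi = Pa * DR / FOV.
    """
    return pa_mm * dr_px / fov_mm


FOV_MM = 1050.3  # field-of-view length at an object distance of 750 mm (from the text)
DR_PX = 512      # assumed depth image width in pixels (not stated in the text)

for diameter_mm in (5.0, 20.0):
    print(diameter_mm, round(stalk_pixels(diameter_mm, FOV_MM, DR_PX), 2))
```

Because the mapping is linear, a stalk four times as thick covers four times as many pixels, which matches the 2.44 to 9.76 pixel range quoted for 5 to 20 mm stalks.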

Step 2: constructing and expanding the depth data sets for determining the grabbing priority of stacked string fruits.

Referring to fig. 3, in order to solve the problems of insufficient samples, uneven depth level distribution, difficult depth acquisition and the like of the depth data set of the main fruit stalks of the stacked string fruits, the invention constructs a pre-training data set with even depth level distribution by designing a depth reference object. A large dataset with various objects for learning the general features of edges, contours, textures, corner points and the like of the image is adopted, and a small dataset for learning the depth features of the main stalks of the stacked cluster fruits is constructed. Meanwhile, the data set is transformed and expanded by adopting an image processing method, and the specific method comprises rotation of images at various angles, horizontal mirror image of the images, vertical mirror image of the images, center mirror image of the images, various scale changes of the images, various noise adding processes of the images and the like. The method is used for increasing the distribution range of the data set and further improving the learning ability of the network on the depth characteristics of the main fruit stalks of the stacked cluster fruits.

The method comprises the following steps:

referring to fig. 1, the data set constructed by the present invention includes a training data set and a test data set. The training data set consists of a large data set, a reference object depth training data set with evenly distributed depth levels and a main fruit stem depth training data set of stacked fruit strings. The test dataset is derived from the main stem depth dataset of the stacked cluster of fruits.

Large dataset: existing large data sets such as CIFAR-10 and ImageNet are adopted for the pre-training network.

Reference depth training dataset with uniform depth level distribution: considering that the Kinect sensor has an average depth accuracy of 2 mm in the field of view of the invention, the depth interval of the reference object is set to 2 mm, and an object of size 150 × 150 × 2 mm³ is used as the depth reference. The distance between the detection plane and the depth camera of the Kinect sensor is discretized into a plurality of depth levels with a step of 20 mm according to the height range of the fruit string stack and the depth range of the main fruit stalks. Within each 20 mm depth interval, several pictures of the depth reference are acquired at every 2 mm of depth, with the reference placed at a different position in each picture. A large number of reference depth images with uniformly distributed depth levels are thus obtained. The data set is transformed and expanded by image processing, specifically rotation of the images by various angles, horizontal mirroring, vertical mirroring, center mirroring, various scale changes, various kinds of added noise and the like. Finally, the region of interest is extracted from each reference depth image and the image region corresponding to the depth reference is cropped out, giving a reference depth data set that contains only reference depth information.

Main fruit stalk depth training data set and test data set of the stacked string fruits: depth images of stacked string fruits in different states are acquired with the Kinect sensor under the parallel robot fruit sorting system, and the data set is transformed and expanded by image processing. The main fruit stalk regions in the depth images are then extracted and cropped to obtain a depth image set consisting mainly of main fruit stalk depth information of the stacked string fruits. 60% of this main fruit stalk depth data set is randomly selected as the main fruit stalk depth training set, and 40% is randomly selected as the main fruit stalk depth test set.
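The 60% / 40% random split can be sketched as below. The example assumes that the two subsets are disjoint and together cover the whole data set, which the text implies but does not state; the fixed seed and the name `split_dataset` are illustrative choices.

```python
import random


def split_dataset(samples, train_fraction=0.6, seed=0):
    """Randomly split samples into disjoint training and test subsets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Toy data set: 100 sample identifiers standing in for cropped depth images.
train_set, test_set = split_dataset(range(100))
print(len(train_set), len(test_set))
```

Shuffling once and cutting the list guarantees that no image appears in both subsets, so the test accuracy is not inflated by samples seen during training.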

Step 3: constructing the main fruit stalk depth set classification model based on the less-pooling, multi-full-connection-layer convolutional neural network architecture. Referring to fig. 4, because the grabbing detection field of view of the parallel robot fruit sorting system must cover the whole working space, the main fruit stalks of the stacked fruit strings occupy only a small area of the RGB-D image, so an existing convolutional neural network extracts few main fruit stalk depth features. The invention therefore designs convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers, and constructs a main fruit stalk depth set classification model based on a less-pooling, multi-full-connection-layer convolutional neural network architecture, so as to limit the reduction of image scale, increase the number of main fruit stalk depth features extracted by the network, and lay a foundation for accurately determining the grabbing priority of stacked string fruits.

The method comprises the following steps:

the main fruit stem depth set classification model based on the convolutional neural network architecture of the less-pooling multi-full-connection layer comprises three parts, namely image input, feature extraction and classification.

(1) An image input part of a main fruit stalk depth set grading model of a convolutional neural network architecture based on a few-pooling multi-full-connection layer. The input image size needs to be designed at the image input section. The invention designs a network input layer based on the average value of the original interested area size of the actual main fruit stalk depth image. Firstly, counting the acquired region size before scale transformation after the region of interest is sheared, calculating the average value of the height and the width of the region, and taking the integer of the average value as the size of the network input image.
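The choice of network input size can be sketched as follows. The example assumes that "taking the integer of the average value" means rounding to the nearest integer; truncation would be an equally plausible reading. The ROI sizes are made-up toy values, and `network_input_size` is a hypothetical name.

```python
def network_input_size(roi_sizes):
    """Mean (height, width) of the cropped regions of interest, as integers.

    roi_sizes: iterable of (height, width) pairs measured after cropping and
    before any scale transformation.
    """
    sizes = list(roi_sizes)
    if not sizes:
        raise ValueError("at least one region of interest is required")
    mean_height = sum(h for h, _ in sizes) / len(sizes)
    mean_width = sum(w for _, w in sizes) / len(sizes)
    # Rounding to the nearest integer is an assumption, see the lead-in.
    return round(mean_height), round(mean_width)


# Toy ROI sizes in pixels (illustrative values only).
print(network_input_size([(30, 8), (34, 10), (33, 9)]))
```

Sizing the input layer to the mean ROI keeps the rescaling applied to individual crops small on average, which limits distortion of the already small stalk regions.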

(2) The feature extraction part of the main fruit stalk depth set classification model mainly comprises a convolution structure and a fully connected structure.

The convolution structure includes convolution layers with no image scale change, a max-pooling layer and an average pooling layer with small image scale change, and the ReLU (Rectified Linear Unit) activation function. Through three convolutions without image scale change and pooling with small image scale change, the invention obtains feature maps that characterize the main fruit stalk depth information in the original image, thereby reducing the loss of original feature information.

For the convolution layer, the initial parameters are calculated based on equation (8) so that the input and output feature maps of the convolution layer have the same scale. Here, $os_c$ is the length or width of the convolution layer output feature map, $is_c$ is the length or width of the convolution layer input feature map, $fs$ is the size of the convolution layer filter, $p_c$ is the padding size, and $s_c$ is the stride. Edge padding is applied to the image to reduce the loss of edge information during convolution.
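Equation (8) itself is not reproduced in this text. The sketch below assumes the standard convolution output-size relation, os_c = (is_c - fs + 2·p_c) / s_c + 1 with integer division, which is consistent with the symbols defined above. The function names are hypothetical.

```python
def conv_output_size(is_c, fs, p_c, s_c):
    """Assumed standard relation: os_c = (is_c - fs + 2 * p_c) // s_c + 1."""
    return (is_c - fs + 2 * p_c) // s_c + 1


def same_scale_padding(fs):
    """Padding that keeps os_c == is_c for stride 1 and an odd filter size."""
    if fs % 2 == 0:
        raise ValueError("an odd filter size is assumed for symmetric padding")
    return (fs - 1) // 2


# A 3 x 3 filter with stride 1 needs one pixel of padding on each side.
padding = same_scale_padding(3)
output_size = conv_output_size(is_c=32, fs=3, p_c=padding, s_c=1)
```

With these parameters a 32-pixel input side stays 32 pixels after convolution, which is the "no scale change" property the text asks for.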

For the pooling layers, which comprise a max-pooling layer and an average pooling layer, the initial parameters are calculated based on equation (9) so that the output feature map of the pooling layer is a 2-times downsampling of the input feature map. Here, $os_p$ is the length or width of the pooling layer output feature map, $is_p$ is the length or width of the pooling layer input feature map, $ps$ is the pooling size, $p_p$ is the padding size, and $s_p$ is the stride.
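Equation (9) is likewise not reproduced here. As a hedged illustration, the sketch below assumes the usual pooling relation os_p = (is_p - ps + 2·p_p) / s_p + 1 and applies 2 × 2 pooling with stride 2 and no padding to a small nested-list image. The function names and the sample image are hypothetical.

```python
def pool_output_size(is_p, ps, p_p, s_p):
    """Assumed standard relation: os_p = (is_p - ps + 2 * p_p) // s_p + 1."""
    return (is_p - ps + 2 * p_p) // s_p + 1


def pool2d(image, ps=2, s_p=2, mode="max"):
    """Max or average pooling without padding on a list-of-lists image."""
    rows = pool_output_size(len(image), ps, 0, s_p)
    cols = pool_output_size(len(image[0]), ps, 0, s_p)
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            # Collect the ps x ps window whose top-left corner is (r*s_p, c*s_p).
            window = [image[r * s_p + i][c * s_p + j]
                      for i in range(ps) for j in range(ps)]
            if mode == "max":
                pooled_row.append(max(window))
            else:
                pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


sample = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
max_pooled = pool2d(sample, mode="max")
avg_pooled = pool2d(sample, mode="avg")
```

A pooling size of 2 with stride 2 halves each side of the feature map, which matches the 2-times downsampling described above.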

For the ReLU activation function, the invention performs a threshold operation on each element of the input based on equation (10): all values less than 0 are set to 0, which reduces data redundancy and preserves important features.
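Equation (10) is not reproduced in this text. The thresholding it describes corresponds to the standard ReLU definition f(x) = max(0, x), sketched here on a flat list of values. The function name `relu` is illustrative.

```python
def relu(values):
    """Set every element below 0 to 0 and leave the others unchanged."""
    return [max(0.0, value) for value in values]


activated = relu([-2.5, -0.1, 0.0, 0.7, 3.0])
```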

The fully connected structure mainly includes the ReLU activation function, fully connected layers and dropout layers. To prevent overfitting, dropout layers are added to the network. They randomly set certain element weights to 0 during training, which can also improve the network training speed. In addition, the invention increases the number of fully connected layers to improve the network's ability to learn from the extracted feature maps.
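The random zeroing performed by a dropout layer can be sketched as follows. The rescaling of the kept elements by 1 / (1 - drop_prob), often called inverted dropout, is an assumption that the text does not state. The function name and the drop probability are hypothetical.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Randomly zero elements during training; pass through otherwise.

    Assumption: kept elements are rescaled by 1 / (1 - drop_prob) so the
    expected activation is unchanged (inverted dropout).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [value / keep_prob if rng.random() >= drop_prob else 0.0
            for value in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = dropout([1.0] * 1000, drop_prob=0.5, rng=rng)
```

At inference time the layer is called with `training=False` and returns its input unchanged.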

(3) The classification part of the main fruit stalk depth set classification model mainly comprises a Softmax layer and a classification output layer. The invention sets the sizes of the Softmax layer and the classification output layer according to the number of depth levels obtained by discretizing the distance between the detection plane and the depth camera of the Kinect sensor.
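As a rough illustration of how the classification part is sized, the sketch below counts depth levels for an assumed 550 to 750 mm range with a 20 mm step (the step appears in claim 4, but the exact range used for the levels is an assumption) and applies a numerically stable softmax to one score per level. The function names and the scores are hypothetical.

```python
import math


def num_depth_levels(near_mm, far_mm, step_mm):
    """Number of discrete depth levels between the nearest and farthest plane."""
    return math.ceil((far_mm - near_mm) / step_mm)


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exponentials = [math.exp(score - highest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


levels = num_depth_levels(near_mm=550, far_mm=750, step_mm=20)
# One hypothetical raw score per depth level from the last fully connected layer.
raw_scores = [0.1 * index for index in range(levels)]
probabilities = softmax(raw_scores)
```

The Softmax layer and the classification output layer would both have `levels` outputs under these assumptions.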

Step 4: Network training based on a multi-migration learning strategy and network accuracy testing based on visual analysis.

Referring to fig. 1, the invention uses a large training data set, a reference object depth training data set with uniform depth level distribution, and a main fruit stalk depth training data set of stacked string fruits to perform multi-migration learning on the constructed main fruit stalk depth set classification model, thereby realizing offline training of the network model. The trained network is then subjected to visual analysis and an accuracy test on the constructed test data set. If the accuracy does not meet the requirement, the parameters are adjusted and multi-migration learning training is performed again until the accuracy meets the requirement.

The method comprises the following specific steps:

(1) Network training based on the multi-migration learning strategy: the invention adopts a multi-migration learning (multi-stage transfer learning) method that combines the large training data set, the constructed reference object depth training data set with uniform depth level distribution, and the main fruit stalk depth training data set of stacked string fruits to train the constructed main fruit stalk depth set classification model. First, the model is trained on the large data set to learn general image features such as edges, textures and orientations. Next, the network model is retrained on the reference object depth training data set with uniform depth level distribution to learn depth level features in the image. Finally, the model is fine-tuned on the main fruit stalk depth training data set of stacked string fruits to learn the main fruit stalk depth features of the stacked string fruits in the image.
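The staging order above can be illustrated with a deliberately tiny toy: a one-parameter model y = w·x trained by gradient descent on three data sets in turn, with the weight carried from one stage to the next. This is only a sketch of the sequencing, not the network of the invention. The data, the learning rate and all function names are hypothetical.

```python
def train(weight, dataset, learning_rate, epochs=200):
    """Fit y = weight * x by per-sample gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in dataset:
            gradient = 2.0 * (weight * x - y) * x
            weight -= learning_rate * gradient
    return weight


def multi_stage_training(initial_weight, stages, learning_rate=0.01):
    """Train on each stage in order, starting from the previous stage's weight."""
    weight = initial_weight
    trace = []
    for stage_name, dataset in stages:
        weight = train(weight, dataset, learning_rate)
        trace.append((stage_name, weight))
    return weight, trace


# Hypothetical stand-ins for the three training data sets, in the stated order.
stages = [
    ("large general data set", [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]),
    ("reference object depth data set", [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]),
    ("main fruit stalk depth data set", [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]),
]
final_weight, trace = multi_stage_training(0.0, stages)
```

In a real network the same pattern applies to the full set of layer weights: each stage initializes from the weights produced by the previous stage rather than from scratch.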

(2) Network accuracy test based on visual analysis: the invention uses visual analysis to evaluate the learning ability of the trained network. First, the convolution layer weight parameters of the pre-trained network model are extracted to analyze how well the model has learned general image features such as edges, textures and orientations. Next, the output feature maps of different convolution layers and fully connected layers of the trained network model are extracted and analyzed to evaluate how well the model has learned depth levels. Finally, the detection accuracy of the main fruit stalk depth set classification model is calculated on the main fruit stalk depth test data set of stacked string fruits. If the accuracy does not meet the requirement, the parameters are adjusted and training is repeated until the accuracy meets the requirement.
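The final accuracy check can be sketched as a simple count of correct depth-level predictions on the test set. The required accuracy of 0.9 and the sample labels are hypothetical, since the text does not state the threshold.

```python
def classification_accuracy(predicted_levels, true_levels):
    """Fraction of test samples whose predicted depth level is correct."""
    if not true_levels or len(predicted_levels) != len(true_levels):
        raise ValueError("predictions and labels must be non-empty and aligned")
    correct = sum(1 for predicted, actual in zip(predicted_levels, true_levels)
                  if predicted == actual)
    return correct / len(true_levels)


def meets_requirement(predicted_levels, true_levels, required_accuracy):
    """True when the measured accuracy reaches the required accuracy."""
    return classification_accuracy(predicted_levels, true_levels) >= required_accuracy


# Hypothetical predictions and labels for four test images.
accuracy = classification_accuracy([1, 2, 3, 3], [1, 2, 3, 4])
retrain_needed = not meets_requirement([1, 2, 3, 3], [1, 2, 3, 4], 0.9)
```

When `retrain_needed` is true, the parameters would be adjusted and the training of step (1) repeated.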

Step 5: Online main fruit stalk depth set classification and grabbing priority determination based on the constructed network model.

Referring to fig. 1, the distance between the detection plane and the depth camera of the Kinect sensor is first discretized into a plurality of depth levels. Using the trained main fruit stalk depth set classification model that meets the test accuracy requirement, the depth set formed by all depth values of each main fruit stalk is classified into one depth level, thereby realizing depth set classification for the stacked string fruits. Based on the classification results of all main fruit stalk depth sets in the stacked string fruits, the fruit string nearest to the Kinect sensor, that is, the fruit string stacked uppermost, is selected as the first string to grab, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot.
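The discretization and the ordering can be sketched as below. In the invention the depth level of each stalk comes from the trained network, so `depth_to_level` only illustrates what a level means. It is assumed that level 0 is the band nearest to the camera, that the range starts at 550 mm and that the step is 20 mm. All names and sample values are hypothetical.

```python
def depth_to_level(depth_mm, near_mm=550.0, step_mm=20.0):
    """Map a distance from the depth camera to a discrete depth level.

    Assumption: level 0 is the band closest to the camera.
    """
    if depth_mm < near_mm:
        raise ValueError("depth is closer than the assumed nearest plane")
    return int((depth_mm - near_mm) // step_mm)


def grab_order(stalk_levels):
    """Order fruit strings from nearest (uppermost) to farthest.

    stalk_levels: dict mapping a fruit string identifier to the depth level
    assigned to its main fruit stalk.
    """
    return sorted(stalk_levels, key=lambda string_id: stalk_levels[string_id])


# Hypothetical depth levels for three stacked fruit strings.
levels_by_string = {"string_a": 3, "string_b": 1, "string_c": 2}
order = grab_order(levels_by_string)
first_to_grab = order[0]
```

Strings that share a depth level keep their original relative order here, and a tie-breaking rule would have to be chosen separately.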

The method comprises the following specific steps:

(1) RGB-D image acquisition of stacked string fruits: RGB-D images of the stacked string fruits are acquired with the Kinect sensor under the parallel robot fruit sorting system, and the depth images are extracted from the RGB-D images for detection.

(2) Main fruit stalk depth set extraction based on image enhancement and image morphology: the acquired depth image of the stacked string fruits is preprocessed with image enhancement operations such as median filtering and histogram equalization. Morphological operations are then used to identify the regions where the main fruit stalks are located in the depth image, and the corresponding main fruit stalk depth regions are cropped out, which yields the main fruit stalk depth sets.

(3) Online main fruit stalk depth set classification based on the constructed network model: the cropped main fruit stalk depth set image is rescaled and then used as the input of the main fruit stalk depth set classification model. The trained network model that meets the test accuracy requirement classifies the main fruit stalk depth set image, and the depth level with the highest classification score is taken as the depth level of the main fruit stalk.

(4) Grabbing priority determination based on the depth level of the main fruit stalks: the depth levels of the different main fruit stalks in one image are sorted, and the fruit string closest to the camera, that is, the fruit string stacked uppermost, is taken as the first string to grab. This accurately determines the grabbing priority of stacked string fruits based on main fruit stalk depth for the parallel robot.
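The two enhancement operations named in step (2) can be sketched in plain Python on a small nested-list image. The sketch assumes the depth image has been quantized to integer gray levels in the range 0 to 255 and uses a 3 × 3 median window with replicated borders. None of these choices are stated in the text, and the function names are hypothetical.

```python
from statistics import median


def median_filter(image, size=3):
    """Median filter with replicated borders on a list-of-lists image."""
    height, width = len(image), len(image[0])
    radius = size // 2
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clamp neighbor coordinates so border pixels reuse edge values.
            window = [image[min(max(y + dy, 0), height - 1)]
                           [min(max(x + dx, 0), width - 1)]
                      for dy in range(-radius, radius + 1)
                      for dx in range(-radius, radius + 1)]
            filtered[y][x] = median(window)
    return filtered


def equalize_histogram(image, levels=256):
    """Histogram equalization for integer pixels in [0, levels)."""
    pixels = [pixel for row in image for pixel in row]
    histogram = [0] * levels
    for pixel in pixels:
        histogram[pixel] += 1
    cumulative, running = [], 0
    for count in histogram:
        running += count
        cumulative.append(running)
    lowest = next(value for value in cumulative if value > 0)
    total = len(pixels)
    if total == lowest:  # constant image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - lowest)
    return [[round((cumulative[pixel] - lowest) * scale) for pixel in row]
            for row in image]


noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
denoised = median_filter(noisy)
stretched = equalize_histogram([[0, 0], [1, 1]])
```

The morphological identification of the stalk regions and the cropping would follow these two operations and are not covered by this sketch.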

At this point, the determination of the grabbing priority of stacked string fruits based on main fruit stalk depth for the parallel robot is complete.

Examples

The invention provides a method for determining the grabbing priority of stacked string fruits based on main fruit stalk depth for parallel robots. It addresses the problem that, because all the depth values of each main fruit stalk form a depth set, it is difficult to judge the depth of each main fruit stalk and then determine the grabbing priority from that depth. A depth reference object is designed to construct a pre-training data set with uniform depth level distribution, and a multi-migration learning training strategy is designed to improve the network's ability to learn the main fruit stalk depth features of stacked string fruits. Next, convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers are designed, and a main fruit stalk depth set classification model is constructed on this few-pooling, multi-fully-connected-layer convolutional neural network architecture. Finally, the distance between the detection plane and the depth camera of the Kinect sensor is discretized into a plurality of depth levels. Using the trained classification model that meets the test accuracy requirement, the depth set formed by all depth values of each main fruit stalk is classified into one depth level, thereby realizing depth set classification for the stacked string fruits. The grabbing priority is then determined from the classification results of all main fruit stalk depth sets in the stacked string fruits, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot. This lays a foundation for accurate, rapid and nondestructive automatic sorting of stacked string fruits by a stereoscopic-vision-based parallel robot.

The specific embodiment is described taking the parallel robot fruit sorting system developed by the research group as an example and the bai rossa grape as the object. The specific implementation is as follows:

1. Construction of the stereoscopic vision detection system for stacked string fruits based on the Kinect sensor under the parallel robot fruit sorting system: the invention combines a Kinect sensor with the parallel robot fruit sorting system to construct a stereoscopic vision detection system for stacked string fruits. The Kinect sensor is placed above the detection platform of the sorting system, on the central axis of the detection platform.

In order to reduce measurement errors caused by hardware, the object distance of the Kinect sensor is selected to be between 500 and 2000 mm, and positions close to the center of the optical axis where the depth detection error is < 2 mm are taken as the visual detection range of the present invention. In the parallel robot fruit sorting system, the size of the detection plane is 900 × 860 mm², the stacking height of the fruit strings ranges from 0 to 200 mm, and the distance from the detection platform to the Kinect sensor is 550 to 750 mm. According to formulas (1) - (6), the field-of-view plane sizes of the sensor at 550 mm and 750 mm are 770.2 × 635.1 mm² and 1050.3 × 866.1 mm², respectively. Only the field of view at 750 mm covers the 900 × 860 mm² detection plane, so the plane with an object distance of 750 mm is selected as the detection plane.
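The field-of-view figures can be checked with a short sketch. Formulas (1) and (2) are not reproduced in this text, so the sketch assumes the usual relation FOV = 2·d·tan(θ/2). The half-angles of 35° (horizontal) and 30° (vertical) are assumptions, chosen because they reproduce the stated sizes to within about 0.1 mm. The function names are hypothetical.

```python
import math

# Assumed half field-of-view angles of the depth camera (degrees).
ASSUMED_HALF_ANGLE_H_DEG = 35.0
ASSUMED_HALF_ANGLE_V_DEG = 30.0


def field_of_view(distance_mm):
    """Return (length, width) of the field-of-view plane at a given object distance."""
    length = 2.0 * distance_mm * math.tan(math.radians(ASSUMED_HALF_ANGLE_H_DEG))
    width = 2.0 * distance_mm * math.tan(math.radians(ASSUMED_HALF_ANGLE_V_DEG))
    return length, width


def covers(fov_plane, detection_plane):
    """True when the field of view is at least as large as the detection plane."""
    return (fov_plane[0] >= detection_plane[0]
            and fov_plane[1] >= detection_plane[1])


detection_plane = (900.0, 860.0)  # mm, from the text
fov_550 = field_of_view(550.0)
fov_750 = field_of_view(750.0)
usable_distances = [d for d in (550.0, 750.0)
                    if covers(field_of_view(d), detection_plane)]
```

Under these assumptions only the 750 mm distance yields a field of view that contains the whole detection plane.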

$FOV_{h1} = L_{CD} = 2L_{CM}$ (3)

$FOV_{v1} = L_{AB} = 2L_{AM}$ (4)

$FOV_{h2} = L_{GH} = 2L_{GN}$ (5)

$FOV_{v2} = L_{EF} = 2L_{EN}$ (6)

Here, $FOV_{h1}$ and $FOV_{v1}$ are the field-of-view length and width at an object distance of 550 mm, and $FOV_{h2}$ and $FOV_{v2}$ are the field-of-view length and width at an object distance of 750 mm, respectively.

2. Construction and expansion of the depth data set for determining the grabbing priority of stacked string fruits: the main fruit stalk depth data set of stacked string fruits suffers from insufficient samples, uneven depth level distribution and difficult depth acquisition. To address these problems, the invention designs a depth reference object and uses it to construct a pre-training data set with uniform depth level distribution. A large data set containing various objects is used for learning general image features such as edges, contours, textures and corner points, and a small data set is constructed for learning the main fruit stalk depth features of stacked string fruits. The data sets are also transformed and expanded by image processing methods, specifically rotation of the images by various angles, horizontal mirroring, vertical mirroring, center mirroring, various scale changes, and the addition of various kinds of noise. This widens the distribution range of the data set and further improves the network's ability to learn the main fruit stalk depth features of stacked string fruits.

The method comprises the following steps:

The data set constructed by the invention comprises a training data set and a test data set. The training data set consists of a large data set, a reference object depth training data set with uniform depth level distribution, and a main fruit stalk depth training data set of stacked string fruits. The test data set is taken from the main fruit stalk depth data set of the stacked string fruits.
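Some of the expansion transforms listed above can be sketched on a small nested-list image. The sketch covers only the mirroring operations and a 90° rotation. Scale changes and noise addition are omitted, and reading "horizontal mirror" as a left-right flip is an assumption. The function names are hypothetical.

```python
def mirror_horizontal(image):
    """Flip left to right (assumed meaning of the horizontal mirror)."""
    return [row[::-1] for row in image]


def mirror_vertical(image):
    """Flip top to bottom."""
    return [row[:] for row in image[::-1]]


def mirror_center(image):
    """Flip about the image center, equivalent to a 180 degree rotation."""
    return [row[::-1] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def expand(image):
    """Return the original image together with its transformed copies."""
    return [
        [row[:] for row in image],
        mirror_horizontal(image),
        mirror_vertical(image),
        mirror_center(image),
        rotate_90_clockwise(image),
    ]


sample_image = [[1, 2], [3, 4]]
expanded = expand(sample_image)
```

Each source image contributes five samples in this sketch, and adding further angles, scales and noise levels would multiply the data set size accordingly.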

3. Construction of the main fruit stalk depth set classification model based on the few-pooling, multi-fully-connected-layer convolutional neural network architecture: the grabbing detection field of view used in the parallel robot fruit sorting system needs to cover the whole working space, so the main fruit stalks of the stacked fruit strings occupy only a small area of the RGB-D image, and an existing convolutional neural network extracts few depth features of the main fruit stalks. Therefore, the invention designs convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers, and constructs the main fruit stalk depth set classification model on this architecture, so as to limit the reduction of image scale, increase the number of main fruit stalk depth features extracted by the network, and lay a foundation for accurately determining the grabbing priority of stacked string fruits.

The method comprises the following steps:

For the convolution layer, the initial parameters are calculated based on equation (8) so that the input and output feature maps of the convolution layer have the same scale. Here, $os_c$ is the length or width of the convolution layer output feature map, $is_c$ is the length or width of the convolution layer input feature map, $fs$ is the size of the convolution layer filter, $p_c$ is the padding size, and $s_c$ is the stride. Edge padding is applied to the image to reduce the loss of edge information during convolution.

4. Network training based on the multi-migration learning strategy and network accuracy testing based on visual analysis: the invention uses the large training data set, the constructed reference object depth training data set with uniform depth level distribution, and the main fruit stalk depth training data set of stacked string fruits to perform multi-migration learning on the constructed main fruit stalk depth set classification model, thereby realizing offline training of the network model. The trained network is then subjected to visual analysis and an accuracy test on the constructed test data set. If the accuracy does not meet the requirement, the parameters are adjusted and multi-migration learning training is performed again until the accuracy meets the requirement.

The method comprises the following specific steps:

5. Online main fruit stalk depth set classification and grabbing priority determination based on the constructed network model: the distance between the detection plane and the depth camera of the Kinect sensor is first discretized into a plurality of depth levels. Using the trained main fruit stalk depth set classification model that meets the test accuracy requirement, the depth set formed by all depth values of each main fruit stalk is classified into one depth level, thereby realizing depth set classification for the stacked string fruits. Based on the classification results of all main fruit stalk depth sets in the stacked string fruits, the fruit string nearest to the Kinect sensor, that is, the fruit string stacked uppermost, is selected as the first string to grab, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot.

The method comprises the following specific steps:

In this embodiment, machine vision hardware for collecting images of stacked string fruits is built on the parallel robot fruit sorting system. On this platform, depth images of bai-sha grapes in different forms are collected with the Kinect sensor, and a stacked string fruit grabbing priority determination experiment based on the convolutional neural network is carried out. The depth level corresponding to the main fruit stalk of each string of grapes is obtained, and the fruit string closest to the camera is selected as the first string to grab, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot. This lays a foundation for accurate, rapid and nondestructive automatic sorting of stacked string fruits by a stereoscopic-vision-based parallel robot.

In summary, the invention discloses a method for determining the grabbing priority of stacked string fruits based on main fruit stalk depth for a parallel robot. First, a stereoscopic vision detection system for stacked string fruits is built with a Kinect sensor under the parallel robot fruit sorting system, and three-dimensional vision information of the stacked string fruits is obtained. Next, a pre-training data set with uniform depth level distribution is constructed by designing a depth reference object, and a main fruit stalk depth data set of stacked string fruits is constructed. The constructed data sets are expanded by image rotation, mirroring, scale change and noise addition, which widens the distribution range of the data set. Then, convolution layers without scale change, pooling layers with small scale change and multiple fully connected layers are designed, and a main fruit stalk depth set classification model is constructed on this few-pooling, multi-fully-connected-layer convolutional neural network architecture, so as to limit the reduction of image scale and increase the number of main fruit stalk depth features extracted by the network under the large detection field of view required in the parallel robot fruit sorting system. After that, a multi-migration learning training strategy based on the pre-training data set with uniform depth level distribution and the main fruit stalk depth data set of stacked string fruits is designed to train the network, and visual analysis and an accuracy test are performed on the network with the test data set. If the accuracy does not meet the requirement, the parameters are adjusted and multi-migration learning training is performed again until the accuracy meets the requirement. Finally, the distance between the detection plane and the depth camera of the Kinect sensor is discretized into a plurality of depth levels. Using the trained classification model that meets the test accuracy requirement, the depth set formed by all depth values of each main fruit stalk is classified into one depth level, thereby realizing depth set classification for the stacked string fruits. The grabbing priority is determined from the classification results of all main fruit stalk depth sets in the stacked string fruits, so that the grabbing priority of stacked string fruits is accurately determined based on main fruit stalk depth for the parallel robot. This lays a foundation for accurate, rapid and nondestructive automatic sorting of stacked string fruits by a stereoscopic-vision-based parallel robot.

In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. The method for determining the grabbing priority of the stacked cluster fruits based on the depth of the main fruit stalks for the parallel robot is characterized by comprising the following steps of:

step 1, constructing a stereoscopic vision detection system for stacking serial fruits together based on a Kinect sensor and a parallel robot fruit sorting system;

Step 2, constructing and expanding a depth data set with determined stacking string fruit grabbing priority, and constructing a pre-training data set with uniform depth level distribution by designing a depth reference object;

step 3, constructing a main fruit stem depth set hierarchical model based on a convolution neural network architecture of a few-pool multi-full-connection layer, wherein the main fruit stem depth set hierarchical model comprises three parts, namely image input, feature extraction and classification;

the step 3 specifically comprises the following steps:

step 3.1, the image input part: designing an input image size in an image input part, designing a network input layer based on the average value of the original region of interest size of an actual main fruit stalk depth image, firstly counting the acquired region size after the region of interest is sheared and before scale transformation, calculating the average value of the height and the width of the region size, and taking an integer of the average value as the size of the network input image;

the convolution structure comprises a convolution layer without image scale change, a maximum pooling layer with image scale change, an average pooling layer and a ReLU activation function, and a feature map characterizing the main fruit stalk depth information in the original image is obtained through three convolutions without image scale change and pooling with image scale change;

for the convolution layer, calculating initial parameters of the convolution layer based on formula (1) so that the input and output feature maps of the convolution layer are consistent in scale, wherein $os_c$ is the length or width of the output feature map of the convolution layer, $is_c$ is the length or width of the input feature map of the convolution layer, $fs$ is the size of the convolution layer filter, $p_c$ is the padding size, and $s_c$ is the stride; edge padding is applied to the image to reduce the loss of edge information in the convolution process;

for the pooling layer, including the maximum pooling layer and the average pooling layer, the initial parameters of the pooling layer are calculated based on formula (2) such that the output feature map of the pooling layer is a 2-fold downsampling of the input feature map, wherein $os_p$ is the length or width of the output feature map of the pooling layer, $is_p$ is the length or width of the input feature map of the pooling layer, $ps$ is the pooling size, $p_p$ is the padding size, and $s_p$ is the stride;

for the ReLu activation function, performing a threshold operation on each element input based on formula (3), all values less than 0 being set to 0 for reducing data redundancy, preserving important features;

for the full connection structure, mainly comprising a ReLu activation function, a full connection layer and a discarding layer, in order to prevent overfitting, the discarding layer is added into the network, and certain element weights are randomly set to be 0 during training;

Step 3.3, classification part: the method mainly comprises a Softmax layer and a classification output layer, wherein the size of the Softmax layer and the size of the classification output layer are designed based on the number of depth levels after the distance between a detection plane and a depth camera of a Kinect sensor is discretized;

step 4, performing network training of a multi-migration learning strategy and network precision testing of visual analysis on the main fruit stem depth set classification model;

and 5, grading the on-line main fruit stem depth set and determining the grabbing priority on the basis of the constructed network model.

2. The method for determining the grabbing priority of stacked fruit strings based on the depth of main fruit stalks for parallel robots according to claim 1, wherein the method comprises the following steps: the step 1 specifically comprises the following steps:

the Kinect sensor is arranged above a detection platform of the sorting system, on the central axis of the detection platform; in order to reduce measurement errors caused by hardware, the object distance of the Kinect sensor is selected to be between 500 and 2000 mm, and positions close to the center of the optical axis where the depth detection error is < 2 mm are taken as the visual detection range; in the parallel robot fruit sorting system, the size of the detection plane is 900 × 860 mm², the stacking height of the fruit strings ranges from 0 to 200 mm, and the distance from the detection platform to the Kinect sensor is 550 to 750 mm; the field-of-view plane sizes of the sensor at 550 mm and 750 mm, calculated according to a trigonometric formula, are 770.2 × 635.1 mm² and 1050.3 × 866.1 mm² respectively; therefore, the plane with an object distance of 750 mm is selected as the detection plane, and a stereoscopic vision detection system for stacked string fruits is constructed.

3. The method for determining the grabbing priority of stacked fruit strings based on the depth of main fruit stalks for parallel robots according to claim 1, wherein the method comprises the following steps: in the step 2, a reference object depth data set with uniform depth level distribution is constructed, the data set is transformed and expanded by adopting an image processing method, the region of interest is extracted from the reference object depth image, the reference object depth data set only containing reference object depth information is obtained, and a main fruit stem depth data set of stacked series fruits is constructed.

4. A method for determining the gripping priority of stacked fruit strings based on the depth of fruit stalks for parallel robots according to claim 3, wherein:

constructing a reference object depth data set with uniform depth level distribution: considering that the average depth accuracy in the field of view of the Kinect sensor is 2 mm, the depth interval of the reference objects is set to 2 mm, and reference objects with a size of 150-2 mm³ are prepared; according to the height range of the fruit string stack and the depth range of the main fruit stalks, the distance from the detection plane to the depth camera of the Kinect sensor is discretized into a plurality of depth levels with a step length of 20 mm; within each 20 mm depth interval, a plurality of depth reference object images are acquired at every 2 mm of depth, with the reference object at a different position in each image, so that a large number of reference object depth images with uniform depth level distribution are obtained; the data set is then transformed and expanded by image processing methods, specifically rotation of the images by various angles, horizontal mirroring, vertical mirroring, center mirroring, various scale changes and the addition of various kinds of noise; finally, region-of-interest extraction is performed on the reference object depth images, and the image region corresponding to the depth reference object is cropped out to obtain the reference object depth data set.

5. A method for determining the gripping priority of stacked fruit strings based on the depth of fruit stalks for parallel robots according to claim 3, wherein:

6. The method for determining the grabbing priority of stacked fruit strings based on the depth of main fruit stalks for parallel robots according to claim 1, wherein the method comprises the following steps: the step 4 specifically comprises the following steps:

and performing multi-migration learning on the constructed main fruit stalk depth set grading model by adopting a big data training dataset, a reference object depth training dataset with uniform depth level distribution and a main fruit stalk depth training dataset of stacked series fruits, so as to realize offline training of a network model, performing visual analysis and precision testing on the trained network based on a constructed test dataset, judging the precision, and if the precision does not meet the requirement, performing training again by adjusting parameters until the precision meets the requirement.

7. The method for determining the grabbing priority of stacked fruit strings based on the depth of main fruit stalks for parallel robots according to claim 1, wherein the method comprises the following steps: the step 5 specifically comprises the following steps:

step 5.2, extracting a main fruit stem depth set based on image enhancement and image morphology: preprocessing the obtained depth image of the main fruit stalks of the stacked cluster fruits based on median filtering and histogram equalization image enhancement operation, identifying the region where the main fruit stalks are located in the depth image by morphological operation, and shearing and extracting the corresponding main fruit stalk depth region to realize extraction of a main fruit stalk depth set;