CN112528826A

CN112528826A - Picking device control method based on 3D visual perception

Info

Publication number: CN112528826A
Application number: CN202011414617.XA
Authority: CN
Inventors: 唐玉新; 唐双凌; 徐陶
Original assignee: Jiangsu Yanjiang Agricultural Science Research Institute
Current assignee: Jiangsu Yanjiang Agricultural Science Research Institute
Priority date: 2020-12-04
Filing date: 2020-12-04
Publication date: 2021-03-19
Anticipated expiration: 2040-12-04
Also published as: CN112528826B

Abstract

The invention provides a picking device control method based on 3D visual perception, comprising the following steps: S1, a control device controls the picking device to travel along the underside of the fruit support to be picked; S2, the control device identifies the target fruit through the image acquisition device of the picking device; S3, the image acquisition device removes noise points from the background by 3D color thresholding; S4, the control device detects and locates the ripe fruit using deep learning; S5, a region of interest around the target fruit is selected to determine whether unripe fruit is present; and S6, a picking path is calculated based on the distribution and number of unripe fruits around the target fruit. The invention actively separates unripe fruit from the target based on visual perception. According to the distribution of unripe fruit around the target fruit, it selects either a single push operation or a serpentine push operation composed of several linear pushes to separate the unripe fruit below the target fruit and at the same height as the target, thereby markedly improving separation performance and picking efficiency.

Description

Picking device control method based on 3D visual perception

Technical Field

The application relates to the field of intelligent robots, in particular to a control method of a picking device based on 3D visual perception.

Background

Fruit in China is generally picked by hand. Labor accounts for 50%-70% of the total cost of fruits and vegetables, and manual picking is costly, inefficient and poorly suited to work at height. A fruit picking robot is a device that replaces manual labor and picks fruit automatically. Domestic fruit picking robots are still at an early stage of development, and most cannot meet the requirements of fruit growers. The end effector of current fruit picking robots has no buffering stage when clamping fruit, so delicate fruit is easily damaged. In addition, because fruits ripen at different times, the robot must harvest selectively. Existing fruit picking robots have major shortcomings in identifying and selecting individual ripe fruits without accidentally damaging or picking unripe fruit. Some methods use image recognition and search algorithms to explore the search space of the end effector for feasible trajectories, with each step of the trajectory checked by a collision detector. Most of these are passive methods whose goal is to avoid unripe fruit or other parts without changing the environment. However, unripe fruit cannot always be avoided: when the target fruit is completely surrounded by unripe fruit, the end effector may find no path that avoids all of it and therefore cannot pick the fruit.

Disclosure of Invention

In order to solve the above problems in the prior art, the present invention provides a picking device control method based on 3D visual perception.

The invention discloses a picking device control method based on 3D visual perception, characterized by comprising the following steps:

S1, the control device controls the picking device to travel along the underside of the fruit support to be picked;

S2, the control device identifies the target fruit through the image acquisition device of the picking device;

S3, the image acquisition device removes noise points from the background by 3D color thresholding;

S4, the control device detects and locates the ripe fruit using deep learning;

S5, a region of interest around the target fruit is selected to determine whether unripe fruit is present;

S6, a picking path is calculated based on the distribution and number of unripe fruits around the target fruit.

In step S2, adjacent noise points are removed by hue, saturation and intensity color thresholding;

step S4 includes:

S41, identifying and segmenting objects at the pixel level using a segmentation convolutional neural network; creating, through the network, a number of masks for the ripe fruit, wherein one mask represents the detected target fruit; projecting the mask into 3D points by matching it with the depth image to obtain the 3D position of the target fruit in the coordinates of the camera frame;

S42, transforming the coordinates from the camera frame to the arm frame of the picking device based on the extrinsic calibration of the camera;

in step S5, the point cloud library is used to crop the point cloud to the bounding box of each block in the region of interest, and unripe fruit is identified and counted in the corresponding block.

The region of interest is a region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits. The region of interest is divided into four layers: a top layer, an upper middle layer, a lower middle layer and a bottom layer. Each layer is divided into nine cubic blocks forming a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit. The length and width of the eight peripheral blocks are equal to the length and width of the center block. In the front view and the left side view, the heights of the top layer and the bottom layer are equal to one and two times, respectively, the sum of the heights of the upper middle layer and the lower middle layer. The gripper moves upward to separate the unripe fruit around the target fruit in the upper and lower middle layers, where the distribution of unripe fruit may vary in the height direction.

In particular, the gripper operates in three distinct stages: in the first stage, the gripper approaches from below and moves the unripe fruit horizontally in the bottom layer; in the second stage, the gripper moves upward to enclose the target fruit and separates the unripe fruit in the upper and lower middle layers; in the third stage, if the center block C_C in the top layer is occupied, the gripper drags the target fruit to a gripping position with less unripe fruit.

In particular, the first stage separates the unripe fruit horizontally below the target fruit in the bottom layer, using the number Nh of unripe-fruit-free blocks adjacent to the center block to determine whether to use a single push operation or a serpentine push operation;

ignoring the center block, a solid arrow in a block indicates that the block is occupied by unripe fruit, and a blank arrow indicates an unoccupied block; when Nh is 5, which is greater than the predetermined threshold Th of 4, a single push operation is selected to push the unripe fruit aside; for a single push operation toward the unripe fruit, the push direction Ds of the gripper is calculated from the positions of the occupied blocks according to the following formula:

where O_i is the vector of the i-th occupied block within the largest group of adjacent occupied blocks, and n is the total number of blocks within that group; the parameter r scales the norm of Ds, which should ensure that the gripper moves clear of the outer edge of the blocks, with r = 50 mm;

the gripper moves from the center of the unoccupied blocks to the center of the occupied blocks, so that the gripper has the highest probability of pushing all the unripe fruit aside;

if only the center block C_C is occupied, Ds is 0, and the direction in which the gripper must move to push the unripe fruit is determined by calculating the shortest path from the current position of the gripper to the center of the center block C_C. If no unripe fruit is detected in the layer, the gripper performs no pushing action at this stage and moves straight up from below.

If the number Nh of unripe-fruit-free blocks adjacent to the center block is less than the threshold Th, the gripper uses a horizontal serpentine push operation; the serpentine operation involves movement in three directions (forward, left and right), so the gripper pushes the unripe fruit out in three directions; the general direction Dz of the serpentine push operation is calculated from the positions of the unoccupied blocks according to the following formula:

where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks, and m is the total number of blocks within that group; during a horizontal serpentine push operation, the device moves in the xy plane, the resultant vector of the serpentine motion is equal to Dz, and the amplitude ah of the serpentine motion and the number of pushes Nhp are determined according to the particular grasping scenario.

Specifically, the second stage encloses the target fruit in the upper middle layer and the lower middle layer and separates the unripe fruit in those layers; the upward serpentine push operation used in the upper and lower middle layers consists of the gripper moving toward the target fruit in a substantially vertical direction while moving from side to side to pass the unripe fruit; the vertical direction passes through the center of the target fruit. The direction Du_z of the upward push in the xy plane is calculated from the maximum number Nu of unripe-fruit-free blocks adjacent to the center block; if Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks according to the following formula, as in the single push operation in the bottom layer 9:

where au is a parameter for scaling the norm of Du_z, with au = 5 mm;

if Nu is less than the threshold Th, the calculation uses the unoccupied blocks according to the following formula:

where M is the intermediate vector for calculating Du_z. The gripper moves along Du_z and -Du_z to push the unripe fruit aside on both sides.

Specifically, during the third stage, if the center block C_C in the top layer is occupied, the gripper drags the target fruit to a gripping position with less unripe fruit;

the drag operation is performed only when unripe fruit is present in the center block C_C of the top layer. If the center block C_C is empty, the gripper moves directly upward to grasp the target fruit; to avoid collisions between the gripper and the table, the three blocks L_R, C_R and R_R close to the table are skipped when calculating the drag direction, and the drag direction D_dr in the xy plane can be determined according to the following formula:

where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks. The blocks used for the calculation are L_C, L_F, C_F, R_F and R_C. The parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks. The norm of D_dr is scaled to l, where l = 50 mm. The drag is directed toward where less unripe fruit is present; if all the blocks are occupied by unripe fruit, the drag direction is aligned with C_F. The drag and push-back operations move up by the same height in the vertical direction.

With the picking device control method based on 3D visual perception, unripe fruit is actively separated from the target based on visual perception. According to the distribution of unripe fruit around the target fruit, either a single push operation or a serpentine push operation composed of several linear pushes is selected to separate the unripe fruit below the target fruit and at the same height as the target. Because it pushes in several directions, the serpentine operation can handle denser unripe fruit, and the resulting side-to-side movement breaks the static contact force between the target fruit and the unripe fruit, so that the gripper can receive the target fruit more easily. The subsequent drag operation combines avoiding the unripe fruit with actively pushing it away, which solves the problem of mistakenly capturing unripe fruit above the target fruit: the gripper drags the target fruit to a location with less unripe fruit and then pushes back to move the unripe fruit aside for further separation. The method markedly improves picking performance, avoids damage to both the target fruit and the unripe fruit, and greatly improves picking efficiency.

Drawings

Fig. 1 is a flow chart of a picking device control method based on 3D visual perception of the present invention.

Fig. 2 is a flow chart of a picking device control method based on 3D visual perception of the present invention.

Fig. 3a-3c are schematic diagrams of image recognition of a picking device control method based on 3D visual perception according to the present invention.

Fig. 4a-4c are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.

Fig. 5a-5b are schematic diagrams of the picking process of a picking device control method based on 3D visual perception of the present invention.

Fig. 6a-6d are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.

Fig. 7a-7b are schematic diagrams of the picking process of a picking device control method based on 3D visual perception of the present invention.

Fig. 8a-8d are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.

Fig. 9a-9b are schematic diagrams of picking processes of a 3D visual perception based picking device control method of the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.

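As shown in figs. 1-2, a picking device control method based on 3D visual perception comprises the following steps:

S1, the control device controls the picking device to travel along the underside of the fruit support to be picked;

S2, the control device identifies the target fruit through the image acquisition device of the picking device;

S3, the image acquisition device removes noise points from the background by 3D color thresholding;

S4, the control device detects and locates the ripe fruit using deep learning;

S5, a region of interest around the target fruit is selected to determine whether unripe fruit is present;

S6, a picking path is calculated based on the distribution and number of unripe fruits around the target fruit.

Step S4 includes detecting and locating the ripe fruit in the camera frame (S41) and transforming the coordinates from the camera frame to the arm frame of the picking device (S42).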

In the image processing, the first step is to remove adjacent noise points by hue, saturation and intensity color thresholding. Some sensed points from the irrigation pipe or the rack appear around the ripe fruit, although the pipe and rack lie some distance behind the fruit. Inaccurate depth sensing causes some of these points to connect to the fruit in front, where they are mistaken for unripe fruit. Color thresholding removes these points.
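The thresholding step can be illustrated with a minimal sketch. It assumes the point cloud is a list of colored points (x, y, z, r, g, b) and uses HSV from the Python standard library as a stand-in for the hue, saturation and intensity space; the function name and every threshold value are hypothetical, not taken from the method.

```python
import colorsys

def keep_fruit_colored_points(points, hue_ranges, min_saturation, min_value):
    """Keep colored 3D points whose color passes the thresholds.

    points: iterable of (x, y, z, r, g, b) with r, g, b in 0..255.
    hue_ranges: list of (low, high) hue intervals in [0, 1].
    """
    kept = []
    for x, y, z, r, g, b in points:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        in_hue = any(low <= h <= high for low, high in hue_ranges)
        if in_hue and s >= min_saturation and v >= min_value:
            kept.append((x, y, z, r, g, b))
    return kept

# Hypothetical thresholds: reddish (ripe) and greenish (unripe) hues are kept;
# grey or dark points such as pipe and rack parts are dropped.
FRUIT_HUES = [(0.0, 0.05), (0.95, 1.0), (0.17, 0.45)]

cloud = [
    (0.10, 0.02, 0.40, 200, 30, 40),   # red fruit point
    (0.11, 0.03, 0.41, 120, 190, 60),  # green fruit point
    (0.12, 0.02, 0.55, 90, 90, 95),    # grey pipe point behind the fruit
]
filtered = keep_fruit_colored_points(cloud, FRUIT_HUES, 0.35, 0.25)
```

In this example the grey point fails the saturation threshold, so only the two fruit-colored points remain.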

The second step is the detection and localization of the ripe fruit: a segmentation convolutional neural network is used to identify and segment objects at the pixel level. Through the network, several masks are created for the ripe fruit, with one mask representing the detected target fruit. By matching the mask with the depth image, the mask is projected into 3D points, giving the 3D position of the target fruit in camera frame coordinates. The coordinates are then transformed from the camera frame to the arm frame of the picking device based on the extrinsic calibration of the camera.
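A minimal sketch of the projection and frame change follows. It assumes a pinhole camera model, a depth image in meters aligned with the mask, and a 4 x 4 homogeneous extrinsic matrix; the intrinsics (fx, fy, cx, cy), the matrix T and all function names are hypothetical.

```python
def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of one pixel to a camera-frame 3D point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def mask_to_camera_point(mask_pixels, depth, fx, fy, cx, cy):
    """Mean camera-frame position of the mask pixels that have valid depth."""
    pts = [deproject_pixel(u, v, depth[v][u], fx, fy, cx, cy)
           for u, v in mask_pixels if depth[v][u] > 0]
    if not pts:
        return None
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(3))

def camera_to_arm(point, T):
    """Apply a 4x4 homogeneous camera-to-arm transform to a 3D point."""
    x, y, z = point
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
                 for r in range(3))

# Hypothetical data: a 3x3 depth image at 0.5 m and a one-pixel mask.
depth = [[0.5, 0.5, 0.5] for _ in range(3)]
target_cam = mask_to_camera_point([(1, 1)], depth, 600.0, 600.0, 1.0, 1.0)

# Hypothetical extrinsics: no rotation, camera offset from the arm base.
T = [[1, 0, 0, 0.1],
     [0, 1, 0, 0.0],
     [0, 0, 1, 0.2],
     [0, 0, 0, 1.0]]
target_arm = camera_to_arm(target_cam, T)
```

Averaging the mask points is one simple choice for the fruit position; a centroid of the visible surface is biased toward the camera, so a real system might add a radius offset.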

The third step is the identification and counting of unripe fruit.

The region of interest is a region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits. As shown in fig. 3, the region of interest is divided into four layers: a top layer 6, an upper middle layer 7, a lower middle layer 8 and a bottom layer 9. As shown in the top view of fig. 3, each layer of the region of interest is further divided into nine cubic blocks. On each layer, the blocks form a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit. In plan view, the length and width of the eight peripheral blocks are equal to the length and width of the center block. In the front and left side views, the heights of the top layer 6 and the bottom layer 9 are equal to one and two times the height of the middle layer, respectively. The gripper moves upward to separate the unripe fruit around the target fruit in the middle layer, where the distribution of unripe fruit may vary with height.

To obtain a higher motion resolution, the middle layer is divided into an upper middle layer 7 and a lower middle layer 8, and the motion through the middle layer is divided into two steps. The center block of the top layer 6 is shorter than the peripheral blocks in the same layer, at 80% of their height. This is because the object segmentation method does not include the green calyx. To prevent the calyx from being detected as unripe fruit, the bottom part of the center block in the top layer is left blank.

To generate the separation path, each block is assigned a horizontal vector representing the direction from that block to the center block C_C. The direction of each vector is determined by the position of its block, so that all vectors point from the center of the respective block to the center of the center block C_C. The number of points N of the point cloud within a block is used to determine whether unripe fruit is present in that block. Using a camera of 1280 × 720 resolution, the threshold values of N for the top layer 6, the upper middle layer 7, the lower middle layer 8 and the bottom layer 9 are 200, 100, and 300, respectively.
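The occupancy test and the block vectors can be sketched as follows. The axis convention (left at negative x, front at negative y), the use of "N at or above the threshold" as the occupancy rule, the block size and all names are assumptions made for illustration; block names follow the L/C/R and F/C/R pattern used in the text (for example L_F, C_C, R_R).

```python
import math

COLUMNS = ("L", "C", "R")  # left, center, right along x (assumed)
ROWS = ("F", "C", "R")     # front, center, rear along y (assumed)

def count_points_per_block(points, center_xy, block_size, z_min, z_max):
    """Count the points of one layer that fall in each block of the 3 x 3 grid."""
    counts = {c + "_" + r: 0 for c in COLUMNS for r in ROWS}
    for x, y, z in points:
        if not (z_min <= z < z_max):
            continue
        col = math.floor((x - center_xy[0]) / block_size + 1.5)
        row = math.floor((y - center_xy[1]) / block_size + 1.5)
        if 0 <= col < 3 and 0 <= row < 3:
            counts[COLUMNS[col] + "_" + ROWS[row]] += 1
    return counts

def occupied_blocks(counts, threshold):
    """Blocks whose point count N reaches the layer threshold."""
    return {name for name, n in counts.items() if n >= threshold}

def block_vector(name, block_size):
    """Horizontal vector from the center of a block to the center of C_C."""
    col = COLUMNS.index(name[0])
    row = ROWS.index(name[2])
    return ((1 - col) * block_size, (1 - row) * block_size)

# Hypothetical layer: three points in the left-front block, one at the center.
layer_points = [(-50.0, -50.0, 10.0)] * 3 + [(0.0, 0.0, 10.0)]
counts = count_points_per_block(layer_points, (0.0, 0.0), 50.0, 0.0, 20.0)
occupied = occupied_blocks(counts, threshold=3)
```

With a real point cloud the threshold would be the per-layer value of N given above rather than the toy value of 3 used here.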

The gripper operates in three distinct stages: in the first stage, the gripper approaches from below and moves the unripe fruit horizontally in the bottom layer 9; in the second stage, the gripper moves upward to enclose the target fruit and separates the unripe fruit in the middle layers; in the third stage, if the center block C_C in the top layer is occupied, the gripper drags the target fruit to a gripping position with less unripe fruit.

In particular, the first stage separates the unripe fruit horizontally below the target fruit in the bottom layer 9, using the number Nh of unripe-fruit-free blocks adjacent to the center block to determine whether to use a single push operation or a serpentine push operation.

As shown in fig. 5a, ignoring the center block, a solid arrow in a block indicates that the block is occupied by unripe fruit, while a blank arrow indicates an unoccupied block; Nh is 5, which is greater than the predetermined threshold Th of 4, so a single push operation is selected to push the unripe fruit aside.

For a single push operation toward the unripe fruit, the push direction Ds of the gripper is calculated from the positions of the occupied blocks according to the following formula:

where O_i is the vector of the i-th occupied block within the largest group of adjacent occupied blocks and n is the total number of blocks within that group. The parameter r scales the norm of Ds, which should ensure that the gripper moves clear of the outer edge of the blocks, with r = 50 mm.

The arrows in fig. 5a show the calculated push direction for a single push operation; the gripper moves from the center of the unoccupied blocks to the center of the occupied blocks, so that the gripper has the highest probability of pushing all the unripe fruit aside.

If only the center block C_C is occupied, Ds is 0; the direction in which the gripper must move to push the unripe fruit is determined by calculating the shortest path from the current position of the gripper to the center of the center block C_C.

If no unripe fruit is detected in the layer, the gripper performs no pushing action at this stage and moves straight up from below.
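The formula for Ds is not reproduced in this text. The sketch below shows one reading that is consistent with the definitions above: the eight peripheral blocks are treated as a ring, Nh is the size of the largest group of adjacent unoccupied blocks, and Ds is the negated, normalized sum of the occupied-block vectors scaled to r, so that the push points from the unoccupied side toward the occupied side. The ring ordering, the sign, the unit block vectors, the scaling of the serpentine direction and all names are assumptions.

```python
import math

# Peripheral blocks in ring order (assumed), and vectors from each block to C_C.
RING = ["L_F", "C_F", "R_F", "R_C", "R_R", "C_R", "L_R", "L_C"]
TO_CENTER = {"L_F": (1, 1), "C_F": (0, 1), "R_F": (-1, 1), "R_C": (-1, 0),
             "R_R": (-1, -1), "C_R": (0, -1), "L_R": (1, -1), "L_C": (1, 0)}

def largest_group(flags):
    """Indices of the longest circular run of True values in flags."""
    n = len(flags)
    if all(flags):
        return list(range(n))
    best, run = [], []
    for i in range(2 * n):  # two passes so a run can wrap around the ring
        if flags[i % n]:
            run.append(i % n)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []
    return best

def scaled_sum(names, scale, sign):
    """Sum the block vectors, normalize, then scale (zero if nothing to sum)."""
    sx = sum(TO_CENTER[n][0] for n in names)
    sy = sum(TO_CENTER[n][1] for n in names)
    norm = math.hypot(sx, sy)
    if norm == 0:
        return (0.0, 0.0)
    return (sign * scale * sx / norm, sign * scale * sy / norm)

def plan_bottom_layer(occupied, th=4, r=50.0):
    """Choose the bottom-layer operation and its horizontal direction."""
    ring_occ = [name in occupied for name in RING]
    if not any(ring_occ) and "C_C" not in occupied:
        return ("none", (0.0, 0.0))       # nothing to push: move straight up
    free_group = largest_group([not f for f in ring_occ])
    if len(free_group) > th:              # Nh > Th: single push
        occ_group = [RING[i] for i in largest_group(ring_occ)]
        return ("single", scaled_sum(occ_group, r, -1.0))
    # Nh not above Th: serpentine push; scaling its direction to r is an assumption
    return ("snake", scaled_sum([RING[i] for i in free_group], r, 1.0))

# Example with Nh = 5: the three right and rear blocks are occupied.
operation, direction = plan_bottom_layer({"R_C", "R_R", "C_R"})
```

When only C_C is occupied, the occupied group on the ring is empty and the sketch returns a zero vector, matching the statement that Ds is 0 in that case.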

If the number Nh of unripe-fruit-free blocks adjacent to the center block is less than the threshold Th, the gripper uses a horizontal serpentine push operation. Fig. 5b shows an example of path calculation in which a serpentine operation is selected to push the unripe fruit from one side to the other. The red arrow is the general direction of the operation, while the blue arrow is the serpentine path. Since the serpentine operation involves movement in three directions (forward, left and right), the gripper can push the unripe fruit out in three directions.

The general direction of the snake push operation is calculated based on the position of the unoccupied block according to the following formula:

where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks, and m is the total number of blocks within that group. During a horizontal serpentine push operation, the device moves in the xy plane, the resultant vector of the serpentine motion is equal to Dz, and the amplitude ah of the serpentine motion and the number of pushes Nhp are determined according to the particular grasping scenario. For example, the effectiveness of these values may be affected by the stem length, the fruit weight or the damping ratio of the fruit, which are difficult to calculate. Here ah is 20 mm and Nhp is 5.
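The serpentine path itself can be sketched as a zigzag whose resultant equals Dz. The waypoint layout below (equal advance per push, alternating lateral offsets of ah, last waypoint back on the axis) is an assumption chosen only to satisfy the stated properties; the function name and the sample Dz are hypothetical.

```python
import math

def snake_waypoints(dz, amplitude, pushes):
    """Zigzag waypoints in the xy plane, relative to the start position.

    Each push advances by dz / pushes and is offset sideways by +amplitude or
    -amplitude in turn; the final waypoint has no offset, so the resultant
    displacement of the whole motion equals dz.
    """
    dx, dy = dz
    norm = math.hypot(dx, dy)
    if norm == 0 or pushes < 1:
        return []
    px, py = -dy / norm, dx / norm  # unit vector perpendicular to dz
    points = []
    for k in range(1, pushes + 1):
        if k == pushes:
            side = 0.0
        else:
            side = amplitude if k % 2 else -amplitude
        points.append((dx * k / pushes + side * px,
                       dy * k / pushes + side * py))
    return points

# ah = 20 mm and Nhp = 5 as stated above; the Dz vector is a made-up example.
path = snake_waypoints((50.0, 0.0), 20.0, 5)
```

For this example the waypoints alternate between y = +20 mm and y = -20 mm while x advances in 10 mm steps, ending at (50, 0).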

In particular, the second stage encloses the target fruit in the upper and lower middle layers 7, 8 and separates the unripe fruit in those layers.

As shown in fig. 6, the upward serpentine push operation used in the upper and lower middle layers 7, 8 consists of the gripper moving toward the target fruit in a substantially vertical direction while moving from side to side to pass the unripe fruit. The vertical direction passes through the center of the target fruit. The direction Du_z of the upward push in the xy plane is calculated from the maximum number Nu of unripe-fruit-free blocks adjacent to the center block. If Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks, as in the single push operation in the bottom layer 9.

Here au is a parameter for scaling the norm of Du_z, with au = 5 mm. If Nu is less than the threshold Th, as shown in fig. 7a, the calculation uses the unoccupied blocks according to the following formula:

where M is the intermediate vector for calculating Du_z. In fig. 7a, the gripper moves along Du_z and -Du_z to push the unripe fruit aside on both sides. The front view in fig. 7b shows the gripper moving up step by step at the left or right midpoint to pass through the lower middle layer 8 and the upper middle layer 7. The number of pushes nup in each layer is set to 5.
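One way to turn a given Du_z into gripper waypoints for a single middle layer is sketched below. It takes Du_z as an input rather than computing it, since the formulas are not reproduced in this text; the even spacing of the pushes in height, the alternation starting on the +Du_z side and the function name are assumptions.

```python
def upward_zigzag(center_xy, du_z, z_start, z_end, pushes):
    """Waypoints that rise through one layer while alternating between
    +du_z and -du_z about the vertical axis through the target fruit."""
    points = []
    for k in range(1, pushes + 1):
        sign = 1.0 if k % 2 else -1.0
        z = z_start + (z_end - z_start) * k / pushes
        points.append((center_xy[0] + sign * du_z[0],
                       center_xy[1] + sign * du_z[1],
                       z))
    return points

# au = 5 mm and 5 pushes per layer as stated above; the layer height of 40 mm
# and the direction of Du_z are made-up example values.
layer_path = upward_zigzag((0.0, 0.0), (5.0, 0.0), 0.0, 40.0, 5)
```

Running the same function once for the lower middle layer and once for the upper middle layer gives the two-step motion described for the middle layers.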

Specifically, during the third stage, if the center block C_C in the top layer is occupied, the gripper drags the target fruit to a gripping position with less unripe fruit.

As shown in fig. 8a, when unripe fruit is present in the top layer 6 above the target fruit, the gripper sometimes encloses the unripe fruit or damages it when moving upward to catch the target fruit. In addition, the unripe fruit may prevent the gripper from closing, leaving the stem of the target fruit uncut.

During the third stage, a drag operation is used that allows the gripper to grasp the target fruit without catching unwanted unripe fruit.

As shown in fig. 8, the drag operation includes an upward drag step, which moves the target fruit to an area containing less unripe fruit, and an upward push-back step, which pushes away the unripe fruit above before the fingers close, as shown in fig. 8c. The push-back step is necessary because, in the drag position shown in fig. 8b, the stem of the target fruit is inclined, which makes it difficult for the fruit to drop because of static forces and makes the fruit prone to damage when the gripper moves further up toward the cutting position.

The drag operation is performed only when unripe fruit is present in the center block C_C of the top layer. If the center block C_C is empty, the gripper moves directly upward to grasp the target fruit. Fig. 9 illustrates the calculation method for the drag operation shown in fig. 8. As shown in fig. 9a, to avoid collisions between the gripper and the table, the three blocks L_R, C_R and R_R close to the table are skipped when calculating the drag direction. The drag direction D_dr in the xy plane can then be determined according to the following formula:

where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks. The blocks used for the calculation are L_C, L_F, C_F, R_F and R_C. The parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks. The norm of D_dr is scaled to l, where l = 50 mm. The drag is directed toward where less unripe fruit is present; if all the blocks are occupied by unripe fruit, the drag direction is aligned with C_F. Fig. 9b shows the drag and push-back steps, in which the drag and push-back operations move up by the same height in the vertical direction.
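The formula for D_dr is not reproduced in this text. The sketch below assumes that the drag points from the center block toward the largest group of adjacent unoccupied blocks among L_C, L_F, C_F, R_F and R_C, which means negating the summed block-to-center vectors before scaling to l. The chain ordering, the sign, the unit vectors and the function name are assumptions.

```python
import math

# Blocks considered for the drag, in adjacency order; rear blocks are skipped.
CHAIN = ["L_C", "L_F", "C_F", "R_F", "R_C"]
# Vectors from each block to the center block C_C (assumed unit grid).
TO_CENTER = {"L_C": (1, 0), "L_F": (1, 1), "C_F": (0, 1),
             "R_F": (-1, 1), "R_C": (-1, 0)}

def drag_direction(occupied, length=50.0):
    """Horizontal drag vector for the top layer, or None when no drag is needed."""
    if "C_C" not in occupied:
        return None                      # center block empty: move straight up
    best, run = [], []
    for name in CHAIN:                   # longest run of unoccupied blocks
        if name not in occupied:
            run.append(name)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []
    if not best:
        best = ["C_F"]                   # all blocks occupied: align with C_F
    sx = -sum(TO_CENTER[n][0] for n in best)
    sy = -sum(TO_CENTER[n][1] for n in best)
    norm = math.hypot(sx, sy)            # never zero for a run within CHAIN
    return (length * sx / norm, length * sy / norm)

# Example: unripe fruit above the target and on the left side.
d_dr = drag_direction({"C_C", "L_C", "L_F"})
```

In the example the unoccupied group is C_F, R_F and R_C, so the drag points toward the right-front quadrant with a length of 50 mm.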

The construction of the convolutional neural network comprises the following steps:

step 1, data acquisition:

To establish a robust, generalized fruit identification model, images of fruit on the branch are captured under natural orchard conditions without any specific restriction: the lighting conditions, the shooting angle, the distance from the fruit and other conditions are completely unconstrained.

Step 2, data preparation:

One major drawback of deep neural networks is that they rely heavily on large amounts of labeled data to provide good accuracy. Large data sets help the training phase to learn all the embedded parameters and minimize the risk of overfitting the network. Preparing such a large number of images is very laborious, expensive and time-consuming.

More training data can be created from the existing samples through data augmentation, which effectively mitigates overfitting: the original image is transformed so that the new image still has the features of the original and visibly belongs to the same category. This improves the generalization of the model, since the model is not exposed to the same picture multiple times. Here, an automatic data augmentation method, including image cropping, horizontal flipping, rotation and brightness adjustment, is applied to generate 16 images from one image. After the generated images are revised and invalid images are deleted, for example images cropped from non-fruit areas, the total number of images in the data set, including the original data, is obtained.
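A minimal sketch of generating 16 variants from one image follows. The particular combination (2 flip states x 4 quarter-turn rotations x 2 brightness factors) is an assumption chosen to reach 16; cropping is omitted for brevity, and the image is a plain nested list of grey levels rather than a real RGB array.

```python
def flip_horizontal(img):
    """Mirror each row of the image."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale every pixel by factor, clipped to the 0..255 range."""
    return [[max(0, min(255, int(p * factor))) for p in row] for row in img]

def augment(img):
    """Return 16 variants: 2 flips x 4 rotations x 2 brightness factors."""
    variants = []
    for flipped in (False, True):
        base = flip_horizontal(img) if flipped else img
        for quarter_turns in range(4):
            rotated = base
            for _ in range(quarter_turns):
                rotated = rotate_90(rotated)
            for factor in (0.8, 1.2):  # hypothetical brightness factors
                variants.append(adjust_brightness(rotated, factor))
    return variants

sample = [[10, 20, 30],
          [40, 50, 60]]
augmented = augment(sample)
```

Invalid variants, such as crops that contain no fruit, would still need to be removed by inspection as described above.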

Augmentation is performed before the data are loaded into the network, for two reasons: first, the augmented images can easily be inspected for any outlying images; second, the model is less loaded, which reduces training time. The augmented images are then resized so that all input images have the same resolution.

The prepared data set is divided into two subsets for training and testing, with most of the data being randomly selected for training and the remainder of the data being selected for testing.

Step 3, construction of the convolutional neural network structure:

The convolutional neural network is a type of deep network that can automatically extract and classify features of RGB images. It is characterized by convolution operations, pooling layers and nonlinear activation functions. The general topology of a deep convolutional neural network includes a series of convolutional and pooling layers followed by some fully connected layers. The convolutional neural network structure here has three convolutional layers, three pooling layers and two fully connected layers; its recognition speed is very high and it requires less memory for training.

Convolutional network layers: convolutional networks can learn translation-invariant patterns and spatial hierarchies, so they can recognize a previously learned pattern anywhere in the image and learn increasingly complex patterns through successive layers. Convolutional networks are generally composed of three types of layers: convolutional layers, pooling layers and fully connected layers.

A convolutional layer is characterized by two parameters: the size of the filters and the number of filters. All three convolutional layers use 3 x 3 filters, and the numbers of filters are 16, 32 and 64, respectively.

To reduce the size of the feature map, one max pooling layer is placed after each convolutional layer. The max pooling layer has no trainable parameters; it simply reduces the number of features by selecting the maximum value in each window and discarding the other values. The first pooling layer uses 4 x 4 windows, while the second and third pooling layers use 2 x 2 windows.
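The effect of these layer choices on feature-map size and parameter count can be traced with a short sketch. It assumes an RGB input, convolutions that preserve spatial size ("same" padding), one bias per filter, and pooling with stride equal to the window; the 128 x 128 input size and the function names are hypothetical.

```python
def conv_params(kernel, in_channels, out_channels):
    """Trainable parameters of a conv layer: weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels

def trace_network(input_size, in_channels=3):
    """Feature-map size and parameter count after each conv + max-pool stage."""
    stages = [(16, 4), (32, 2), (64, 2)]  # (number of 3x3 filters, pool window)
    size, channels, total = input_size, in_channels, 0
    rows = []
    for filters, pool in stages:
        params = conv_params(3, channels, filters)
        size //= pool          # pooling shrinks each spatial dimension
        channels = filters
        total += params
        rows.append((filters, size, params))
    return rows, total

rows, total_conv_params = trace_network(128)
for filters, size, params in rows:
    print(f"{filters:>2} filters -> {size}x{size} map, {params} parameters")
```

Under these assumptions the three pooling stages reduce each spatial dimension by a factor of 16 overall, and the pooling layers themselves add no parameters.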

The convolution operation is followed by a supplementary rectification step, which further breaks the intrinsic linearity of the input image by outputting only non-negative values. In the convolutional network, all convolutional layers and the first fully connected layer use the rectified linear function as the activation function. The rectification function is f(x) = max(0, x).

A softmax activation function is set at the last layer of the model: σ(z)_j = exp(z_j) / Σ_{k=1..K} exp(z_k), for j = 1, ..., K,

where z is a vector of K inputs and j indexes the output units. This activation function is necessary for multi-class, single-label classification, since it normalizes the input data to a probability distribution.
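The two activation functions can be written in a few lines. This is a plain-Python sketch for illustration; subtracting the maximum before exponentiating is a standard numerical-stability step that does not change the result, and the sample logits are made up.

```python
import math

def relu(x):
    """Rectified linear unit: passes positive values, outputs 0 otherwise."""
    return max(0.0, x)

def softmax(z):
    """Normalize a vector of K inputs into a probability distribution."""
    m = max(z)                              # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                    # hypothetical last-layer inputs
probabilities = softmax(logits)
predicted_class = probabilities.index(max(probabilities))
```

The outputs are non-negative and sum to 1, and the largest input always receives the largest probability.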

Before entering the classification phase, a global average pooling layer is employed. The global average pooling layer contains no trainable parameters, so it markedly reduces the number of parameters while improving the accuracy and robustness of the model. It outputs the average of each feature map of the previous layer and takes the place of a flatten layer. The global average pooling layer is also used to compute a class activation map. The class activation map shows which regions of an image the convolutional neural network uses to identify a particular class, i.e., which regions in the image are associated with that class. The class activation map for a class is obtained by multiplying each feature map output by the last convolutional layer by its assigned weight and summing the results. The class activation map formula is as follows:

$$M_c(x, y) = \sum_{k} w_k^c \, f_k(x, y)$$

where $M_c$ is the class activation map for class c, $w_k^c$ is the kth weight corresponding to class c, and $f_k(x, y)$ is the kth feature map of the last convolutional layer.
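A minimal sketch of global average pooling and of the weighted sum behind the class activation map follows. It assumes the standard formulation in which the map for class c is the sum over k of the kth class weight times the kth feature map. The function names, the 2x2 feature maps, and the weights are hypothetical values chosen for illustration.

```python
def global_average_pool(feature_maps):
    """Return one average value per feature map (no trainable parameters)."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the feature maps: M_c(x, y) = sum_k w_k^c * f_k(x, y)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(class_weights, feature_maps):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


# Two hypothetical 2x2 feature maps from the last convolutional layer.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
weights_c = [0.5, 1.5]  # hypothetical weights for one class c

pooled = global_average_pool(feature_maps)
cam = class_activation_map(feature_maps, weights_c)
```

Large values in `cam` mark the image regions that contribute most to the score of class c.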

All filter features from the convolutional part of the network are encoded as input data to the fully connected classifier layers. A fully connected layer connects every neuron of the previous layer to every neuron of the current layer through a weight. The classification phase of the current model consists of two fully connected layers. The convolutional neural network predicts the class of an input image with a certain probability, and the error of this prediction is measured by means of a loss function. A categorical cross-entropy loss function is used to evaluate the accuracy of the proposed model; training minimizes the difference between the predicted probability distribution and the actual distribution of the target.
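The categorical cross-entropy loss mentioned above can be sketched as follows, assuming one-hot target labels and the standard definition (the negative sum of target probability times the logarithm of predicted probability). The function name, the `eps` clamp, and the sample values are illustrative assumptions.

```python
import math


def categorical_cross_entropy(true_one_hot, predicted_probs, eps=1e-12):
    """Loss between a one-hot target and a predicted probability distribution.

    eps is an assumed safeguard that prevents log(0).
    """
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(true_one_hot, predicted_probs))


# Hypothetical three-class example where the true class is the second one.
target = [0, 1, 0]
good_prediction = [0.1, 0.8, 0.1]
poor_prediction = [0.7, 0.2, 0.1]

loss_good = categorical_cross_entropy(target, good_prediction)
loss_poor = categorical_cross_entropy(target, poor_prediction)
```

The loss falls as the probability assigned to the true class rises, so `loss_good` is smaller than `loss_poor`.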

Step 4, network optimization:

The network is configured to load input images with their associated labels. The input images are divided into training data and test data: 80% are used for training and the remaining 20% for testing. 10% of the training data set is used as the validation data set.
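The split described above can be sketched as follows. The function name `split_dataset`, the fixed random seed, the shuffling step, and the sample count of 1000 are assumptions for illustration; the source gives only the proportions.

```python
import random


def split_dataset(samples, test_fraction=0.2, val_fraction=0.1, seed=0):
    """Split samples into training, validation and test subsets.

    The validation subset is taken from the training portion, as described
    in the text. Shuffling with a fixed seed is an assumption.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    train_all = shuffled[n_test:]

    n_val = round(len(train_all) * val_fraction)
    val = train_all[:n_val]
    train = train_all[n_val:]
    return train, val, test


# Hypothetical data set of 1000 labelled image indices.
train, val, test = split_dataset(range(1000))
```

With 1000 samples this yields 200 test samples, 80 validation samples, and 720 samples left for fitting the weights.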

Increasing the network depth can improve overall performance, and performance is highest when the number of training samples is proportional to the network capacity. The network with three convolutional layers performs best, so this structure is optimized further. The optimization process of the network is evaluated using different optimizers.

A robust model is thus built on a deep convolutional neural network; the model can identify multiple classes of fruit on branches from RGB images. The deep convolutional neural model consists of three convolutional layers and three max-pooling layers, followed by a global average pooling layer and two fully connected layers. Using the global average pooling layer eliminates the need for a flatten layer, improves the overall accuracy on unseen data, increases the classification metric scores, reduces the total number of trainable parameters, and speeds up processing. The network has a high fruit identification rate and classification accuracy, a fast response, robustness to natural conditions, and a small computational load. Using the deep convolutional neural network, the fruit picking robot can quickly and accurately identify target fruits and regions of interest, so that the number of overlooked fruits is minimized and the yield is maximized.

According to the control method of the picking device based on 3D visual perception, immature fruits are actively distinguished from the target fruit based on visual perception. Depending on the distribution of the immature fruits around the target fruit, either a single pushing operation or a snake-shaped (serpentine) pushing operation composed of several linear pushing operations is selected, so as to distinguish the immature fruits below the target fruit and at the same height as the target fruit. Because the pushing is multi-directional, densely clustered immature fruits can be handled, and the resulting side-to-side movement breaks the static contact force between the target fruit and the immature fruits, so that the gripper can receive the target fruit more easily. A subsequent pulling operation, which includes avoiding the immature fruits and actively pushing them away, solves the problem of falsely capturing immature fruits above the target fruit: the gripper pulls the target fruit to a location with fewer immature fruits and then pushes back to move the immature fruits aside for further differentiation. The method can thereby significantly improve picking performance, avoid damage to both the target fruit and the immature fruits, and greatly improve picking efficiency.

Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions should be covered by the protection scope of the present invention.

Claims

1. A picking device control method based on 3D visual perception, characterized by comprising the following steps:

S3, removing noise points from the background by the image acquisition device through 3D color threshold processing;

step S4 includes:

S42, transforming the coordinates from the camera frame to the device arm frame based on the extrinsic calibration of the camera;

in step S5, the bounding box of each block in the region of interest is cropped from the point cloud using the Point Cloud Library, and identification and calculation of immature fruits are performed on the corresponding block;

wherein the region of interest is a region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits; the region of interest is divided into four layers: a top layer (6), an upper intermediate layer (7), a lower intermediate layer (8) and a bottom layer (9); each layer of the region of interest is divided into nine cubic blocks, the blocks form a 3 x 3 grid, and the center of the grid is positioned at the horizontal midpoint of the target fruit, such that, in the xy plane, the center block C_C surrounds the target fruit; the length and width of the eight peripheral blocks are equal to the length and width of the center block; in a front view and in a left side view, the heights of the top layer (6) and the bottom layer (9) are equal to one and two times, respectively, the sum of the heights of the upper intermediate layer (7) and the lower intermediate layer (8); the gripper (4) is moved upwards to distinguish the immature fruit around the target fruit in the upper intermediate layer (7) and the lower intermediate layer (8); the distribution of the immature fruits in the upper intermediate layer (7) and the lower intermediate layer (8) can change along the height direction.

2. A picking device control method based on 3D visual perception according to claim 1, characterized in that the gripper (4) operates in three different stages: in the first stage, the gripper (4) grips from below, moving the unripe fruit horizontally in the bottom layer (9); in the second stage, the gripper (4) moves upwards to surround the target fruit and to distinguish immature fruit within the upper intermediate layer (7) and the lower intermediate layer (8); in the third stage, if the center block C_C in the top layer (6) is occupied, the gripper (4) can then drag the target fruit to a gripping position with fewer unripe fruits.

3. A picking device control method based on 3D visual perception according to claims 1-2, characterized in that the first stage distinguishes unripe fruit horizontally under the target fruit in the bottom layer (9), using the number Nh of blocks adjacent to the center block that contain no unripe fruit to determine whether to use a single pushing operation or a snake-shaped pushing operation;

4. A picking device control method based on 3D visual perception according to claims 1-3, characterized in that the second stage surrounds the target fruit in the upper intermediate layer (7) and the lower intermediate layer (8) and distinguishes the unripe fruit in these layers; the upward snake-shaped pushing operation employed in the upper intermediate layer (7) and the lower intermediate layer (8) involves movement of the gripper in a substantially vertical direction towards the target fruit and from side to side to pass the immature fruit; the vertical direction passes through the center of the target fruit; the upward push direction Du_z in the xy-plane is calculated based on the maximum number nu of blocks adjacent to the center block that contain no unripe fruit; if nu is greater than the threshold th, the direction Du_z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer (9), according to the following formula:

where au is a parameter for scaling the norm of Du_z, with au = 5 mm;

where M is the intermediate vector for calculating Du_z; the gripper moves along Du_z and -Du_z to push the unripe fruit aside.

5. A picking device control method based on 3D visual perception according to claims 1-4, characterized in that, during the third stage, if the center block C_C in the top layer is occupied, the gripper can then pull the target fruit to a gripping position with fewer unripe fruits;