CN112528826A - Picking device control method based on 3D visual perception - Google Patents

Picking device control method based on 3D visual perception

Info

Publication number
CN112528826A
CN112528826A (application CN202011414617.XA)
Authority
CN
China
Prior art keywords
fruit
target
gripper
block
immature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011414617.XA
Other languages
Chinese (zh)
Other versions
CN112528826B (en)
Inventor
唐玉新
唐双凌
徐陶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Academy of Agricultural Sciences
Original Assignee
Jiangsu Academy of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Academy of Agricultural Sciences
Priority to CN202011414617.XA
Publication of CN112528826A
Application granted
Publication of CN112528826B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/30: Noise filtering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/60: Type of objects
    • G06V 20/68: Food, e.g. fruit or vegetables
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Harvesting Machines For Specific Crops (AREA)

Abstract

The invention provides a picking device control method based on 3D visual perception, comprising the following steps: S1, the control device controls the picking device to walk along the lower part of the fruit support to be picked; S2, the control device identifies the target fruit through the image acquisition device of the picking device; S3, the image acquisition device removes noise points from the background through 3D color threshold processing; S4, the control device detects and localizes the ripe fruit using deep learning; S5, a region of interest is selected around the target fruit to determine whether unripe fruit is present; S6, the picking path is calculated based on the distribution and number of unripe fruits around the target fruit. Based on visual perception, the invention actively distinguishes unripe fruit from the target and, according to the distribution of unripe fruit around the target fruit, selects either a single pushing operation or a serpentine pushing operation composed of several linear pushes to clear the unripe fruit below the target fruit and at the same height as the target, which significantly improves sorting performance and greatly increases picking efficiency.

Description

Picking device control method based on 3D visual perception
Technical Field
The application relates to the field of intelligent robots, in particular to a control method of a picking device based on 3D visual perception.
Background
Fruit picking in China is generally done by hand; labor accounts for 50%-70% of the total cost of fruits and vegetables, and manual picking is costly, inefficient, and ill-suited to work at height. A fruit picking robot is a device that replaces manual labor and picks fruit automatically. Domestic fruit picking robots are still at an early stage of development, and most cannot yet meet growers' picking requirements. Moreover, current end effectors have no buffering stage when clamping fruit, and delicate fruit is easily damaged. In addition, because fruits ripen at different times, the robot must harvest selectively. Existing picking robots fall short at identifying and selecting a single ripe fruit without accidentally damaging or picking unripe fruit. Some methods explore the end effector's search space for feasible trajectories through image recognition and search algorithms, with each step of the trajectory planned by a collision detector. These are mostly passive methods whose goal is to avoid unripe fruit or other obstacles without changing the environment. However, unripe fruit cannot always be avoided: when the target fruit is completely surrounded by unripe fruit, the end effector may be unable to find any path that avoids all the unripe fruit, and thus cannot pick the target.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a picking device control method based on 3D visual perception.
The invention discloses a picking device control method based on 3D visual perception, comprising the following steps:
S1, the control device controls the picking device to walk along the lower part of the fruit support to be picked;
S2, the control device identifies the target fruit through the image acquisition device of the picking device;
S3, the image acquisition device removes noise points from the background through 3D color threshold processing;
S4, the control device detects and localizes the ripe fruit using deep learning;
S5, a region of interest is selected around the target fruit to determine whether unripe fruit is present;
S6, the picking path is calculated based on the distribution and number of unripe fruits around the target fruit.
In step S3, adjacent noise points are removed using hue, saturation and intensity color threshold processing.
step S4 includes:
s41, identifying and segmenting the object at the pixel level by using a segmented convolutional neural network; creating, over the network, a number of masks for the ripe fruit, wherein one mask represents the detected target fruit; projecting the mask into 3D points by matching with the depth image to obtain the 3D position of the target fruit in the coordinates of the camera frame;
s42, transforming the coordinates from the camera frame to an arm frame of the picking device based on the camera external calibration device;
in step S5, a bounding box of the block in each region of interest in the point cloud is clipped using the point cloud library, and identification and calculation of immature fruits are performed on the corresponding block.
Wherein the region of interest is a region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits. The region of interest is divided into four layers: a top layer, an upper middle layer, a lower middle layer and a bottom layer. Each layer is divided into nine cubic blocks forming a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the grid is centered on the central block CC, which surrounds the target fruit. The length and width of the eight peripheral blocks are equal to those of the central block. In the front and left side views, the heights of the top layer and the bottom layer are equal to one and two times the height of the middle layers, respectively. The gripper moves upward to clear the unripe fruit around the target fruit in the upper and lower middle layers; the distribution of unripe fruit in these layers may vary with height.
In particular, the gripper operates in three distinct phases: in the first phase, the gripper approaches from below, moving the unripe fruit in the bottom layer horizontally; in the second phase, the gripper moves upward to envelop the target fruit and clear the unripe fruit in the upper and lower middle layers; in the third phase, if the central block CC in the top layer is occupied, the gripper pulls the target fruit to a grasping position with less unripe fruit.
In particular, the first phase clears the unripe fruit horizontally below the target fruit in the bottom layer; the number Nh of blocks adjacent to the central block that are free of unripe fruit determines whether a single pushing operation or a serpentine operation is used.
Ignoring the central block, solid arrows mark blocks occupied by unripe fruit, and blank arrows mark unoccupied blocks; here Nh is 5, greater than the predetermined threshold Th of 4, so a single pushing operation is selected to push the unripe fruit aside. For a single pushing operation moving toward the unripe fruit, the direction of the gripper's push is calculated from the positions of the occupied blocks according to the following formula:
$$D_s = -r \cdot \frac{\sum_{i=1}^{n} O_i}{\left\| \sum_{i=1}^{n} O_i \right\|}$$
where Oi is the vector of the ith occupied block within the largest group of adjacent occupied blocks, and n is the total number of blocks in that group; the parameter r scales the norm of Ds and should ensure that the gripper starts from outside the blocks; r = 50 mm.
The gripper moves from the center of an unoccupied block to the center of an occupied block, so that it has the highest probability of pushing all the unripe fruit aside.
If only the central block CC is occupied, Ds = 0; the direction in which the gripper must move to push the unripe fruit is then determined by calculating the shortest path from the gripper's current position to the center of the central block CC. If no unripe fruit is detected in the layer, the gripper performs no pushing action at this stage and moves straight up from below.
If the number Nh of adjacent unripe-fruit-free blocks around the central block is less than the threshold Th, the gripper performs a horizontal serpentine push; the serpentine operation involves movement in three directions, forward, left and right, so the gripper can push the unripe fruit out in three directions. The general direction of the serpentine push is calculated from the positions of the unoccupied blocks according to the following formula:
$$D_z = \frac{\sum_{j=1}^{m} U_j}{\left\| \sum_{j=1}^{m} U_j \right\|}$$
where Uj is the vector of the jth unoccupied block within the largest group of adjacent unoccupied blocks, and m is the total number of blocks in that group; during the horizontal serpentine push, the device moves in the xy plane with the resultant vector of the serpentine motion equal to Dz, and the amplitude ah of the serpentine motion and the number of pushes Nhp are determined by the particular grasping scenario.
Specifically, the second phase envelops the target fruit in the upper and lower middle layers and clears the unripe fruit there; the upward serpentine push used in these layers combines movement of the gripper in a roughly vertical direction toward the target fruit with side-to-side movement to pass the unripe fruit; the vertical direction passes through the center of the target fruit. The direction of the upward push Du_z in the xy plane is calculated from the number Nu of blocks adjacent to the central block that are free of unripe fruit; if Nu is greater than the threshold Th, Du_z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer:
$$D_{u\_z} = -a_u \cdot \frac{\sum_{i=1}^{n} O_i}{\left\| \sum_{i=1}^{n} O_i \right\|}$$
where au is a parameter scaling the norm of Du_z; au = 5 mm.
If Nu is less than the threshold Th, the calculation uses the unoccupied blocks instead:
$$M = \frac{\sum_{j=1}^{m} U_j}{\left\| \sum_{j=1}^{m} U_j \right\|}, \qquad D_{u\_z} = a_u \cdot M$$
where M is the intermediate vector used to calculate Du_z. The gripper moves along Du_z and -Du_z to push the unripe fruit apart to the sides.
Specifically, during the third phase, if the central block CC in the top layer is occupied, the gripper pulls the target fruit to a grasping position with less unripe fruit.
The dragging operation is performed only when unripe fruit is present in the central block CC of the top layer. If the central block CC is empty, the gripper moves straight up to grasp the target fruit. To avoid collisions between the gripper and the table, the three blocks close to the table, LR, CR and RR, are skipped when calculating the drag direction; the drag direction Ddr in the xy plane can be determined according to the following formula:
$$D_{dr} = -l \cdot \frac{\sum_{j=1}^{m} U_j}{\left\| \sum_{j=1}^{m} U_j \right\|}$$
where Uj is the vector of the jth unoccupied block within the largest group of adjacent unoccupied blocks; the blocks used for the calculation are LC, LF, CF, RF and RC, and the parameter m is the total number of blocks in that group. Ddr is scaled to length l, where l = 50 mm. There is generally less unripe fruit in this region, but if all the blocks are occupied by unripe fruit, the drag direction is aligned with CF. The drag and push-back operations move up by the same height in the vertical direction.
According to the picking device control method based on 3D visual perception, obstacles are actively distinguished from the target based on visual perception. Depending on the distribution of unripe fruit around the target fruit, a single pushing operation or a serpentine pushing operation composed of several linear pushes is selected to clear the unripe fruit below the target fruit and at the same height as the target. The multi-directional pushing can handle denser unripe fruit, and the resulting side-to-side motion breaks the static contact forces between the target fruit and the unripe fruit, so that the gripper can receive the target fruit more easily. A subsequent pulling operation, which both avoids unripe fruit and actively pushes it away, solves the problem of falsely grasping unripe fruit above the target fruit: the gripper pulls the target fruit to a location with less unripe fruit and then pushes back to move the unripe fruit aside for further clearance. This significantly improves picking performance, avoids damage to the target fruit as well as the unripe fruit, and greatly increases picking efficiency.
Drawings
Fig. 1 is a flow chart of a picking device control method based on 3D visual perception of the present invention.
Fig. 2 is a flow chart of a picking device control method based on 3D visual perception of the present invention.
Fig. 3a-3c are schematic diagrams of image recognition of a picking device control method based on 3D visual perception according to the present invention.
Fig. 4a-4c are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.
Fig. 5a-5b are schematic diagrams of the picking process of a picking device control method based on 3D visual perception of the present invention.
Fig. 6a-6d are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.
Fig. 7a-7b are schematic diagrams of the picking process of a picking device control method based on 3D visual perception of the present invention.
Fig. 8a-8d are schematic diagrams of picking processes of a picking device control method based on 3D visual perception of the present invention.
Fig. 9a-9b are schematic diagrams of picking processes of a 3D visual perception based picking device control method of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
As shown in figs. 1-2, a picking device control method based on 3D visual perception comprises the following steps:
s1, the control device controls the picking device to walk along the lower part of the fruit support needing picking;
s2, the control device identifies the target fruit through the image acquisition device of the picking device,
s3, the image acquisition device removes noise points from the background through 3D color threshold processing;
s4, detecting and positioning the mature fruits by using deep learning through a control device;
s5, selecting an interested area around the target fruit to determine the existence of immature fruit;
s6, calculating a picking path based on the distribution and quantity of the unripe fruit around the target fruit.
In step S3, adjacent noise points are removed using hue, saturation and intensity color threshold processing.
step S4 includes:
s41, identifying and segmenting the object at the pixel level by using a segmented convolutional neural network; creating, over the network, a number of masks for the ripe fruit, wherein one mask represents the detected target fruit; projecting the mask into 3D points by matching with the depth image to obtain the 3D position of the target fruit in the coordinates of the camera frame;
s42, transforming the coordinates from the camera frame to an arm frame of the picking device based on the camera external calibration device;
in step S5, a bounding box of the block in each region of interest in the point cloud is clipped using the point cloud library, and identification and calculation of immature fruits are performed on the corresponding block.
In the image processing step, the first operation is to remove adjacent noise points using hue, saturation and intensity color threshold processing. Some sensed points around the ripe fruit come from the irrigation pipe or the rack, which sit some distance behind the fruit; inaccurate depth sensing connects some of these points to the fruit in front, so they are mistaken for unripe fruit. Thresholding on hue, saturation and intensity removes these neighboring noise points.
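A minimal sketch of the colour gate described above. The patent states only that hue, saturation and intensity thresholds are used; the red-hue band and the cutoff values below are illustrative assumptions.

```python
import colorsys

# Hypothetical HSV gates for ripe red fruit; the description does not give
# exact values, so these numbers are assumptions for illustration.
HUE_LO, HUE_HI = 0.92, 0.08     # wrap-around red band, h in [0, 1)
SAT_MIN, VAL_MIN = 0.4, 0.2

def remove_color_noise(points):
    """Keep only 3D points whose colour passes the HSI gate.

    points: iterable of (x, y, z, r, g, b) with r, g, b in [0, 1].
    """
    kept = []
    for x, y, z, r, g, b in points:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        red_hue = h >= HUE_LO or h <= HUE_HI    # hue wraps around 0
        if red_hue and s >= SAT_MIN and v >= VAL_MIN:
            kept.append((x, y, z, r, g, b))
    return kept
```

A greenish point from the rack or pipe fails the hue test and is dropped even when depth noise has attached it to the fruit.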
The second step is the detection and localization of ripe fruit: a segmentation convolutional neural network identifies and segments objects at the pixel level. Through the network, several masks are created for ripe fruit, with one mask representing the detected target fruit. By matching with the depth image, the mask is projected into 3D points, giving the 3D position of the target fruit in camera-frame coordinates. Thereafter, the coordinates are transformed from the camera frame to the picking device's arm frame based on the camera's external calibration.
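The mask-to-3D step can be sketched with a standard pinhole back-projection followed by a homogeneous transform. The intrinsic parameters, the depth accessor and the helper names are assumptions; the patent does not specify a camera model.

```python
def mask_to_camera_point(mask_pixels, depth, fx, fy, cx, cy):
    """Back-project mask pixels through a pinhole model and average them.

    mask_pixels: iterable of (u, v) pixel coordinates of one fruit mask.
    depth: callable (u, v) -> metric depth z (<= 0 where invalid).
    Returns the fruit centroid (X, Y, Z) in the camera frame, or None.
    """
    pts = []
    for u, v in mask_pixels:
        z = depth(u, v)
        if z <= 0:
            continue                       # skip invalid depth readings
        pts.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    if not pts:
        return None
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(3))

def camera_to_arm(point, T):
    """Apply a 4x4 homogeneous transform T (row-major nested lists)."""
    x, y, z = point
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))
```

The transform T would come from the camera's external calibration; here it is just a translation for illustration.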
The third step is the calculation of the unripe fruit.
The region of interest comprises the region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits. As shown in fig. 3, the region of interest is divided into four layers: a top layer 6, an upper middle layer 7, a lower middle layer 8 and a bottom layer 9. As shown in the top view of fig. 3, each layer is further divided into nine cubic blocks forming a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the grid is centered on the central block CC, which surrounds the target fruit. In plan view, the length and width of the eight peripheral blocks are equal to those of the central block. In the front and left side views, the heights of the top layer 6 and the bottom layer 9 are equal to one and two times the height of the middle layers, respectively. The gripper moves upward to clear the unripe fruit in the middle layers around the target fruit; the distribution of unripe fruit in the middle layers may vary with height.
To obtain a higher motion resolution, the middle layer is divided into an upper middle layer 7 and a lower middle layer 8, and the motion through the middle layers is divided into two steps. The central block of the top layer 6 is lower than the peripheral blocks of the same layer, at 80% of their height, because the object segmentation method does not include the green calyx. To avoid the calyx being detected as unripe fruit, the bottom of the central block in the top layer 6 is left blank.
To generate the clearing path, each block is assigned a horizontal vector pointing from that block toward the central block CC. The direction of each vector is determined by the block's position, so that all vectors point from the center of their own block to the center of the central block CC. The number of points N in a block of the point cloud determines whether unripe fruit is present in that block. With a camera of 1280 x 720 resolution, the thresholds of N for the top layer 6, the middle layers 7 and 8, and the bottom layer 9 are 200, 100, and 300, respectively.
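The block grid and the point-count occupancy test can be sketched as follows. The block names combine a column letter (L/C/R) with a row letter (F/C/R), matching the labels LC, CF, LR, etc. used later in the description; the mapping of the three listed thresholds onto the four layers (both middle layers sharing 100) is an assumption.

```python
import math

COLS = ("L", "C", "R")          # left / centre / right  (x axis)
ROWS = ("F", "C", "R")          # front / centre / rear  (y axis)
LAYERS = ("bottom", "lower_mid", "upper_mid", "top")     # z, bottom-up
# Point-count thresholds N per layer; sharing 100 between the two middle
# layers is an assumption made for this sketch.
N_THRESH = {"top": 200, "upper_mid": 100, "lower_mid": 100, "bottom": 300}

def block_of(point, center, block_w, layer_bounds):
    """Map a 3D point to (block_name, layer) on the 3x3 grid, or None.

    center: (x, y) horizontal midpoint of the target fruit.
    block_w: edge length of one block.
    layer_bounds: ascending z boundaries [z0, z1, z2, z3, z4].
    """
    col = math.floor((point[0] - center[0]) / block_w + 1.5)
    row = math.floor((point[1] - center[1]) / block_w + 1.5)
    if not (0 <= col < 3 and 0 <= row < 3):
        return None                     # outside the region of interest
    for i in range(4):
        if layer_bounds[i] <= point[2] < layer_bounds[i + 1]:
            return COLS[col] + ROWS[row], LAYERS[i]
    return None

def occupied_blocks(points, center, block_w, layer_bounds):
    """Blocks whose point count reaches the layer threshold N."""
    counts = {}
    for p in points:
        hit = block_of(p, center, block_w, layer_bounds)
        if hit:
            counts[hit] = counts.get(hit, 0) + 1
    return {k for k, n in counts.items() if n >= N_THRESH[k[1]]}
```

A block is declared occupied by unripe fruit only once its point count reaches the threshold for its layer, which filters out stray depth noise.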
The gripper operates in three distinct phases: in the first phase, it grasps from below, moving the unripe fruit in the bottom layer 9 horizontally; in the second phase, it moves upward to envelop the target fruit and clear the unripe fruit in the middle layers; in the third phase, if the central block CC in the top layer is occupied, the gripper pulls the target fruit to a grasping position with less unripe fruit.
In particular, the first phase clears the unripe fruit horizontally below the target fruit in the bottom layer 9; the number Nh of blocks adjacent to the central block that are free of unripe fruit determines whether a single pushing operation or a serpentine operation is used.
As shown in fig. 5a, ignoring the central block, solid arrows mark blocks occupied by unripe fruit, while blank arrows mark unoccupied blocks; here Nh is 5, greater than the predetermined threshold Th of 4, so a single pushing operation is selected to push the unripe fruit aside.
For a single pushing operation moving toward the unripe fruit, the direction of the gripper's push is calculated from the positions of the occupied blocks according to the following formula:
$$D_s = -r \cdot \frac{\sum_{i=1}^{n} O_i}{\left\| \sum_{i=1}^{n} O_i \right\|}$$
where Oi is the vector of the ith occupied block within the largest group of adjacent occupied blocks, and n is the total number of blocks in that group. The parameter r scales the norm of Ds and should ensure that the gripper starts from outside the blocks; r = 50 mm.
The arrows in fig. 5a show the calculated push direction for a single pushing operation; the gripper moves from the center of an unoccupied block to the center of an occupied block, so that it has the highest probability of pushing all the unripe fruit aside.
If only the central block CC is occupied, Ds = 0; the direction in which the gripper must move to push the unripe fruit is then determined by calculating the shortest path from the gripper's current position to the center of the central block CC.
If no unripe fruit is detected in the layer, the gripper performs no pushing action at this stage and moves straight up from below.
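A sketch of the first-phase decision and the single-push direction Ds. Following the description's convention, each block carries a vector pointing toward the central block CC. Two simplifications are assumptions of this sketch: all occupied peripheral blocks are summed rather than only the largest adjacent group, and the sign convention (Ds points toward the occupied side) is a reconstruction of the garbled formula.

```python
import math

# Grid offsets of the eight peripheral blocks relative to the centre block
# CC, in block units (x: L/C/R, y: F/C/R with the rear row toward -y).
_OFFSETS = {"LF": (-1, 1), "CF": (0, 1), "RF": (1, 1),
            "LC": (-1, 0),               "RC": (1, 0),
            "LR": (-1, -1), "CR": (0, -1), "RR": (1, -1)}

def block_vector(name):
    """Unit vector from the named block's centre toward CC."""
    dx, dy = _OFFSETS[name]
    d = math.hypot(dx, dy)
    return (-dx / d, -dy / d)

def use_single_push(occupied, th=4):
    """True when Nh (unripe-fruit-free neighbour blocks) exceeds Th."""
    nh = sum(1 for b in _OFFSETS if b not in occupied)
    return nh > th

def single_push_direction(occupied, r=50.0):
    """Push direction Ds in mm, aimed toward the occupied blocks.

    Returns (0, 0) when only the centre block CC is occupied.
    """
    sx = sum(block_vector(b)[0] for b in occupied if b != "CC")
    sy = sum(block_vector(b)[1] for b in occupied if b != "CC")
    n = math.hypot(sx, sy)
    if n == 0:
        return (0.0, 0.0)               # Ds = 0: only CC occupied
    return (-r * sx / n, -r * sy / n)   # toward the occupied side
```

With unripe fruit only in LC, the gripper is sent 50 mm toward the left block, sweeping the fruit aside before the upward approach.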
If the number Nh of adjacent unripe-fruit-free blocks around the central block is less than the threshold Th, the gripper uses a horizontal serpentine pushing operation. Fig. 5b shows an example of the path calculation when a serpentine operation is selected to push the unripe fruit from one side to the other. The red arrow is the general direction of the operation, while the blue arrows trace the serpentine path. Since the serpentine operation involves movement in three directions, forward, left and right, the gripper can push the unripe fruit out in three directions.
The general direction of the snake push operation is calculated based on the position of the unoccupied block according to the following formula:
$$D_z = \frac{\sum_{j=1}^{m} U_j}{\left\| \sum_{j=1}^{m} U_j \right\|}$$
where Uj is the vector of the jth unoccupied block within the largest group of adjacent unoccupied blocks, and m is the total number of blocks in that group. During the horizontal serpentine push, the device moves in the xy plane with the resultant vector of the serpentine motion equal to Dz; the amplitude ah of the serpentine motion and the number of pushes Nhp are determined by the particular grasping scenario. For example, the effectiveness of these values can be affected by stem length, fruit weight or the damping ratio of the fruit, which are difficult to calculate. Here ah = 20 mm and Nhp = 5.
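The serpentine push can be sketched as waypoint generation: advance along the general direction Dz while alternating sideways by the amplitude ah. The total travel distance is a hypothetical parameter of this sketch; ah = 20 mm and Nhp = 5 follow the text.

```python
import math

def general_direction(unoccupied_vectors):
    """Dz: normalized sum of the unoccupied-block vectors Uj."""
    sx = sum(v[0] for v in unoccupied_vectors)
    sy = sum(v[1] for v in unoccupied_vectors)
    n = math.hypot(sx, sy)
    return (sx / n, sy / n)

def serpentine_waypoints(dz, start, travel, ah=20.0, nhp=5):
    """Horizontal serpentine push waypoints in the xy plane.

    dz: unit general direction; travel: total advance along dz (mm,
    assumed here); ah: side amplitude (mm); nhp: number of pushes.
    """
    px, py = -dz[1], dz[0]              # perpendicular for the side moves
    pts = []
    for k in range(1, nhp + 1):
        along = travel * k / nhp
        side = ah if k % 2 else -ah     # alternate left / right
        pts.append((start[0] + dz[0] * along + px * side,
                    start[1] + dz[1] * along + py * side))
    # finish on the centreline so the resultant vector equals travel * dz
    pts.append((start[0] + dz[0] * travel, start[1] + dz[1] * travel))
    return pts
```

The final waypoint lies back on the centreline, so the resultant displacement matches the general direction regardless of the zig-zag.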
In particular, the second phase envelops the target fruit in the upper and lower middle layers 7, 8 and clears the unripe fruit there.
As shown in fig. 6, the upward serpentine push used in the upper and lower middle layers 7, 8 combines movement of the gripper in a roughly vertical direction toward the target fruit with side-to-side movement to pass the unripe fruit; the vertical direction passes through the center of the target fruit. The direction of the upward push Du_z in the xy plane is calculated from the number Nu of blocks adjacent to the central block that are free of unripe fruit. If Nu is greater than the threshold Th, Du_z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer 9:
$$D_{u\_z} = -a_u \cdot \frac{\sum_{i=1}^{n} O_i}{\left\| \sum_{i=1}^{n} O_i \right\|}$$
where au is a parameter scaling the norm of Du_z; au = 5 mm. If Nu is less than the threshold Th, as shown in fig. 7a, the calculation uses the unoccupied blocks instead:
$$M = \frac{\sum_{j=1}^{m} U_j}{\left\| \sum_{j=1}^{m} U_j \right\|}, \qquad D_{u\_z} = a_u \cdot M$$
where M is the intermediate vector used to calculate Du_z. In fig. 7a, the gripper moves along Du_z and -Du_z to push the unripe fruit apart to the sides. The front view in fig. 7b shows the gripper moving gradually to the left or right midpoint to pass through the lower middle layer 8 and the upper middle layer 7. The number of pushes Nup in each layer is set to 5.
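A sketch of the second-phase direction selection: the occupied blocks set Du_z when Nu exceeds Th, otherwise the intermediate vector M from the unoccupied blocks is used. The block vectors are passed in precomputed; the signs and normalization are reconstructions of the garbled formulas, not the patent's exact expressions.

```python
import math

def _norm_sum(vectors):
    """Normalized sum of 2D vectors, or (0, 0) for an empty/zero sum."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    n = math.hypot(sx, sy)
    return (sx / n, sy / n) if n else (0.0, 0.0)

def upward_push_direction(nu, occupied_vecs, unoccupied_vecs,
                          th=4, au=5.0):
    """Du_z in the xy plane for the middle-layer serpentine push.

    nu: number of blocks adjacent to CC that are free of unripe fruit.
    occupied_vecs / unoccupied_vecs: block-to-centre vectors Oi / Uj.
    """
    if nu > th:
        dx, dy = _norm_sum(occupied_vecs)
        return (-au * dx, -au * dy)     # toward the occupied side
    m = _norm_sum(unoccupied_vecs)      # intermediate vector M
    return (au * m[0], au * m[1])
```

The gripper then alternates along Du_z and -Du_z while rising, which is the side-to-side component of the upward serpentine push.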
Specifically, during the third phase, if the central block CC in the top layer is occupied, the gripper pulls the target fruit to a grasping position with less unripe fruit.
As shown in fig. 8a, when unripe fruit is present in the top layer 6 above the target fruit, the gripper sometimes encloses the unripe fruit, or damages it while moving upward to catch the target fruit. In addition, the unripe fruit may prevent the enclosing fingers from closing, leaving the stem of the target fruit uncut.
During the third stage, a pulling operation is employed that allows the gripper to grasp the target fruit without catching unwanted immature fruit.
As shown in fig. 8, the dragging operation comprises an upward dragging step that moves the target fruit into an area containing less unripe fruit, and an upward push-back step that pushes the upper unripe fruit away before the fingers close, as shown in fig. 8c. The push-back step is necessary because, in the pulling position shown in fig. 8b, the stem of the target fruit is inclined; static forces then make the fruit difficult to detach, and it is prone to damage when the gripper moves further up toward the cutting position.
The dragging operation is performed only when immature fruit exists in the center block CC of the top layer. If the center block CC is empty, the gripper moves directly upward to grab the target fruit. Fig. 9 illustrates the calculation method of the drag operation corresponding to fig. 8. As shown in fig. 9a, in order to avoid collision between the gripper and the table, the three blocks LR, CR and RR close to the table are skipped when calculating the drag direction. The drag direction Ddr in the xy plane can then be determined according to the following formula:
Ddr = l · Σ(j=1..m) Uj / ‖Σ(j=1..m) Uj‖
where Uj is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks, and the parameter m is the total number of blocks in that group. The blocks used for the calculation are LC, LF, CF, RF and RC. Ddr is scaled to l, where l is 50 mm. There is generally less immature fruit in this region, but if all the blocks are occupied by immature fruit, the drag direction is aligned with CF. Fig. 9b shows the drag and push-back steps; the drag and push-back operations move up the same height in the vertical direction.
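A minimal sketch of the drag-direction calculation, under the same caveats: the helper name, the fallback vector standing in for the CF direction, and the example inputs are assumptions, not the patent's code.

```python
import math

def drag_direction(unoccupied_vectors, l=50.0, fallback=(0.0, 1.0)):
    """Drag direction Ddr in the xy plane, scaled to l = 50 mm.

    `unoccupied_vectors` are the vectors Uj of the m unoccupied blocks in
    the largest adjacent unoccupied group among LC, LF, CF, RF, RC (the
    three table-side blocks LR, CR, RR are skipped by the caller).  When
    every block is occupied by immature fruit, the direction falls back
    to the CF block direction, assumed here to be the unit vector
    `fallback`.
    """
    if not unoccupied_vectors:  # all blocks occupied: align with CF
        return (l * fallback[0], l * fallback[1])
    sx = sum(u[0] for u in unoccupied_vectors)
    sy = sum(u[1] for u in unoccupied_vectors)
    norm = math.hypot(sx, sy)
    return (l * sx / norm, l * sy / norm)

# Two unoccupied blocks: drag toward their combined direction, scaled to 50 mm.
d = drag_direction([(0.0, 30.0), (30.0, 30.0)])
```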
The construction of the convolutional neural network comprises the following steps:
step 1, data acquisition:
To establish a broadly generalized fruit identification model, images of fruit on branches are captured under the natural conditions of the orchard, without any specific restriction: the light conditions, shooting angle, distance from the fruit and other factors are completely unconstrained.
Step 2, data preparation:
one major drawback of deep neural networks is that they rely heavily on large amounts of label data to provide good accuracy. These large data sets help the training phase to learn all the embedding parameters and minimize the risk of overfitting the network. Preparing such a large number of images is very laborious, expensive and time-consuming.
Data augmentation can create more training data from the existing samples and effectively mitigate overfitting: transformations are applied to the original image so that each new image still carries the features of the original and is visually classified into the same category. This improves the generality of the model, since the network is not exposed to the identical picture multiple times. In this study, an automatic data augmentation method, including image cropping, horizontal flipping, rotation and brightness operations, was applied to generate 16 images from one image. After modifying the generated images and deleting invalid ones, e.g. images cropped from non-fruit areas, the total number of images in the data set, including the original data, is obtained.
The augmentation is performed before the data are loaded into the network: first, the augmented images can easily be inspected for any possible outlier images; second, the load on the model is reduced, which shortens training time. The augmented images are then resized so that all input images have the same resolution.
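As a hedged illustration of the augmentation step, the following sketch applies the four named operations (cropping, horizontal flip, rotation, brightness) with NumPy. The concrete crop size, rotation angle and brightness shift are assumptions, since the patent does not specify them.

```python
import numpy as np

def augment(image):
    """Generate augmented variants of an RGB image array (H, W, 3), uint8.

    A simplified stand-in for the described pipeline; real augmentation
    would sample many parameter combinations to reach 16 images per input.
    """
    h, w, _ = image.shape
    variants = []
    variants.append(image[h // 4: 3 * h // 4, w // 4: 3 * w // 4])  # center crop
    variants.append(image[:, ::-1])                                 # horizontal flip
    variants.append(np.rot90(image))                                # 90-degree rotation
    bright = np.clip(image.astype(np.int16) + 40, 0, 255).astype(np.uint8)
    variants.append(bright)                                         # brightness shift
    return variants

img = np.zeros((64, 64, 3), dtype=np.uint8)
out = augment(img)
```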
The prepared data set is divided into two subsets for training and testing, with most of the data being randomly selected for training and the remainder of the data being selected for testing.
And 3, constructing a structure of the convolutional neural network:
the convolutional neural network is a subset of a deep network, and can automatically extract and classify the features of the RGB image. It has the characteristics of convolution operation, pool layer, nonlinear activation function and the like. The general topology of deep convolutional neural networks includes a series of convolutional and pooling layers, as well as some fully-connected layers. The convolutional neural network structure has three conversion layers, the model has three protection layers and two fully connected layers, the recognition speed is very high, and less memory is required for training.
The convolutional network layers: convolutional networks can learn translation-invariant and spatially hierarchical patterns, so they can recognize a learned pattern anywhere in the image and learn increasingly complex patterns through successive layers. A convolutional network is generally composed of three types of layers: convolutional layers, pooling layers, and fully connected layers.
The convolutional layer is characterized by two parameters: the size of the filter and the number of filters. All three convolutional layers use a 3 x 3 filter, and the numbers of filters are 16, 32 and 64, respectively.
To reduce the size of the feature map, one max-pooling layer is placed after each convolutional layer. The max pooling layer has no trainable parameters and can only reduce the number of features by selecting the maximum value in each window and discarding other values. The first pooling layer used 4x4 windows, while the second and third pooling layers used 2x2 windows.
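A minimal NumPy sketch of non-overlapping max pooling as described above; the divisibility requirement and function name are simplifying assumptions of this illustration.

```python
import numpy as np

def max_pool(feature_map, window):
    """Non-overlapping max pooling with a square window.

    Keeps the maximum value in each window and discards the rest; in this
    simplified sketch the map size must be divisible by the window size.
    """
    h, w = feature_map.shape
    view = feature_map.reshape(h // window, window, w // window, window)
    return view.max(axis=(1, 3))

fm = np.arange(16.0).reshape(4, 4)
pooled = max_pool(fm, 2)  # 4x4 map -> 2x2 map, as with the 2x2 windows
```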
The convolution operation is followed by a rectification step, which introduces nonlinearity by outputting only non-negative values. In the network, all convolutional layers as well as the first fully connected layer use the rectification function as the activation function. The rectification function is:
f(x) = max(0, x)
An activation function is set at the last layer of the model, as follows:
σ(z)_j = exp(z_j) / Σ(k=1..K) exp(z_k),  j = 1, …, K
where z is a vector of K inputs and j represents an output unit. The activation function is necessary for multi-class, single-label classification, normalizing the input data to a probability distribution.
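The two activation functions can be illustrated directly in Python, matching the formulas above; the max-subtraction inside the softmax is a standard numerical safeguard, not part of the patent's formula, and does not change the result.

```python
import math

def relu(x):
    """Rectification function f(x) = max(0, x)."""
    return max(0.0, x)

def softmax(z):
    """Normalize a vector of K inputs to a probability distribution."""
    m = max(z)                                # numerical-stability shift
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # largest input gets the largest probability
```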
Before the classification stage, a global average pooling layer is employed. The global average pooling layer contains no trainable parameters, so it significantly reduces the parameter count, improves model accuracy, and noticeably improves robustness. It averages the output of each feature map of the previous layer, replacing an embedded flattening layer. The global average pooling layer is also used to compute the classification activation map, which shows which regions of the image the convolutional neural network associates with a particular class. The classification activation map of a class is obtained as the weighted sum of the output feature maps of the last convolutional layer. The formula is as follows:
Mc(x, y) = Σ_k w_k^c · f_k(x, y)

where Mc is the classification activation map for category c, w_k^c is the k-th weight corresponding to class c, and f_k(x, y) is the k-th feature map of the last convolutional layer.
All filter features in the entire convolutional network are encoded as input data for the fully connected classifier layers. A fully connected layer connects all neurons of the previous layer to the current layer through weights. The classification stage of the current model consists of two fully connected layers. The convolutional neural network predicts the class of an input image with a certain probability. The error of this process is measured with a loss function: a categorical cross-entropy loss is used to evaluate the accuracy of the proposed model, minimizing the difference between the predicted probability distribution and the actual distribution of the target.
Step 4, network optimization:
The network is configured to load input images with associated labels. The input images are divided into training data and test data: 80% are used for training and the remaining 20% for testing. 10% of the training data set is used as the validation data set.
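The 80/20/10 split described above can be sketched as follows; the shuffling, seed and helper name are assumptions, since the patent only specifies the proportions and random selection.

```python
import random

def split_dataset(samples, seed=0):
    """80/20 train/test split, then 10% of the training set for validation."""
    items = list(samples)
    random.Random(seed).shuffle(items)       # random selection, reproducible
    n_train = int(len(items) * 0.8)
    train, test = items[:n_train], items[n_train:]
    n_val = int(len(train) * 0.1)
    val, train = train[:n_val], train[n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))  # 72 / 8 / 20 samples
```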
Increasing the network depth can improve overall performance; performance is highest when the number of training samples is proportional to the network capacity. The three-convolutional-layer configuration performed best, and the structure was further optimized. The optimization process of the network was evaluated using different optimizers.
A robust model is built on a deep convolutional neural network that can identify multiple classes of fruit on branches from RGB images. The model consists of three convolutional layers and three max pooling layers, followed by a global average pooling layer and two fully connected layers. Using the global average pooling layer eliminates the need for a flattening layer, improves overall accuracy on unseen data, increases the classification index score, reduces the total number of trainable parameters, and speeds up processing. The network offers a high fruit recognition rate and classification accuracy, fast response, robustness to natural conditions and a small computational load; using this deep convolutional neural network, the fruit picking robot can quickly and accurately identify target fruits and regions of interest, so that overlooked fruits are minimized and yield is maximized.
According to the control method of the picking device based on 3D visual perception, immature fruits are actively separated from the target based on visual perception. A single push operation, or a serpentine push operation composed of several linear pushes, is selected according to the distribution of immature fruits around the target fruit, to move aside immature fruits below the target fruit and at the same height as the target. The multi-directional pushing can handle dense immature fruit, and the resulting left-right movement can break the static contact force between the target fruit and the immature fruit, so that the gripper can receive the target fruit more easily. A subsequent pulling operation, which includes avoiding the unripe fruit and actively pushing it away, solves the problem of falsely capturing unripe fruit above the target fruit: the gripper pulls the target fruit to a location with less unripe fruit and then pushes back to move the unripe fruit aside for further separation. This significantly improves picking performance, avoids damage to both the target fruit and the unripe fruit, and greatly improves picking efficiency.
Finally, the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered in the protection scope of the present invention.

Claims (5)

1. A picking device control method based on 3D visual perception, characterized in that the method comprises the following steps:
s1, the control device controls the picking device to walk along the lower part of the fruit support needing picking;
s2, the control device identifies the target fruit through the image acquisition device of the picking device,
s3, removing noise points from the background by the image acquisition device through 3D color threshold processing;
s4, detecting and positioning the mature fruits by using deep learning through a control device;
s5, selecting an interested area around the target fruit to determine the existence of immature fruit;
and S6, calculating the picking path based on the distribution and the quantity of the immature fruits around the target fruits.
In step S2, adjacent noise points are removed by using hue, saturation and intensity color threshold processing.
step S4 includes:
s41, identifying and segmenting the object at the pixel level by using a segmented convolutional neural network; creating, over the network, a number of masks for the ripe fruit, wherein one mask represents the detected target fruit; projecting the mask into 3D points by matching with the depth image to obtain the 3D position of the target fruit in the coordinates of the camera frame;
s42, transforming the coordinates from the camera frame to a device arm frame based on the camera external calibration device;
In step S5, a bounding box of a block in each region of interest in the point cloud is clipped using the point cloud library, and identification and counting of immature fruits are performed on the corresponding block;
wherein the region of interest is a region of the 3D point cloud containing the target fruit and potentially one or more unripe fruits; the region of interest is divided into four layers: a top layer (6), an upper intermediate layer (7), a lower intermediate layer (8) and a bottom layer (9); each layer of the region of interest is divided into nine cubic blocks forming a 3 x 3 grid, the center of the grid being positioned at the horizontal midpoint of the target fruit, so that in the xy plane the center block CC surrounds the target fruit; the length and width of the eight peripheral blocks are equal to the length and width of the center block; the heights of the top layer (6) and the bottom layer (9) are respectively equal to one and two times the sum of the heights of the upper intermediate layer (7) and the lower intermediate layer (8), in front view and in left side view; the gripper (4) moves upwards to distinguish the immature fruit around the target fruit in the upper intermediate layer (7) and the lower intermediate layer (8), because the distribution of the immature fruits in the upper intermediate layer (7) and the lower intermediate layer (8) can change along the height direction.
2. The picking device control method based on 3D visual perception according to claim 1, characterized in that the gripper (4) operates in three different phases: in a first phase, the gripper (4) grips from below, moving the unripe fruit horizontally in the bottom layer (9); during a second phase, the gripper (4) moves upwards to surround the target fruit and to distinguish immature fruit within the upper intermediate layer (7) and the lower intermediate layer (8); during the third phase, if the center block CC in the top layer (6) is occupied, the gripper (4) drags the target fruit to a gripping position with less unripe fruit.
3. The picking device control method based on 3D visual perception according to claims 1-2, characterized in that the first stage distinguishes unripe fruit horizontally under the target fruit in the bottom layer (9), using the number Nh of blocks adjacent to the center block that contain no unripe fruit to determine whether to use a single push operation or a serpentine operation;
ignoring the center block, the solid arrows in the figure indicate blocks occupied by unripe fruit, and the blank arrows indicate unoccupied blocks; when Nh is 5, greater than the predetermined threshold Th of 4, a single push operation is selected to push the unripe fruit aside; the single push operation moves towards the unripe fruit, and the direction of the gripper's push is calculated from the positions of the occupied blocks according to the following formula:
Ds = r · Σ(i=1..n) Oi / ‖Σ(i=1..n) Oi‖
where Oi is the vector of the i-th occupied block within the largest group of adjacent occupied blocks, and n is the total number of blocks in that group; the parameter r is used to scale the norm of Ds and should ensure that the gripper moves clear of the block; r is 50 mm;
the gripper moves from the center of an unoccupied block towards the center of an occupied block, so that the gripper has the highest probability of pushing all the unripe fruit aside;
if only the center patch Cc is occupied, Ds is 0; the direction in which the gripper must move to push the unripe fruit is determined by calculating the shortest path from the current position of the gripper to the center of the center block CC. If no unripe fruit is detected in the section, the gripper has no pushing action at this stage and moves straight up from below.
If the number Nh of blocks adjacent to the center block with no unripe fruit is less than the threshold Th, the gripper operates with a horizontal serpentine push; the serpentine operation involves movement in three directions, namely forward, left and right, the gripper pushing the immature fruit out in these three directions; the general direction of the serpentine push operation is calculated based on the positions of the unoccupied blocks according to the following formula:
Dz = −Σ(j=1..m) Uj / ‖Σ(j=1..m) Uj‖
where Uj is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks, and m is the total number of blocks in that group; during the horizontal serpentine pushing operation, the device moves in the xy plane, with the resultant vector of the serpentine motion equal to Dz; the amplitude ah of the serpentine motion and the number of pushes Nhp are determined according to the particular grabbing scenario.
4. The picking device control method based on 3D visual perception according to claims 1-3, characterized in that the second stage surrounds the target fruit in the upper intermediate layer (7) and the lower intermediate layer (8) and distinguishes the unripe fruit in the intermediate layers; the upward serpentine pushing operation employed in the upper intermediate layer (7) and the lower intermediate layer (8) involves movement of the gripper in a substantially vertical direction towards the target fruit and from side to side to pass over the immature fruit; the vertical direction passes through the center of the target fruit; the direction of the upward push Du_z in the xy plane is calculated based on the number Nu of blocks adjacent to the center block that contain no unripe fruit; if Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks according to the following formula, as in the single push operation in the bottom layer (9):
Du_z = au · Σ(i=1..n) Oi / ‖Σ(i=1..n) Oi‖
where au is a parameter for scaling the norm of Du_z; au is 5 mm;
if Nu is less than the threshold Th, then the calculation uses the unoccupied block, which is calculated by the following formula:
M = Σ(j=1..m) Uj,  Du_z = −au · M / ‖M‖
where M is the intermediate vector for calculating Du _ z; the gripper moves along Du _ z and-Du _ z to push the sides of the unripe fruit apart.
5. The picking device control method based on 3D visual perception according to claims 1-4, characterized in that during the third phase, if the center block CC in the top layer is occupied, the gripper pulls the target fruit to a gripping position with less unripe fruit;
the dragging operation is performed only when immature fruit exists in the center block CC of the top layer; if the center block CC is empty, the gripper moves directly upwards to grab the target fruit; in order to avoid collision between the gripper and the table, the three blocks LR, CR and RR close to the table are skipped when calculating the drag direction; the drag direction Ddr in the xy plane can be determined according to the following formula:
Figure FDA0002816176270000033
where Uj is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks; the blocks used for the calculation are LC, LF, CF, RF and RC; the parameter m is the total number of blocks in that group; Ddr is scaled to l, where l is 50 mm; there is generally less immature fruit in this region, but if all the blocks are occupied by immature fruit, the drag direction is aligned with CF; wherein the drag and push-back operations move up the same height in the vertical direction.
CN202011414617.XA 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception Active CN112528826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011414617.XA CN112528826B (en) 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception


Publications (2)

Publication Number Publication Date
CN112528826A true CN112528826A (en) 2021-03-19
CN112528826B CN112528826B (en) 2024-02-02

Family

ID=74997805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011414617.XA Active CN112528826B (en) 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception

Country Status (1)

Country Link
CN (1) CN112528826B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109588114A (en) * 2018-12-20 2019-04-09 武汉科技大学 A kind of parallelism recognition picker system and method applied to fruit picking robot
CN110033487A (en) * 2019-02-25 2019-07-19 上海交通大学 Vegetables and fruits collecting method is blocked based on depth association perception algorithm
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant