CN113615398A - Fruit stem positioning and fruit picking method, device, robot and medium - Google Patents

Fruit stem positioning and fruit picking method, device, robot and medium

Info

Publication number
CN113615398A
CN113615398A (application No. CN202111179412.2A)
Authority
CN
China
Prior art keywords
fruit
robot
stem
image frame
current image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111179412.2A
Other languages
Chinese (zh)
Other versions
CN113615398B (en)
Inventor
蔡同彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yuejiang Technology Co Ltd
Original Assignee
Shenzhen Yuejiang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yuejiang Technology Co Ltd filed Critical Shenzhen Yuejiang Technology Co Ltd
Priority to CN202111179412.2A priority Critical patent/CN113615398B/en
Publication of CN113615398A publication Critical patent/CN113615398A/en
Application granted granted Critical
Publication of CN113615398B publication Critical patent/CN113615398B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01D HARVESTING; MOWING
    • A01D 46/00 Picking of fruits, vegetables, hops, or the like; Devices for shaking trees or shrubs
    • A01D 46/30 Robotic devices for individually picking crops
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 5/00 Manipulators mounted on wheels or on carriages
    • B25J 9/00 Programme-controlled manipulators
    • B25J 9/16 Programme controls
    • B25J 9/1602 Programme controls characterised by the control system, structure, architecture
    • B25J 9/161 Hardware, e.g. neural networks, fuzzy logic, interfaces, processor
    • B25J 9/1656 Programme controls characterised by programming, planning systems for manipulators
    • B25J 9/1661 Programme controls characterised by programming, planning systems for manipulators characterised by task planning, object-oriented languages
    • B25J 9/1679 Programme controls characterised by the tasks executed
    • B25J 9/1694 Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J 9/1697 Vision controlled systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/10024 Color image
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Environmental Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Fuzzy Systems (AREA)
  • Manipulator (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application relate to the technical field of robots and disclose a fruit stem positioning and fruit picking method, a fruit stem positioning and fruit picking device, a robot and a medium. The fruit stem positioning method comprises the following steps: acquiring a current image frame; detecting the position of each fruit cluster in the current image frame; inputting the position of each fruit cluster into a pre-trained fruit stem regression network, and determining the pixel coordinate of each fruit stem on the current image frame; calculating the depth information of each fruit stem; determining the camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem; and determining the robot coordinate of each fruit stem according to the camera coordinate of each fruit stem. By inputting the position of each fruit cluster into the pre-trained fruit stem regression network to determine the pixel coordinate of each fruit stem on the current image frame, and combining the depth information of each fruit stem to determine first the camera coordinate and then the robot coordinate of each fruit stem, the method improves the universality and accuracy of fruit stem positioning.

Description

Fruit stem positioning and fruit picking method, device, robot and medium
Technical Field
The application relates to the technical field of robots, in particular to a fruit stem positioning and fruit picking method and device based on deep learning, a robot and a medium.
Background
At present, as the demand for fruit picking increases, the demand for fruit picking robots also increases. However, some fruits have thin skins and tender flesh, for example grapes and cucumbers, and are easily dropped or damaged when the picking robot selects an unreasonable picking point during the picking operation.
At present, the procedure of picking fruits by a vision-based robot can be roughly divided into two parts:
(1) positioning picking points in the image;
(2) converting the picking points into 3D coordinates, transmitting the 3D coordinates to the robot, and making the robot move to the corresponding position to perform the corresponding picking action. Existing methods mainly locate the positions of the fruit picking points in the image by first locating the positions of the fruit clusters with a target detection algorithm or an image segmentation algorithm, and then locating the fruit stems, i.e. the positions of the picking points, with a traditional image processing method according to the relationship between the fruit clusters and the fruit stems. Since fruit clusters are at present located by deep learning with a high recognition rate, the main remaining difficulty is the positioning of the fruit stems.
The scheme of locating the fruit clusters by a target detection algorithm or an image segmentation method and then locating the fruit stems with a traditional image algorithm according to the structural characteristics of the fruit has the following defects: a fruit stem positioning method based on the structural characteristics of a particular fruit is not universal; and traditional image processing is easily affected by the environment, so that when the environment changes, the algorithm parameters usually have to be re-tuned to keep locating the fruit stems accurately.
In the process of implementing the present application, the applicant finds that the existing image algorithm positioning scheme has at least the following problems: the fruit stem positioning method is not high in universality and accuracy.
Disclosure of Invention
The embodiment of the application aims to provide a fruit stem positioning and fruit picking method, device, robot and medium based on deep learning, which solve the problems of low universality and low accuracy of the existing fruit stem positioning method and improve the accuracy of fruit stem positioning.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
in a first aspect, an embodiment of the present application provides a fruit stem positioning method based on deep learning, which is applied to a robot, and the method includes:
acquiring a current image frame;
detecting a position of each fruit string in the current image frame;
inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame;
calculating the depth information of each fruit stalk;
determining a camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk;
and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk.
In some embodiments, the robot includes a robot end tool and a depth camera, the method further comprising, prior to acquiring the current image frame:
calibrating the robot end tool;
and calibrating the relative positions of the robot end tool and the depth camera, and determining a calibration matrix.
In some embodiments, said detecting a location of each fruit string in the current image frame comprises:
and detecting the position of each fruit string in the current image frame based on a pre-trained target detection algorithm, and determining a target frame of each fruit string, wherein the target detection algorithm is a Yolov5 target detection algorithm.
In some embodiments, said inputting the position of each said fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinates of each said fruit stem on said current image frame includes:
and inputting the target frame of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinates of each fruit stem on the current image frame.
In some embodiments, said calculating depth information of each said fruit stem comprises:
acquiring a depth information estimation area from a target frame of each fruit string;
calculating an average value of depth values of the depth information estimation area;
and determining the depth information of each fruit stem by combining the average value of the depth values of the depth information estimation area and the offset value.
In some embodiments, the fruit stalks comprise grape stalks, the fruit clusters comprise grape clusters, and the fruit stalk regression network comprises a grape stalk regression network.
In a second aspect, the present application provides a fruit picking method based on deep learning, which is applied to a robot, the robot includes a walking mechanism, a mechanical arm and a picking box, the mechanical arm is provided with a terminal tool, the terminal tool includes an electrically controlled scissors, the method includes:
controlling the robot to automatically inspect the orchard;
acquiring a current image frame, and positioning the robot coordinates of each fruit stalk in the current image frame;
controlling the walking mechanism to move according to the robot coordinate of each fruit stalk, so that the robot moves to a picking position, and controlling the electric control scissors to cut the stalks of the fruit stalks;
and controlling the mechanical arm to put the fruit clusters with the stalks cut into the picking box.
In some embodiments, said controlling said electrically controlled shears to shear said fruit stem comprises:
after the robot moves to the picking position, controlling the mechanical arm to move to the position of the robot coordinate of the fruit stem;
and controlling the electric control scissors to cut the fruit stalks.
In some embodiments, the method further comprises:
determining the maturity of the fruit string in the current image frame according to the current image frame;
and determining whether to cut the fruit stalks or not according to the maturity of the fruit clusters.
In some embodiments, the walking mechanism comprises a mobile base, the robot further comprises a platform mounted to the mobile base, and the robotic arm is mounted to the platform, wherein the robotic arm is provided with a depth camera for acquiring image frames.
In a third aspect, an embodiment of the present application provides a fruit stem positioning device based on deep learning, which is applied to a robot, and the device includes:
the image frame acquisition unit is used for acquiring a current image frame;
a fruit string position detection unit for detecting the position of each fruit string in the current image frame;
a fruit stem pixel coordinate unit, configured to input the position of each fruit cluster to a pre-trained fruit stem regression network, and determine a pixel coordinate of each fruit stem on the current image frame;
the fruit stem depth information unit is used for calculating the depth information of each fruit stem;
the fruit stem camera coordinate unit is used for determining the camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem;
and the fruit stem robot coordinate unit is used for determining the robot coordinate of each fruit stem according to the camera coordinate of each fruit stem.
In a fourth aspect, an embodiment of the present application provides a robot controller, including:
at least one processor; and the number of the first and second groups,
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a deep learning based fruit stem locating method according to the first aspect or a deep learning based fruit picking method according to the second aspect.
In a fifth aspect, an embodiment of the present application provides a robot, including:
a robot controller according to the fourth aspect;
the robot further includes: the robot comprises a robot body and a picking box;
wherein, the robot body includes:
a mobile base for moving the robot;
the platform is arranged on the movable base and is used for bearing the picking box;
the mechanical arm comprises a tail end tool, and the tail end tool comprises an electric control scissors and is used for shearing the fruit stalks;
the picking box is arranged on the platform and used for placing the fruit clusters after the stalks are cut.
In some embodiments, the robotic arm is provided with a depth camera for acquiring image frames, wherein the depth camera comprises a binocular camera.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the deep learning based fruit stem positioning method according to the first aspect or the deep learning based fruit picking method according to the second aspect.
The beneficial effects of the embodiments of the application are that, in contrast to the prior art, the fruit stem positioning method based on deep learning provided by the embodiments of the present application is applied to a robot and includes: acquiring a current image frame; detecting a position of each fruit cluster in the current image frame; inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame; calculating the depth information of each fruit stem; determining a camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem; and determining the robot coordinate of each fruit stem according to the camera coordinate of each fruit stem. By inputting the position of each fruit cluster into the pre-trained fruit stem regression network to determine the pixel coordinate of each fruit stem on the current image frame, and combining the depth information of each fruit stem to determine the camera coordinate and then the robot coordinate of each fruit stem, the universality and accuracy of fruit stem positioning are improved.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, which correspond to the figures in which like reference numerals refer to similar elements and which are not to scale unless otherwise specified.
Fig. 1 is a schematic view of a robot provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a fruit stem positioning method based on deep learning according to an embodiment of the present application;
fig. 3 is an overall flowchart of a fruit stem positioning method based on deep learning according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a target box corresponding to a grape bunch according to an embodiment of the present disclosure;
FIG. 5 is a schematic view of grape stalks corresponding to a bunch of grapes provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a grape bunch detection algorithm provided by an embodiment of the present application;
FIG. 7 is a detailed flowchart of step S307 in FIG. 3;
FIG. 8 is a schematic diagram of a depth information estimation region of a grape stem provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a grape stem regression network locating grape stems provided by an embodiment of the present application;
fig. 10 is a schematic flow chart of a deep learning-based fruit picking method according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a grape stalk locating device based on deep learning provided by an embodiment of the present application;
fig. 12 is a schematic structural diagram of a robot controller according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the technical features mentioned in the embodiments of the present application described below may be combined with each other as long as they do not conflict with each other.
Referring to fig. 1, fig. 1 is a schematic view of a robot according to an embodiment of the present disclosure;
as shown in fig. 1, the robot 100 includes a robot body 10 and a picking box 20, wherein the picking box 20 is fixedly mounted on the robot body 10.
Wherein, this robot body 10 includes: a mobile base 11, a platform 12 mounted on the mobile base, and a robot arm 13.
In particular, the mobile base 11 is used to move said robot, for example: the mobile base 11 includes four steering wheels, each of which can be controlled by the robot controller to rotate, for controlling the moving direction of the robot.
Specifically, the platform 12 is mounted on the mobile base 11 and is used to carry the picking box 20; the platform 12 also carries the mechanical arm 13. For example: the mechanical arm 13 is fixedly mounted on the platform 12, and the picking box 20 is detachably and fixedly mounted on the platform 12.
Specifically, the robot arm 13 is provided with a depth camera 131, and the depth camera 131 is mounted at the distal end of the robot arm and moves along with the movement of the robot arm to acquire image frames and can acquire image depth information.
Specifically, the tail end of the mechanical arm 13 is further provided with an end tool. The end tool includes electrically controlled scissors, which are used for picking grapes, that is, for cutting the stems; specifically, the fruit cluster corresponding to a stem is picked according to the determined robot coordinate of that stem. The electrically controlled scissors further include clamping jaws, for example two clamping jaws, for clamping the fruit cluster corresponding to the fruit stem.
Wherein the picking box 20 is mounted to the platform 12, for example: removably mounted to said platform 12, for placing the fruit bunch after being cut, for example: the grape bunch.
Referring to fig. 2 again, fig. 2 is a schematic flow chart of a fruit stem positioning method based on deep learning according to an embodiment of the present application;
it can be understood that the fruit stem positioning method based on deep learning in the embodiment of the present application is applicable to fruits including fruit stems and fruit clusters, for example: grape, cucumber, grape, rambutan, goldthread fruit, cape gooseberry, custard apple, fig, etc., without limitation.
As shown in fig. 2, the fruit stem positioning method based on deep learning includes:
step S201: acquiring a current image frame;
specifically, a depth camera set by the robot acquires a current image frame, where the current image frame includes information of at least one fruit string, such as: the bunch of grapes includes a bunch of grapes, and the current image frame includes information of at least one bunch of grapes.
Step S202: detecting a position of each fruit string in the current image frame;
specifically, the detecting the position of each fruit string in the current image frame includes:
and detecting the position of each fruit string in the current image frame based on a pre-trained target detection algorithm, and determining a target frame of each fruit string, wherein the target detection algorithm is a Yolov5 target detection algorithm.
Step S203: inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame;
specifically, the target frame of each fruit cluster is input into a fruit stem regression network trained in advance, and the pixel coordinates of each fruit stem on the current image frame are determined.
Step S204: calculating the depth information of each fruit stalk;
specifically, the calculating the depth information of each fruit stalk includes:
acquiring a depth information estimation area from a target frame of each fruit string;
calculating an average value of depth values of the depth information estimation area;
and determining the depth information of each fruit stem by combining the average depth value of the depth information estimation area with a preset offset value.
Step S205: determining a camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk;
specifically, the determining the camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk includes:
suppose the depth information of the fruit stem is $d$, the pixel coordinate of the fruit stem on the image is $(u, v)$, and the camera coordinate is $(x_c, y_c, z_c)$; then
$$[x_c, y_c, z_c]^T = d \cdot K^{-1} [u, v, 1]^T$$
where $K$ is the internal reference (intrinsic) matrix of the depth camera.
Step S206: and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk.
Specifically, the determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk includes:
assuming the robot coordinate of the fruit stem is $(x_r, y_r, z_r)$; then
$$[x_r, y_r, z_r, 1]^T = T \cdot [x_c, y_c, z_c, 1]^T$$
where $T$ is the calibration matrix.
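The two transformations above are the standard pinhole back-projection followed by a rigid-body transform. The following is only an illustrative sketch in Python/NumPy (the patent discloses no code); the numeric intrinsic matrix K and the calibration matrix T_cam2base are placeholder assumptions.

```python
import numpy as np

def pixel_to_camera(u, v, depth, K):
    """Back-project pixel (u, v) with depth d into camera coordinates
    using the depth camera's 3x3 intrinsic matrix K."""
    pixel_h = np.array([u, v, 1.0])
    return depth * np.linalg.inv(K) @ pixel_h          # (x_c, y_c, z_c)

def camera_to_robot(p_cam, T_cam2base):
    """Transform a camera-frame point into the robot (base) frame using the
    4x4 calibration (extrinsic) matrix T_cam2base = [R t; 0 1]."""
    p_h = np.append(p_cam, 1.0)
    return (T_cam2base @ p_h)[:3]                      # (x_r, y_r, z_r)

# Example with assumed placeholder values:
K = np.array([[615.0, 0.0, 320.0],
              [0.0, 615.0, 240.0],
              [0.0,   0.0,   1.0]])
T_cam2base = np.eye(4)                                 # identity used only as a placeholder
p_cam = pixel_to_camera(412, 156, 0.83, K)
p_robot = camera_to_robot(p_cam, T_cam2base)
```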
In an embodiment of the application, the robot comprises a robot end tool and a depth camera, the method further comprising, before acquiring the current image frame:
calibrating the robot end tool;
and calibrating the relative positions of the robot end tool and the depth camera, and determining a calibration matrix.
Specifically, please refer to fig. 3 again, fig. 3 is a schematic overall flow chart of a fruit stem positioning method based on deep learning according to an embodiment of the present application;
as shown in fig. 3, the overall process of the fruit stem positioning method based on deep learning includes:
step S301: calibrating a robot end tool;
specifically, a Tool coordinate system is established on the robot Tool through TCP (Tool Center Point) Tool coordinate system calibration, an origin of the Tool coordinate system is a Tool Center Point (TCP), and the calibration process includes the following steps:
(1) placing a fixed point in the working space of the robot;
(2) the TCP is overlapped with a fixed point in the space by controlling the posture of the robot;
(3) repeating the step for 3 times, and changing the posture of the robot to move the TCP to the same point;
(4) and establishing an equation set and solving on the condition that coordinates of the four TCP points in a world coordinate system are equal, so that the position of the tool coordinate system is calibrated.
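As an illustrative sketch of how the four-point constraint can be solved numerically (the patent does not give an implementation), assume the four recorded flange poses are available as 4x4 base-to-flange matrices; the pairwise difference equations are then solved in least squares:

```python
import numpy as np

def calibrate_tcp(flange_poses):
    """Four-point TCP calibration sketch.
    flange_poses: list of 4x4 base->flange transforms recorded while the
    (unknown) tool tip touches the same fixed point in the workspace.
    Since R_i p + t_i = R_j p + t_j for every pair of poses, solve
    (R_i - R_j) p = t_j - t_i in least squares for the tool offset p
    expressed in the flange frame."""
    A, b = [], []
    for i in range(len(flange_poses)):
        for j in range(i + 1, len(flange_poses)):
            Ri, ti = flange_poses[i][:3, :3], flange_poses[i][:3, 3]
            Rj, tj = flange_poses[j][:3, :3], flange_poses[j][:3, 3]
            A.append(Ri - Rj)
            b.append(tj - ti)
    A = np.vstack(A)
    b = np.hstack(b)
    p_tool, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p_tool                     # TCP offset relative to the flange
```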
Step S302: calibrating the relative positions of the robot end tool and the depth camera, and determining a calibration matrix;
specifically, the calibration matrix is a hand-eye calibration matrix, the hand-eye calibration matrix is obtained by moving the tail end of the mechanical arm for multiple times, so that the depth camera collects multiple times of calibration plate information, wherein the mechanical arm base and the calibration plate are fixed, a conversion matrix of a mechanical arm base coordinate system and a mechanical arm tail end coordinate system is known, coordinates of the calibration plate under the camera coordinate system are known, and the conversion matrix of the mechanical arm tail end coordinate system and the camera coordinate system is obtained by solving, so that the conversion matrix of the camera coordinate system and the mechanical arm base coordinate system, namely the calibration matrix, is obtained.
Step S303: acquiring image data, and training a target detection network and a fruit stem regression network;
specifically, the target detection network is a YOLOv5 target detection network, where the YOLOv5 target detection network inputs image frames and outputs target frames of all fruit strings in the image frames, for example: the fruit stem regression network is a grape stem regression network, the YOLOv5 target detection network inputs image frames containing grape bunch information and outputs target frames of all grape bunches in the image frames.
Referring to fig. 4 again, fig. 4 is a schematic view of a target frame corresponding to a grape bunch according to an embodiment of the present application;
as shown in fig. 4, the YOLOv5 target detection network was trained as a true value by artificially labeling the target box corresponding to each grape string in the image frame.
Specifically, the input of the grape stem regression network is a target frame corresponding to the grape bunch, that is, an image corresponding to the target frame, the image corresponding to the target frame is a cut-out image on the original image, and the output is a pixel coordinate of the grape stem corresponding to the grape bunch, that is, a pixel coordinate of the picking point.
It can be understood that, in order to train the YOLOv5 target detection network better, a large number of images are required, and therefore, the embodiment of the present application also increases the number of images by preprocessing the collected images. Specifically, the training of the target detection network includes the following steps:
(1) preprocessing an original image to obtain a preprocessed image;
specifically, the original images including the grape bunch are collected from the internet, surveillance videos and other channels, or synthesized manually, and the resolution of the original images is adjusted, for example: the resolution of the original image is adjusted to 720 x 540. The method comprises the steps of collecting the grape bunch from the internet, monitoring videos and other channels, or artificially synthesizing original images containing the grape bunch, and processing the collected original images to increase the sample size. Among them, since there are many parameters of the YOLOv5 target detection network, training on relatively few images is easy to overfit. To reduce the risk of over-fitting, the processing of the collected raw images to increase the sample size by increasing the number of images comprises: performing operations such as scaling and turning on the collected image, for example: randomly cutting a plurality of small images in one image, and randomly turning horizontally, wherein the randomly cutting means turning the image up and down. And the images are amplified through random cropping and random horizontal turnover, so that the number of the images is increased, and the overfitting risk is reduced.
(2) Manufacturing a training sample;
specifically, the training sample preparation includes: and manually labeling the preprocessed image, labeling rectangular frames of all grape strings contained in the preprocessed image, and labeling the rectangular frames to identify the grape strings. And manually marking out a circumscribed rectangular frame of the grape bunch in the image so as to obtain a training sample.
(3) Training a target detection network;
specifically, the training target detection network includes: and training the target detection network based on the training sample to obtain the trained weight so as to obtain the trained target detection network.
Referring to fig. 5, fig. 5 is a schematic view of a grape stem corresponding to a grape bunch according to an embodiment of the present application;
as shown in fig. 5, the positions of the grape stalks are artificially marked on the image corresponding to the target frame, and the positions are used as true values to train a grape stalk regression network.
In the embodiment of the present application, the training process of the grape stem regression network is similar to that of the target detection network; reference may be made to the training process of the target detection network, which is not described herein again.
Step S304: acquiring a current image frame;
specifically, a depth camera set by the robot acquires a current image frame, wherein the current image frame comprises information of at least one grape bunch.
Step S305: detecting the position of each grape bunch in the current image frame, and determining a target frame of each grape bunch;
specifically, the detecting the position of each grape bunch in the current image frame includes:
detecting the position of each grape bunch in the current image frame based on a pre-trained target detection algorithm, and determining a target frame of each grape bunch, wherein the target detection algorithm is a Yolov5 target detection algorithm.
It can be understood that the position of a grape bunch in the current image frame is given by two coordinate points that determine a minimum bounding rectangle; the two points are the coordinates of the upper-left corner and the lower-right corner of the rectangle, and the position of the rectangle, and hence of the grape bunch, is determined by them.
Referring to fig. 6, fig. 6 is a schematic diagram illustrating a grape bunch detected by the target detection algorithm according to the embodiment of the present application;
as shown in fig. 6, each grape bunch in the current image frame is detected by the target detection algorithm, and the position of each grape bunch is given by its target frame. Specifically, the detection yields the center point coordinates, width and height of the target frame, together with a class (category) probability in the interval [0, 1].
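For illustration, grape-bunch detection with a custom-trained YOLOv5 model can be run through torch.hub as sketched below; the weights file name and confidence threshold are assumptions, and the returned fields mirror the corner points, centre/width/height and class probability described above.

```python
import torch

# Load a YOLOv5 model trained on grape-bunch images (weights path is assumed).
model = torch.hub.load("ultralytics/yolov5", "custom", path="grape_bunch_best.pt")

def detect_bunches(frame_bgr, conf_thres=0.5):
    """Run grape-bunch detection on one image frame and return, for each
    detection, the two corner points, the centre/width/height and the class
    probability."""
    model.conf = conf_thres
    results = model(frame_bgr[..., ::-1])        # OpenCV BGR -> RGB
    boxes = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        boxes.append({
            "top_left": (x1, y1),
            "bottom_right": (x2, y2),
            "center": ((x1 + x2) / 2, (y1 + y2) / 2),
            "width": x2 - x1,
            "height": y2 - y1,
            "class_prob": conf,                  # in [0, 1]
        })
    return boxes
```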
Step S306: obtaining a target frame of a grape bunch;
specifically, one of the target frames detected by the target detection algorithm is selected, for example: the target frame is sequentially selected from the current image frame in the order of left to right, top to bottom of the current image frame.
Step S307: calculating the depth information of the grape stalks corresponding to the grape bunch;
specifically, please refer to fig. 7 again, fig. 7 is a schematic diagram of a detailed flow of step S307 in fig. 3;
as shown in fig. 7, the step S307: the calculating of the depth information of the fruit stalks corresponding to the fruit strings includes:
step S3071: acquiring a depth information estimation area from a target frame of each fruit string;
specifically, the depth information estimation region is a central region in a target frame, and the central region in the target frame is determined as the depth information estimation region by a preset rule, where the preset rule is used to determine the size and the position of the central region, for example: and determining the position of the central point and the width and the height of the depth information estimation area.
Step S3072: calculating an average value of depth values of the depth information estimation area;
specifically, the depth value is a z value in a camera coordinate system, and an average value of a plurality of depth values is obtained by determining a plurality of depth values in the depth information estimation area and averaging the plurality of depth values. It is to be understood that the depth information estimation region includes a plurality of pixel points, and the plurality of depth values are determined by determining a depth value corresponding to each pixel point.
Step S3073: and determining the depth information of each fruit stem by combining the average value of the depth values of the depth information estimation area and the offset value.
Specifically, the depth information of each fruit stem = an average value of depth values of the depth information estimation area + an offset value, where the offset value may be preset or obtained through calculation, for example: calculating the offset value according to the width of the target frame of the grape bunch corresponding to the fruit stalk, wherein the offset value is positively correlated with the width of the target frame of the grape bunch corresponding to the fruit stalk, such as: the offset value = width of a target frame of the grape bunch corresponding to the grape stalk × (proportional coefficient), wherein a value range of the proportional coefficient is (0, 1), and preferably, a value range of the proportional coefficient is (0.2, 0.5).
It can be understood that if the fruit stalks are grape stalks, the depth information of the grape stalks may not be detected by the depth camera because the grape stalks are finer, and therefore, the depth information of the grape stalks is estimated by adding a certain offset value to the depth information of the middle part of the grape bunch.
Referring to fig. 8, fig. 8 is a schematic diagram of a depth information estimation region of a grape stalk according to an embodiment of the present application;
as shown in fig. 8, a small frame region, i.e., a depth information estimation region, is arranged inside the target frame corresponding to each grape bunch, all depth information in the depth information estimation region is taken, then an average value is taken, and finally the average value + an offset value is taken as the depth information of the grape stalk.
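A sketch of the stem-depth estimate described above, assuming the depth map is aligned with the colour image; the central-region ratio and the proportional coefficient of the offset are illustrative values within the ranges mentioned in the text.

```python
import numpy as np

def estimate_stem_depth(depth_map, box, region_ratio=0.3, offset_coeff=0.3):
    """Take a small central region of the bunch's target box, average its
    depth values, then add an offset proportional to the box width (per the
    text, the offset is positively correlated with the box width)."""
    x1, y1, x2, y2 = [int(v) for v in box]
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w // 2, y1 + h // 2
    rw, rh = max(1, int(w * region_ratio / 2)), max(1, int(h * region_ratio / 2))
    region = depth_map[cy - rh:cy + rh, cx - rw:cx + rw]
    valid = region[region > 0]                  # ignore missing depth readings
    mean_depth = float(valid.mean()) if valid.size else 0.0
    offset = offset_coeff * w                   # units follow the depth map convention
    return mean_depth + offset
```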
Step S308: calculating pixel coordinates of fruit stalks corresponding to the fruit strings;
specifically, the position of each fruit cluster is input into a fruit stem regression network trained in advance, and the pixel coordinates of each fruit stem on the current image frame are determined, for example: inputting the position of each grape bunch into a pre-trained grape stem regression network, and determining the pixel coordinates of each grape stem on the current image frame, wherein the method specifically comprises the following steps: and inputting the target frame of each grape bunch into a pre-trained grape stem regression network, and determining the pixel coordinates of each grape stem on the current image frame.
Referring to fig. 9, fig. 9 is a schematic diagram of positioning grape stalks by a grape stalk regression network according to an embodiment of the present application;
as shown in fig. 9, the position of the grape stem corresponding to the grape bunch is located by inputting the target box to the grape stem regression network.
It can be understood that, to avoid the grape stem falling outside the target frame of the grape bunch, the present application further expands the target frame of the grape bunch so that the grape stem is included in it, for example by expanding the area of the target frame outward with an expansion coefficient greater than 1.
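The patent does not disclose the architecture of the stem regression network, so the following is only a hypothetical stand-in showing the interface (an expanded, cropped target frame in; a stem pixel coordinate out); the expansion coefficient of 1.2 is likewise an assumption.

```python
import torch
import torch.nn as nn

def expand_box(box, img_w, img_h, coeff=1.2):
    """Expand a bunch target box outward (coefficient > 1) so the stem above
    the bunch stays inside the crop, clipped to the image border."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    hw, hh = (x2 - x1) / 2 * coeff, (y2 - y1) / 2 * coeff
    return (max(0, cx - hw), max(0, cy - hh),
            min(img_w, cx + hw), min(img_h, cy + hh))

class StemRegressionNet(nn.Module):
    """Toy stand-in for the stem regression network: a small CNN mapping a
    resized bunch crop to the (u, v) stem position, normalised to [0, 1]
    within the crop (the real architecture is not disclosed)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64, 2), nn.Sigmoid())

    def forward(self, crop):                     # crop: (N, 3, 128, 128)
        return self.head(self.features(crop))   # (N, 2) normalised (u, v)
```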
Step S309: calculating camera coordinates of the fruit stalks;
specifically, determining the camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk specifically includes:
suppose the depth information of the fruit stem is $d$, the pixel coordinate of the fruit stem on the image is $(u, v)$, and the camera coordinate is $(x_c, y_c, z_c)$; then
$$[x_c, y_c, z_c]^T = d \cdot K^{-1} [u, v, 1]^T$$
where $K$ is the internal reference (intrinsic) matrix of the depth camera.
Step S310: calculating the robot coordinates of the fruit stalks;
specifically, determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk specifically comprises:
assuming the robot coordinate of the fruit stem is $(x_r, y_r, z_r)$; then
$$[x_r, y_r, z_r, 1]^T = T \cdot [x_c, y_c, z_c, 1]^T$$
where $T$ is the calibration matrix, i.e. the external parameter (extrinsic) matrix, composed of a rotation matrix $R$ and a translation vector $t$:
$$T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}$$
Step S311: judging whether all the target frames are traversed or not;
specifically, after the robot coordinates of the fruit stalks corresponding to the current target frame are calculated, the next target frame in the current image frame is selected, the robot coordinates of the fruit stalks corresponding to the next target frame are calculated, and the process is repeated until all the target frames in the current image frame are traversed.
End.
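Tying the steps of fig. 3 together, a per-frame loop might look like the sketch below. It reuses the illustrative helpers from the earlier sketches (detect_bunches, estimate_stem_depth, expand_box, pixel_to_camera, camera_to_robot); regress_stem_pixel stands in for a forward pass of the stem regression network and is hypothetical.

```python
def locate_stems_in_frame(rgb, depth_map, K, T_cam2base, stem_net):
    """Illustrative composition of steps S305-S311: for every detected bunch
    in the current frame, return the robot coordinate of its stem."""
    h, w = rgb.shape[:2]
    robot_points = []
    for det in detect_bunches(rgb):                              # S305 / S306
        box = (*det["top_left"], *det["bottom_right"])
        depth = estimate_stem_depth(depth_map, box)              # S307
        crop_box = expand_box(box, w, h)                         # keep the stem inside
        u, v = regress_stem_pixel(rgb, crop_box, stem_net)       # S308 (hypothetical helper)
        p_cam = pixel_to_camera(u, v, depth, K)                  # S309
        robot_points.append(camera_to_robot(p_cam, T_cam2base))  # S310
    return robot_points                                          # loop over all boxes = S311
```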
in the embodiment of the application, a fruit stem positioning method based on deep learning is provided and applied to a robot, and the method comprises the following steps: acquiring a current image frame; detecting a position of each fruit string in the current image frame; inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame; calculating the depth information of each fruit stalk; determining a camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk; and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk. The position of each fruit cluster is input into a pre-trained fruit stem regression network, the pixel coordinate of each fruit stem on the current image frame is determined, the camera coordinate of each fruit stem is determined by combining the depth information of each fruit stem, and then the robot coordinate of each fruit stem is determined.
Referring to fig. 10, fig. 10 is a schematic flow chart of a deep learning-based fruit picking method according to an embodiment of the present application;
the fruit picking method based on deep learning is applied to a robot, the robot comprises a walking mechanism, a mechanical arm and a picking box, the mechanical arm is provided with a tail end tool, and the tail end tool comprises electric control scissors.
The robot comprises a walking mechanism, a robot arm and a platform, wherein the walking mechanism comprises a movable base, the robot further comprises a platform arranged on the movable base, the robot arm is arranged on the platform, and the robot arm is provided with a depth camera and used for acquiring image frames.
As shown in fig. 10, the fruit picking method based on deep learning includes:
step S101: controlling the robot to automatically inspect the orchard;
specifically, the robot comprises a walking mechanism and sends a walking instruction to the robot so as to control the walking mechanism of the robot to automatically patrol in the orchard according to a preset route.
Step S102: acquiring a current image frame, and positioning the robot coordinates of each fruit stalk in the current image frame;
specifically, after the current image frame is obtained in real time, the positioning of the robot coordinate of each fruit stalk in the current image frame includes:
acquiring a current image frame;
detecting a position of each fruit string in the current image frame;
inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame;
calculating the depth information of each fruit stalk;
determining a camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk;
and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk.
It can be understood that, for the step of locating the robot coordinates of the fruit stalks, reference may be made to the relevant contents of the fruit stalk locating method based on deep learning mentioned in the above embodiments, and details are not described herein again.
Step S103: controlling the walking mechanism to move according to the robot coordinate of each fruit stalk, so that the robot moves to a picking position, and controlling the electric control scissors to cut the stalks of the fruit stalks;
specifically, after the robot coordinates of any fruit stem are obtained through calculation, the walking mechanism is controlled to move, so that the robot moves to a picking position, where the picking position is a suitable position where the mechanical arm of the robot can move to the robot coordinates of the fruit stem, for example: the front of the fruit bunch corresponding to the fruit stem enables the robot to conveniently pick the fruit bunch.
Specifically, the controlling the electric control scissors to cut the fruit stalks comprises:
after the robot moves to the picking position, controlling the mechanical arm to move to the position of the robot coordinate of the fruit stem; and controlling the electric control scissors to cut the fruit stalks.
For example: the fruit stalks are grape stalks, and the electric control scissors are controlled to cut the stalks of the grapes according to the robot coordinates of the grape stalks;
specifically, according to the robot coordinates of the grape stalks, the robot is controlled to move to the front of the grape stalks, the mechanical arms of the robot are controlled to move to the front of the grape stalks, the mechanical arms are further controlled to move to the positions of the robot coordinates of the grape stalks, and the grape stalks are cut by controlling the electric control scissors of the mechanical arms.
Step S104: and controlling the mechanical arm to put the fruit clusters with the stalks cut into the picking box.
Specifically, after the stalks are cut, the clamping jaws of the electric control scissors of the mechanical arm are controlled to clamp the fruit strings after the stalks are cut, and the mechanical arm is controlled to put the fruit strings after the stalks are cut into the picking box, for example: after the grape stalks are cut, the clamping jaws of the electric control scissors of the mechanical arm are controlled to clamp the grape bunch subjected to stalk cutting, and the mechanical arm is controlled to put the grape bunch subjected to stalk cutting into the picking box.
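As a purely hypothetical orchestration sketch (the robot, arm, scissors and camera interfaces below are invented placeholders, not a real API), the picking flow of steps S101 to S104 could be organised as follows:

```python
def picking_loop(robot, arm, scissors, camera, stem_locator):
    """Hypothetical orchestration of steps S101-S104; every interface here is
    a placeholder used only to show the control flow."""
    robot.start_patrol()                                  # S101: patrol the orchard
    while robot.patrolling():
        rgb, depth = camera.capture()                     # S102: current image frame
        for p_robot in stem_locator(rgb, depth):          # S102: robot coords of stems
            robot.move_to_picking_position(p_robot)       # S103: approach the bunch
            arm.move_to(p_robot)
            scissors.close_jaws()                         # hold the bunch
            scissors.cut()                                # S103: cut the stem
            arm.place_in_picking_box()                    # S104: drop into the box
```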
In an embodiment of the present application, the method further includes:
determining the maturity of the fruit string in the current image frame according to the current image frame;
and determining whether to cut the fruit stalks or not according to the maturity of the fruit clusters.
Specifically, according to the current image frame, the maturity of each fruit string in the current image frame is calculated, for example: the method comprises the steps of obtaining color space RGB numerical values of regions corresponding to fruit strings in a current image frame, determining the maturity of each fruit string based on a preset RGB numerical value and maturity corresponding relation table, and determining whether the fruit strings are mature or not based on a preset maturity threshold. If the maturity of a certain fruit string is greater than or equal to a preset maturity threshold, determining that the fruit string is mature, and at the moment, determining to cut the stalks corresponding to the fruit string; if the maturity of a certain fruit string is smaller than a preset maturity threshold, determining that the fruit string is immature, and at the moment, determining that the stalk cutting is not performed on the stalk corresponding to the fruit string.
It can be understood that the maturity threshold of each kind of fruit is different; therefore, the present application establishes, for each kind of fruit, a correspondence table between preset RGB values and maturity, so as to accurately determine whether a fruit cluster of that fruit is mature and hence whether to cut its stem.
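An illustrative maturity check along the lines described above; the RGB-to-maturity table, the nearest-entry lookup rule and the threshold value are assumptions for illustration only.

```python
import numpy as np

def bunch_maturity(rgb_frame, box, maturity_table, threshold=0.8):
    """Average the RGB values inside the bunch's target box, look the result
    up in a per-fruit RGB-to-maturity table (list of (rgb, maturity) pairs),
    and compare against a preset maturity threshold."""
    x1, y1, x2, y2 = [int(v) for v in box]
    mean_rgb = rgb_frame[y1:y2, x1:x2].reshape(-1, 3).mean(axis=0)
    # pick the table entry whose RGB value is closest to the measured mean
    maturity = min(maturity_table,
                   key=lambda e: np.linalg.norm(np.array(e[0]) - mean_rgb))[1]
    return maturity, maturity >= threshold      # (maturity, should_cut)
```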
By replacing the traditional image-processing-based fruit stem positioning scheme with deep learning, the present application solves the problems that fruit stem positioning methods based on traditional image processing are not universal and are easily affected by the environment; and by combining this scheme with the robot, a complete pipeline of picking-point visual positioning and robotic grape picking is formed, which is practical and can pick fruit better.
In the embodiment of the application, the fruit picking method based on deep learning is applied to a robot; the robot comprises a walking mechanism, a mechanical arm and a picking box, the mechanical arm is provided with an end tool, and the end tool comprises electrically controlled scissors. The method comprises: controlling the robot to automatically inspect the orchard; acquiring a current image frame and positioning the robot coordinate of each fruit stem in the current image frame; controlling the walking mechanism to move according to the robot coordinate of each fruit stem so that the robot moves to a picking position, and controlling the electrically controlled scissors to cut the fruit stem; and controlling the mechanical arm to put the fruit clusters whose stems have been cut into the picking box. By locating the robot coordinates of the fruit stems, controlling the electrically controlled scissors to cut the stems, and putting the cut fruit clusters into the picking box, the application can improve the picking efficiency of grapes.
Referring to fig. 11, fig. 11 is a schematic view of a fruit stem positioning device based on deep learning according to an embodiment of the present disclosure; the fruit stalk positioning device based on deep learning can be applied to robots, such as: a picking robot.
As shown in fig. 11, the fruit stem positioning device 110 based on deep learning includes:
an image frame acquisition unit 1101 for acquiring a current image frame;
a fruit cluster position detecting unit 1102 for detecting the position of each fruit cluster in the current image frame;
a fruit stem pixel coordinate unit 1103, configured to input the position of each fruit cluster to a pre-trained fruit stem regression network, and determine a pixel coordinate of each fruit stem on the current image frame;
a fruit stem depth information unit 1104 for calculating depth information of each of the fruit stems;
a fruit stem camera coordinate unit 1105, configured to determine a camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem;
a fruit stem robot coordinate unit 1106, configured to determine a robot coordinate of each fruit stem according to the camera coordinate of each fruit stem.
In an embodiment of the application, the robot comprises a robot end tool and a depth camera, the apparatus further comprises:
an end tool calibration unit (not shown) for calibrating the robot end tool;
a calibration matrix unit (not shown) for calibrating the relative positions of the robot end-tool and the depth camera, determining a calibration matrix.
In this embodiment of the present application, the fruit string position detecting unit 1102 is specifically configured to:
and detecting the position of each fruit string in the current image frame based on a pre-trained target detection algorithm, and determining a target frame of each fruit string, wherein the target detection algorithm is a Yolov5 target detection algorithm.
In this embodiment of the present application, the fruit stem pixel coordinate unit 1103 is specifically configured to:
and inputting the target frame of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinates of each fruit stem on the current image frame.
In this embodiment of the present application, the fruit stem depth information unit 1104 is specifically configured to:
acquiring a depth information estimation area from a target frame of each fruit string;
calculating an average value of depth values of the depth information estimation area;
and determining the depth information of each fruit stem according to the average value of the depth values of the depth information estimation area and a preset offset value.
In this embodiment of the present application, the fruit stem camera coordinate unit 1105 is specifically configured to:
suppose the depth information of the fruit stem is $d$, the pixel coordinate of the fruit stem on the image is $(u, v)$, and the camera coordinate is $(x_c, y_c, z_c)$; then
$$[x_c, y_c, z_c]^T = d \cdot K^{-1} [u, v, 1]^T$$
where $K$ is the internal reference (intrinsic) matrix of the depth camera.
In this embodiment of the application, the fruit stalk robot coordinate unit 1106 is specifically configured to:
assuming the robot coordinate of the fruit stem is $(x_r, y_r, z_r)$; then
$$[x_r, y_r, z_r, 1]^T = T \cdot [x_c, y_c, z_c, 1]^T$$
where $T$ is the calibration matrix.
Since the device embodiment and the method embodiment are based on the same concept, on the premise that the contents do not conflict with each other, the contents of the device embodiment may refer to the method embodiment, please refer to the above embodiment of the fruit stem positioning method based on deep learning, which is not described herein again.
In the embodiment of the application, by providing a fruit stalk positioner based on deep learning, be applied to the robot, the device includes: the image frame acquisition unit is used for acquiring a current image frame; a fruit string position detection unit for detecting the position of each fruit string in the current image frame; a fruit stem pixel coordinate unit, configured to input the position of each fruit cluster to a pre-trained fruit stem regression network, and determine a pixel coordinate of each fruit stem on the current image frame; the fruit stem depth information unit is used for calculating the depth information of each fruit stem; the fruit stem camera coordinate unit is used for determining the camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem; and the fruit stem robot coordinate unit is used for determining the robot coordinate of each fruit stem according to the camera coordinate of each fruit stem. The position of each fruit cluster is input into a pre-trained fruit stem regression network, the pixel coordinate of each fruit stem on the current image frame is determined, the camera coordinate of each fruit stem is determined by combining the depth information of each fruit stem, and then the robot coordinate of each fruit stem is determined.
Referring to fig. 12, fig. 12 is a schematic structural diagram of a robot controller according to an embodiment of the present application. The robot controller is applied to a robot, for example, a picking robot.
In the embodiment of the present application, the robot controller includes, but is not limited to, electronic devices such as a television, a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted terminal, a wearable device, and a pedometer.
As shown in fig. 12, the robot controller 120 includes one or more processors 1201 and a memory 1202. Fig. 12 illustrates an example of one processor 1201.
The processor 1201 and the memory 1202 may be connected by a bus or other means, and fig. 12 illustrates an example of a connection by a bus.
A processor 1201 for obtaining a current image frame; detecting the position of each fruit string in the current image frame; inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame; calculating the depth information of each fruit stalk; determining the camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk; and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk.
The processor 1201 is further configured to: control the robot to automatically inspect the orchard; acquire a current image frame and locate the robot coordinate of each fruit stalk in the current image frame; control the walking mechanism to move according to the robot coordinate of each fruit stalk so that the robot moves to a picking position, and control the electrically controlled scissors to cut the fruit stalks; and control the mechanical arm to place the fruit clusters whose stalks have been cut into the picking box.
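Purely as an illustrative control-flow sketch tying the above steps together — every interface name here (patrol_step, capture, move_to_picking_position, move_to, cut, place_in_box, and the camera attributes K and T_cam_to_base) is hypothetical, and locate_stem stands in for the fruit stem regression network; detect_fruit_clusters, estimate_stem_depth, pixel_to_camera, and camera_to_robot refer to the earlier sketches:

```python
def picking_loop(camera, base, arm, shears):
    """Hedged sketch of the inspect -> locate -> move -> cut -> place workflow."""
    while base.patrol_step():                        # automatic orchard inspection
        frame, depth_map = camera.capture()          # current image frame + depth map
        for box in detect_fruit_clusters(frame):     # target frame per fruit cluster
            uv = locate_stem(frame, box)             # stem pixel coordinate (regression net)
            d = estimate_stem_depth(depth_map, box)  # stem depth information
            if uv is None or d is None:
                continue
            p_cam = pixel_to_camera(uv[0], uv[1], d, camera.K)
            p_robot = camera_to_robot(p_cam, camera.T_cam_to_base)
            base.move_to_picking_position(p_robot)   # walking mechanism to picking position
            arm.move_to(p_robot)                     # end tool to the stem's robot coordinate
            shears.cut()                             # electrically controlled scissors cut the stem
            arm.place_in_box()                       # place the cut cluster into the picking box
```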
In the embodiment of the application, the position of each fruit cluster is input into a fruit stem regression network trained in advance, the pixel coordinate of each fruit stem on the current image frame is determined, the camera coordinate of each fruit stem is determined by combining the depth information of each fruit stem, and then the robot coordinate of each fruit stem is determined.
The memory 1202 is a non-volatile computer-readable storage medium, and can be used for storing non-volatile software programs, non-volatile computer-executable programs, and modules, such as units (e.g., units described in fig. 11) corresponding to a deep learning based fruit stem positioning method in the embodiments of the present application. The processor 1201 executes various functional applications and data processing of the deep learning based fruit stem localization method or the deep learning based fruit picking method by running the non-volatile software programs, instructions and modules stored in the memory 1202, i.e., realizes the functions of the deep learning based fruit stem localization method or the deep learning based fruit picking method in the above method embodiments and the various modules and units of the above device embodiments.
The memory 1202 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 1202 may optionally include memory located remotely from the processor 1201, which may be coupled to the processor 1201 through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The modules are stored in the memory 1202 and, when executed by the one or more processors 1201, perform the steps of the deep learning based stem localization method or the deep learning based fruit picking method of any of the above method embodiments.
Embodiments also provide a non-transitory computer storage medium storing computer-executable instructions, which are executed by one or more processors, such as one of the processors 1201 in fig. 12, and the one or more processors may execute the deep learning based fruit stem positioning method or the deep learning based fruit picking method in any of the above method embodiments. The non-volatile computer storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-described embodiments of the apparatus or device are merely illustrative, wherein the unit modules described as separate parts may or may not be physically separate, and the parts displayed as module units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network module units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the technical solutions mentioned above may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a terminal device, or a network device) to execute the method according to each embodiment or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; within the context of the present application, where technical features in the above embodiments or in different embodiments can also be combined, the steps can be implemented in any order and there are many other variations of the different aspects of the present application as described above, which are not provided in detail for the sake of brevity; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. A fruit stem positioning method based on deep learning is characterized by being applied to a robot and comprising the following steps:
acquiring a current image frame;
detecting a position of each fruit string in the current image frame;
inputting the position of each fruit cluster into a fruit stem regression network trained in advance, and determining the pixel coordinate of each fruit stem on the current image frame;
calculating the depth information of each fruit stalk;
determining a camera coordinate of each fruit stalk according to the depth information and the pixel coordinate of each fruit stalk;
and determining the robot coordinate of each fruit stalk according to the camera coordinate of each fruit stalk.
2. The method of claim 1, wherein the robot includes a robot end tool and a depth camera, and prior to acquiring the current image frame, the method further comprises:
calibrating the robot end tool;
and calibrating the relative positions of the robot end tool and the depth camera, and determining a calibration matrix.
3. The method of claim 2, wherein said detecting a position of each fruit string in the current image frame comprises:
detecting the position of each fruit string in the current image frame based on a pre-trained target detection algorithm and determining a target frame for each fruit string, wherein the target detection algorithm is the YOLOv5 target detection algorithm.
4. The method of claim 3, wherein said inputting the position of each said fruit cluster into a pre-trained fruit stem regression network and determining the pixel coordinates of each said fruit stem on said current image frame comprises:
inputting the target frame of each fruit cluster into a pre-trained fruit stem regression network and determining the pixel coordinates of each fruit stem on the current image frame.
5. The method of claim 3, wherein said calculating depth information of each of said fruit stalks comprises:
acquiring a depth information estimation area from a target frame of each fruit string;
calculating an average value of depth values of the depth information estimation area;
and determining the depth information of each of the fruit stalks by combining the average value of the depth values of the depth information estimation area and a preset offset value.
6. The method of any one of claims 1-5, wherein the fruit stalks comprise grape stalks, the fruit clusters comprise grape clusters, and the fruit stalk regression network comprises a grape stalk regression network.
7. A fruit picking method based on deep learning, characterized by being applied to a robot, wherein the robot comprises a walking mechanism, a mechanical arm and a picking box, the mechanical arm is provided with an end tool, and the end tool comprises electrically controlled scissors, the method comprising the following steps:
controlling the robot to automatically inspect the orchard;
acquiring a current image frame, and positioning the robot coordinates of each fruit stalk in the current image frame;
controlling the walking mechanism to move according to the robot coordinate of each fruit stalk, so that the robot moves to a picking position, and controlling the electrically controlled scissors to cut the fruit stalks;
and controlling the mechanical arm to place the fruit clusters whose stalks have been cut into the picking box.
8. The method of claim 7, wherein said controlling the electrically controlled scissors to cut the fruit stalks comprises:
after the robot moves to the picking position, controlling the mechanical arm to move to the position of the robot coordinate of the fruit stem;
and controlling the electrically controlled scissors to cut the fruit stalks.
9. The method according to claim 7 or 8, characterized in that the method further comprises:
determining the maturity of the fruit string in the current image frame according to the current image frame;
and determining whether to cut the fruit stalks or not according to the maturity of the fruit clusters.
10. The method according to claim 7 or 8, wherein the walking mechanism comprises a mobile base, the robot further comprising a platform mounted to the mobile base, the robot arm being mounted to the platform, wherein the robot arm is provided with a depth camera for acquiring image frames.
11. A fruit stem positioning device based on deep learning, characterized by being applied to a robot, the device comprising:
the image frame acquisition unit is used for acquiring a current image frame;
a fruit string position detection unit for detecting the position of each fruit string in the current image frame;
a fruit stem pixel coordinate unit, configured to input the position of each fruit cluster to a pre-trained fruit stem regression network, and determine a pixel coordinate of each fruit stem on the current image frame;
the fruit stem depth information unit is used for calculating the depth information of each fruit stem;
the fruit stem camera coordinate unit is used for determining the camera coordinate of each fruit stem according to the depth information and the pixel coordinate of each fruit stem;
and the fruit stem robot coordinate unit is used for determining the robot coordinate of each fruit stem according to the camera coordinate of each fruit stem.
12. A robot controller, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the deep learning based fruit stem localization method of any one of claims 1-6 or the deep learning based fruit picking method of any one of claims 7-10.
13. A robot, characterized in that the robot comprises:
the robot controller of claim 12;
the robot further includes: the robot comprises a robot body and a picking box;
wherein, the robot body includes:
a mobile base for moving the robot;
the platform is arranged on the mobile base and is used for bearing the picking box;
the mechanical arm comprises an end tool, and the end tool comprises electrically controlled scissors for cutting the fruit stalks;
the picking box is arranged on the platform and used for placing the fruit clusters after the stalks are cut.
14. The robot of claim 13, wherein the robotic arm is provided with a depth camera for acquiring image frames, wherein the depth camera comprises a binocular camera.
15. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements a deep learning based fruit stem localization method according to any one of claims 1 to 6 or a deep learning based fruit picking method according to any one of claims 7 to 10.
CN202111179412.2A 2021-10-11 2021-10-11 Fruit stem positioning and fruit picking method, device, robot and medium Active CN113615398B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111179412.2A CN113615398B (en) 2021-10-11 2021-10-11 Fruit stem positioning and fruit picking method, device, robot and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111179412.2A CN113615398B (en) 2021-10-11 2021-10-11 Fruit stem positioning and fruit picking method, device, robot and medium

Publications (2)

Publication Number Publication Date
CN113615398A true CN113615398A (en) 2021-11-09
CN113615398B CN113615398B (en) 2022-03-11

Family

ID=78390823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111179412.2A Active CN113615398B (en) 2021-10-11 2021-10-11 Fruit stem positioning and fruit picking method, device, robot and medium

Country Status (1)

Country Link
CN (1) CN113615398B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108271531A (en) * 2017-12-29 2018-07-13 湖南科技大学 The fruit automation picking method and device of view-based access control model identification positioning
CN108575321A (en) * 2018-07-06 2018-09-28 广州大学 A kind of control method and device of fruit picking robot
CN212211985U (en) * 2020-04-16 2020-12-25 深圳市飞研智能科技有限公司 Automatic device of picking of harmless fruit

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115250744A (en) * 2022-07-29 2022-11-01 四川启睿克科技有限公司 Multi-angle strawberry picking system and method
CN115250744B (en) * 2022-07-29 2023-09-15 四川启睿克科技有限公司 Multi-angle strawberry picking system and method
CN116021526A (en) * 2023-02-07 2023-04-28 台州勃美科技有限公司 Agricultural robot control method and device and agricultural robot

Also Published As

Publication number Publication date
CN113615398B (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN109863874B (en) Fruit and vegetable picking method, picking device and storage medium based on machine vision
Ling et al. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision
CN113615398B (en) Fruit stem positioning and fruit picking method, device, robot and medium
Li et al. Detection of fruit-bearing branches and localization of litchi clusters for vision-based harvesting robots
CN103947380B (en) A kind of both arms fruit picking robot and fruit picking method
Zahid et al. Technological advancements towards developing a robotic pruner for apple trees: A review
CN110969660B (en) Robot feeding system based on three-dimensional vision and point cloud deep learning
CN109380146B (en) Automatic measurement device and method for live pigs
Miao et al. Efficient tomato harvesting robot based on image processing and deep learning
CN116439018B (en) Seven-degree-of-freedom fruit picking robot and picking method thereof
Silwal et al. Effort towards robotic apple harvesting in Washington State
Zhao et al. An end-to-end lightweight model for grape and picking point simultaneous detection
Jin et al. Detection method for table grape ears and stems based on a far-close-range combined vision system and hand-eye-coordinated picking test
CN115316129A (en) Self-adaptive bionic picking device based on binocular vision recognition and cluster fruit picking method
Li et al. Identification of the operating position and orientation of a robotic kiwifruit pollinator
Jin et al. Far-near combined positioning of picking-point based on depth data features for horizontal-trellis cultivated grape
Jayasekara et al. Automated crop harvesting, growth monitoring and disease detection system for vertical farming greenhouse
Hu et al. Design and experiment of a new citrus harvesting robot
CN115139315A (en) Grabbing motion planning method for picking mechanical arm
CN114612549A (en) Method and device for predicting optimal fruiting picking time
CN113947715A (en) Bagging method and device, storage medium and electronic equipment
JP2023012812A (en) Harvesting robot, control method and control program for harvesting robot, and harvesting system
Uramoto et al. Tomato recognition algorithm and grasping mechanism for automation of tomato harvesting in facility cultivation
Yu et al. Design and experiment of distance measuring system with single camera for picking robot
Manikanta et al. The design and simulation of rose harvesting robot

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Cai Tongbiao

Inventor after: Lang Xulin

Inventor after: Liu Zhufu

Inventor after: Liu Peichao

Inventor before: Cai Tongbiao

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant