CN114700941B - Strawberry picking method based on binocular vision and robot system - Google Patents

Strawberry picking method based on binocular vision and robot system

Info

Publication number
CN114700941B
Authority
CN
China
Prior art keywords
strawberry
image
point
picking
determining
Prior art date
Legal status
Active
Application number
CN202210311178.2A
Other languages
Chinese (zh)
Other versions
CN114700941A (en)
Inventor
陈鹏
许浪
章军
夏懿
王儒敬
王刘向
牛子寒
陈建峰
黄琼娇
路宝榕
胡涛
Current Assignee
Hefei Intelligent Agriculture Collaborative Innovation Research Institute Of China Science And Technology
Original Assignee
Hefei Intelligent Agriculture Collaborative Innovation Research Institute Of China Science And Technology
Priority date
Filing date
Publication date
Application filed by Hefei Intelligent Agriculture Collaborative Innovation Research Institute Of China Science And Technology filed Critical Hefei Intelligent Agriculture Collaborative Innovation Research Institute Of China Science And Technology
Priority to CN202210311178.2A priority Critical patent/CN114700941B/en
Publication of CN114700941A publication Critical patent/CN114700941A/en
Application granted granted Critical
Publication of CN114700941B publication Critical patent/CN114700941B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems
    • AHUMAN NECESSITIES
    • A01AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01DHARVESTING; MOWING
    • A01D46/00Picking of fruits, vegetables, hops, or the like; Devices for shaking trees or shrubs
    • A01D46/30Robotic devices for individually picking crops

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Environmental Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a binocular vision-based strawberry picking method and robot system. A strawberry image is collected and preprocessed to obtain a preprocessed image; strawberry fruit recognition is performed on the preprocessed image with a preset recognition model; the lowest point and the centroid of the strawberry are determined from the recognition result; and the picking point position is determined from the lowest point and the centroid, the picking point being a position on the strawberry stem at a preset distance from the calyx. By using the centroid and the stem of the strawberry to determine the picking point, the invention selects the most likely stem from the many stems in the image, so that strawberries can be picked efficiently and with little damage.

Description

Strawberry picking method based on binocular vision and robot system
Technical Field
The invention relates to the technical field of machine vision, in particular to a binocular vision-based strawberry picking method and a robot system.
Background
At present, mature greenhouse fruits are generally picked by hand. With labor costs rising, there is an urgent demand for automated picking technology that reduces labor intensity, improves the efficiency of picking fruits and vegetables in greenhouses, and ensures the quality of the picked produce.
Unlike crops such as apples and oranges, strawberries are soft fruits with irregular shapes, which makes fruit recognition difficult. Strawberries must be harvested when ripe and soft, so they require careful handling: crushing damages the fruit, and the resulting rot can spread, harming not just one fruit but a large part of the crop. The challenges posed by soft fruits such as strawberries therefore mean that existing harvesting methods are unsuitable. The aim of this technology is to design a strawberry picking machine that can replace manual picking while protecting the fruit during the picking process. With the rapid development of deep learning in recent years, object detection models represented by the YOLO series have advanced quickly, and YOLOv5 in particular is widely used in practical engineering applications.
Prior-art methods struggle to pick at a point on the strawberry stem, so the mechanical end effector easily comes into contact with the fruit and damages it.
Disclosure of Invention
In view of the above, the invention provides a binocular vision-based strawberry picking method and robot system that determine the picking point from the centroid and the stem of the strawberry, so that the most likely stem can be selected from the many stems in the image and strawberries can be picked efficiently and with little damage.
The technical scheme of the invention is as follows:
a binocular vision-based strawberry picking method, comprising:
acquiring a strawberry image, and preprocessing the strawberry image to obtain a preprocessed image;
carrying out strawberry fruit identification on the preprocessed image according to a preset identification model; determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result;
determining the picking point position according to the lowest point and the mass center; the picking point is a position on the strawberry stem at a preset distance from the calyx.
Preferably, the lowest point of the strawberry and the centroid of the strawberry are determined according to the strawberry fruit recognition result, and the centroid of the strawberry, denoted (x̄, ȳ), is obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the centroid of the whole strawberry is thus calculated by the weighted summation above, x̄ being the weighted average of the x coordinates and ȳ the weighted average of the y coordinates.
Further, the gradient of the line from the lowest point to the centroid is determined, a picking point is determined on this line, and the best matching point is then determined as the final picking point position according to a gray-scale based template matching method.
Preferably, determining the best matching point as the final picking point position according to the gray-scale based template matching method includes:
first, selecting a region of interest as the template and generating a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, its neighbourhood is searched with a least squares method for the best matching point, which serves as the final picking point position.
Preferably, the strawberry fruit recognition is performed on the preprocessed image according to a preset recognition model, including: the preset recognition model is based on a YOLOv5 network structure;
in the recognition model based on the YOLOv5 network structure, strawberry images are firstly aggregated on different image fine granularity through a backbone network to form image features, then the image features are mixed and combined through a Neck network, the image features are transferred to a Prediction layer, and finally the image features are predicted to generate a boundary frame and a Prediction result.
In addition, a strawberry picking robot system based on binocular vision is also provided, wherein the robot comprises a binocular vision module and an end execution module; the binocular vision module is used for collecting strawberry images; the end execution module is used for picking strawberries;
the system further comprises:
the preprocessing module is used for preprocessing the strawberry image to obtain a preprocessed image;
the identification module is used for identifying the strawberry fruits of the preprocessed image according to a preset identification model; determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result;
the picking position determining module is used for determining the picking point position according to the lowest point and the mass center; the picking point is a position on the strawberry stem at a preset distance from the calyx.
Preferably, the lowest point of the strawberry and the centroid of the strawberry are determined according to the strawberry fruit recognition result, and the centroid of the strawberry, denoted (x̄, ȳ), is obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the centroid of the whole strawberry is thus calculated by the weighted summation above, x̄ being the weighted average of the x coordinates and ȳ the weighted average of the y coordinates;
the gradient of the line from the lowest point to the centroid is determined, a picking point is determined on this line, and the best matching point is then determined as the final picking point position according to a gray-scale based template matching method.
Preferably, determining the best matching point as the final picking point position according to the gray-scale based template matching method includes:
first, selecting a region of interest as the template and generating a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, its neighbourhood is searched with a least squares method for the best matching point, which serves as the final picking point position.
Preferably, the strawberry fruit recognition is performed on the preprocessed image according to a preset recognition model, including: the preset recognition model is based on a YOLOv5 network structure;
in the recognition model based on the YOLOv5 network structure, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the image features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results.
In the scheme of the invention, the binocular vision-based strawberry picking method and robot system acquire a strawberry image and preprocess it to obtain a preprocessed image; perform strawberry fruit recognition on the preprocessed image with a preset recognition model; determine the lowest point and the centroid of the strawberry from the recognition result; and determine the picking point position from the lowest point and the centroid, the picking point being a position on the strawberry stem at a preset distance from the calyx. This improves the recognition accuracy as well as the localization and ranging of the strawberry, effectively prevents the mechanical end from touching the strawberry fruit, and thereby avoids damage to the fruit.
Drawings
Fig. 1 is a flowchart of a binocular vision-based strawberry picking method in an embodiment of the present invention;
FIG. 2 is a practical illustration of picking points found by using a template matching method in the embodiment of the invention;
FIG. 3 is a schematic diagram of a robotic end effector in accordance with an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the invention implements a binocular vision-based strawberry picking method, which comprises the following steps:
s1, acquiring a strawberry image, and preprocessing the strawberry image to obtain a preprocessed image.
Specifically, in this embodiment, the strawberry image captured by the binocular camera may contain noise. Filtering this noise sharpens the image, makes the features of strawberry branches, leaves and fruits more distinct, improves recognizability, and reduces the difficulty of strawberry recognition. In the vision system of a fruit picking robot, the collected fruit images are therefore preprocessed first. Because the quality of the fruit images is uncertain, feeding them directly into the constructed convolutional neural network would strongly affect the training result and the performance of localization, detection and classification, so the fruit images must be preprocessed. Different image preprocessing methods are typically used for different vision systems and different data; within one system, only one or a few preprocessing methods are generally applied. The vision system of this embodiment uses the following preprocessing methods: gray-scale conversion and image normalization.
To improve the performance and robustness of the vision system, the collected fruit images need to be preprocessed. Gray-scale transformation is an important preprocessing step: it speeds up subsequent image processing while ensuring that the overall feature information of the image is not lost.
A color image is usually stored in the RGB color mode. The RGB color space is composed of the three primary colors R (red), G (green) and B (blue), and the R, G and B components each take values in the range 0-255. Any color perceivable by the human eye in nature can be formed by a weighted mixture of these three primaries and expressed by a color equation. Let C be a color in nature; then C can be written as C = αR + βG + γB, where α, β, γ ∈ [0, 1] are the tristimulus coefficients.
A pixel is the smallest unit of an image: each pixel is a small square, and many pixels together form an image. For example, a pure red pixel on a computer screen has the RGB components (255, 0, 0). When the RGB color space is used, the image data share the same color system as the hardware and can be used directly without color-space conversion. Color images generally use the RGB mode, but the three components R, G and B are highly correlated, and simply distributing color according to optical principles neither provides color information effectively nor reflects the morphological characteristics of the image. Therefore, when processing an image, the three RGB components (R: red, G: green, B: blue) are handled separately and the color image is converted to gray scale, i.e., three channels are converted into a single channel. There are three common gray-scale conversion methods.
Method 1: average method. The average method takes the mean of the three RGB components of the color image as the gray value:
Gray(i, j) = (R(i, j) + G(i, j) + B(i, j)) / 3   (2-2)
Method 2: maximum value method. The maximum of the three RGB components is taken as the gray value:
Gray(i, j) = max{R(i, j), G(i, j), B(i, j)}   (2-3)
method 3: weighted average method. The weighted average method is the most commonly used gray level conversion method, the human eye is most sensitive to green in the natural environment, the sensitivity to blue is the lowest, and the three components can be weighted and averaged by different weights to obtain a more reasonable gray level image, and the formula is as follows:
Gray(i,j)=0.299×R(i,j)+0.578×G(i,j)+0.114×B(i,j) (2-4)
In this embodiment, the weighted average method is selected for gray-scale conversion of the fruit images; it yields the best gray-image quality with clear brightness.
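As a minimal illustration of the three gray-scale conversion methods above (a sketch assuming a NumPy/OpenCV environment; the function and file names are chosen here for illustration and are not part of the patent):

```python
import cv2
import numpy as np

def to_gray(image_bgr: np.ndarray, method: str = "weighted") -> np.ndarray:
    """Convert a BGR color image (OpenCV convention) to gray scale.

    method: "average"  -> Gray = (R + G + B) / 3
            "max"      -> Gray = max(R, G, B)
            "weighted" -> Gray = 0.299 R + 0.587 G + 0.114 B
    """
    b, g, r = cv2.split(image_bgr.astype(np.float32))
    if method == "average":
        gray = (r + g + b) / 3.0
    elif method == "max":
        gray = np.maximum(np.maximum(r, g), b)
    else:  # weighted average, the method chosen in this embodiment
        gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(gray, 0, 255).astype(np.uint8)

# Example usage (the file name is only a placeholder):
# img = cv2.imread("strawberry.jpg")
# gray = to_gray(img, method="weighted")
```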
Before the fruit image data are fed into the convolutional neural network for training, an image normalization step is needed in preprocessing to convert the images to a fixed standard form, so that the image data converge faster during training of the convolutional neural network and training efficiency improves. After normalization of the original fruit images, all pixel values are uniformly mapped to the interval [0, 1]. Two methods are common:
method 1: min-max normalization (normalization method)
Min-max normalization applies a linear transformation to the original image data so that the processed pixel values are mapped into [0, 1]; this method is also called dispersion normalization. The transformation is
y_i = (x_i − min(x)) / (max(x) − min(x)),
where x_i is the value of an image pixel, min(x) is the minimum pixel value of the image and max(x) is the maximum pixel value.
Method 2: z-score standardization
z-score standardization is a common data standardization method; the processed data have mean 0 and standard deviation 1 and follow a standard normal distribution. For a sequence x_1, x_2, x_3, ..., x_n the transformation is
y_i = (x_i − x̄) / s,
where x̄ = (1/n) · Σ x_i is the mean of all sample data and s = sqrt( (1/n) · Σ (x_i − x̄)² ) is the standard deviation of all sample data. The resulting sequence y_1, y_2, y_3, ..., y_n has mean 0 and variance 1 and is dimensionless.
The usage scenarios of the two methods are as follows: (a) The z-score method is suitable when the maximum and minimum of the data are unknown, or when there are outliers beyond the normal value range; when using z-score standardization, the raw data should follow an approximately Gaussian distribution, otherwise the result deteriorates. (b) In classification or clustering algorithms, z-score standardization works better when distances are used to measure similarity or when PCA is used for dimensionality reduction. (c) When the data do not follow a normal distribution and no distance measures or covariance computations are involved, min-max normalization can be used; for example, in image processing, after an RGB three-channel image is converted to a gray image the pixel values are limited to [0, 255], and min-max normalization can then be applied. In summary, in this example Method 1, min-max normalization, is chosen for image normalization during fruit-image preprocessing. After normalization the information of the image itself is not destroyed, the pixel values of the result are obtained directly, and their value range is converted from [0, 255] to [0, 1]. This prepares the fruit image data for subsequent training in the convolutional neural network. Normalizing the images brings two benefits: (a) it increases the convergence speed of the convolutional neural network model on the training data; (b) it improves the accuracy of the classifier in the convolutional neural network model.
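The two normalization schemes can be sketched as follows (a minimal illustration assuming NumPy; the names are illustrative only):

```python
import numpy as np

def min_max_normalize(img: np.ndarray) -> np.ndarray:
    """Dispersion normalization: map pixel values linearly into [0, 1]."""
    x = img.astype(np.float32)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:          # constant image, avoid division by zero
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

def z_score_standardize(img: np.ndarray) -> np.ndarray:
    """z-score: result has mean 0 and (approximately) unit standard deviation."""
    x = img.astype(np.float32)
    mean, std = x.mean(), x.std()
    return (x - mean) / (std + 1e-8)   # small epsilon guards against std == 0

# In this embodiment min-max normalization is applied to the gray image:
# gray01 = min_max_normalize(gray)
```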
S2, carrying out strawberry fruit recognition on the preprocessed image according to a preset recognition model; and determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result.
Specifically, in this embodiment, the recognition model must be trained in advance. Traditional object detection networks are mostly two-stage networks such as Faster R-CNN. In a two-stage detection network, object detection is carried out by a convolutional neural network that extracts CNN convolutional features, and training proceeds in two parts: first the RPN (Region Proposal Network) is trained, and then the network for object region detection is trained. Although the training accuracy can be high, the detection speed is usually slow. A one-stage network instead regresses the class and position information directly through the backbone network without an RPN, which makes the algorithm faster. In the YOLOv5 network, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results.
Specifically, in this embodiment, the preset recognition model is a recognition model based on a YOLOv5 network structure; the recognition model is obtained by:
and (3) collecting a manual data set: the manual collection data set is a training basis, and the strawberry fruit pictures of the actual working scene of the robot can be well collected through pictures collected by the binocular camera. In the process of collecting the data set, the problems of light, angle, definition, shielding of blades and the like of the actual working environment of the picking robot are simulated by fully considering the actual situation. The data set is 1046 pictures in total.
Data-set labeling: the labeling tool used in this embodiment for the strawberry data set is labelimg, installed as follows: (1) create a virtual environment named labelimg in anaconda; (2) activate the environment, run pip install labelimg inside it, then run labelimg in the activated environment to open the annotation interface; (3) after labeling is finished, an xml file containing the annotation information is obtained for each image.
Anchor setting: the anchor values given in the official YOLOv5 code are based on the VOC data set. To match the characteristics of the strawberry data set, the anchors need to be reset: the obtained annotation information is processed with the k-means algorithm to obtain anchors suited to the strawberry data set.
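A minimal sketch of how such anchors could be derived from the labeled boxes with k-means (assuming the box widths and heights have already been parsed from the xml annotations; the use of scikit-learn here is an assumption for brevity, not part of the patent):

```python
import numpy as np
from sklearn.cluster import KMeans

def compute_anchors(wh: np.ndarray, n_anchors: int = 9) -> np.ndarray:
    """Cluster (width, height) pairs of labeled boxes into anchor sizes.

    wh: array of shape (N, 2) with box widths and heights in pixels.
    Returns the n_anchors cluster centers sorted by area (small to large).
    """
    km = KMeans(n_clusters=n_anchors, n_init=10, random_state=0).fit(wh)
    anchors = km.cluster_centers_
    return anchors[np.argsort(anchors.prod(axis=1))]

# Example with made-up box sizes:
# wh = np.array([[35, 42], [60, 75], [28, 30], [50, 55], [33, 40],
#                [70, 80], [25, 28], [45, 50], [64, 70], [38, 44]])
# print(compute_anchors(wh))
```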
Network structure design: this example uses the YOLOv5 network structure. Two parameters, depth_multiple and width_multiple, control the depth and width of the network: depth_multiple controls the depth of the network (the number of BottleneckCSP modules) and width_multiple controls the width of the network (the number of convolution kernels).
Training process: the xml files obtained from labeling are converted to the YOLO txt format with a script, and training is carried out with the official YOLOv5 code. The training process is as follows:
(1) Organize the files: one folder stores the labeled xml files, images stores the original pictures, another folder stores the split of the training, test and validation sets, and labels stores the converted YOLO-format files; division.py is the script written to split the data into training, test and validation sets, and xml->txt.py is the script written to convert the xml files into txt files;
(2) Modification of network parameters and modification of data parameters;
(3) Evaluation of results: the model is tested on the test set. The most direct and intuitive check is whether ripe strawberries can be recognized under the binocular camera; in addition, the specific performance of the model is evaluated with the usual evaluation metrics;
(4) Modifying the data set and the network parameters: if the performance of the model is not satisfactory during training, the data set can be re-collected, or the original data set can be expanded and augmented; the training process can also be adjusted in combination with real-time analysis of the loss curve.
Traditional object detection networks are mainly two-stage networks such as Faster R-CNN; their training accuracy can be high, but the detection speed is usually slow, whereas training the one-stage YOLOv5 network readily achieves fast, real-time detection. To improve the convergence speed of the model and the accuracy of its classifier, the data are normalized in the data processing stage. In addition, the change of the loss curve is monitored in real time during training so that the model best suited for detection is saved at the right moment, yielding an optimal model and thereby better recognition accuracy and effect. Finally, in this example the obtained annotation information is processed with the k-means algorithm, which yields anchors (i.e., the boxes that appear during detection) well suited to this data set and gives good detection results.
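For illustration, a trained YOLOv5 model could be loaded and run on a strawberry image roughly as follows (a sketch based on the public ultralytics/yolov5 PyTorch Hub interface; the weight file name and confidence threshold are assumptions, not values from the patent):

```python
import torch
import cv2

# Load custom weights trained on the strawberry data set (path is illustrative).
model = torch.hub.load("ultralytics/yolov5", "custom", path="strawberry_best.pt")
model.conf = 0.5  # assumed confidence threshold

img = cv2.imread("strawberry_scene.jpg")[:, :, ::-1]  # BGR -> RGB
results = model(img)

# Each detection row: x1, y1, x2, y2, confidence, class index
for *box, conf, cls in results.xyxy[0].tolist():
    print(f"strawberry box {box}, confidence {conf:.2f}")
```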
S3, determining the picking point position according to the lowest point and the centroid; the picking point is a position on the strawberry stem at a preset distance from the calyx.
Preferably, the lowest point of the strawberry and the centroid of the strawberry are determined according to the strawberry fruit recognition result, and the centroid of the strawberry, denoted (x̄, ȳ), is obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the centroid of the whole strawberry is thus calculated by the weighted summation above, x̄ being the weighted average of the x coordinates and ȳ the weighted average of the y coordinates;
the gradient of the line from the lowest point to the centroid is determined, a picking point is determined on this line, and the best matching point is then determined as the final picking point position according to a gray-scale based template matching method.
Specifically, in this example, since strawberry stems are herbaceous, the fruit is held up only by turgor pressure, which is not strong enough to support the strawberry fruit from below. As a result, strawberries usually hang down, with the stem bending downward. The picking point of the strawberry is therefore set at about one centimeter from the calyx on the stem. A natural approach is to model the position of the stem as a slanted line: if this line can be identified, it is easy to locate the picking point on it. To determine the line, two fixed points on it are needed. The lowest point of the fruit and the centroid of the fruit are used as these two fixed points, and together they define the gradient of the line. Both points are easily found from the segmented image, the centroid being obtained from the weighted-summation formula above, where f(x, y) returns 1 if the pixel at (x, y) belongs to the red segment and 0 otherwise. Once the gradient is known, template matching is used to find the picking point, searching the green region just above the highest point of the strawberry. The image elements matched against the template are small segments with the same line width as a typical strawberry stem and the same gradient as the line from the lowest point to the centroid. The template matching part of the method is shown in fig. 2. The initial image is shown in fig. 2(d). After segmenting the image and identifying the strawberry, the region searched for the picking point is shown in fig. 2(a); this region is chosen so as to limit the picking point to a suitable distance from the calyx. The stem template of this image is shown in fig. 2(b); it has been rotated to match the gradient of the line between the centroid and the lowest point of the fruit. The template matching result is shown in fig. 2(c); it takes into account both the degree of matching with the template and the distance from a given point to the calyx, so brighter areas, combining a good match with closeness to the calyx, are more likely to be good picking points. Fig. 2(c) marks the most likely point with a green dot, and the red dot in the original image of fig. 2(d) shows the same point lying on the stem of the ripe strawberry.
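The geometric part of this step, computing the centroid and lowest point of the red fruit mask and the gradient of the line between them, can be sketched as follows (assuming a binary mask from the segmentation step; the NumPy names are illustrative):

```python
import numpy as np

def fruit_line(mask: np.ndarray):
    """mask: binary image, 1 where the pixel is red strawberry fruit, else 0.

    Returns the centroid (cx, cy), the lowest point (lx, ly) and the
    gradient of the line joining them (image y axis points downwards).
    """
    ys, xs = np.nonzero(mask)            # coordinates of fruit pixels
    if xs.size == 0:
        raise ValueError("no fruit pixels in mask")

    # Weighted averages of the coordinates, with f(x, y) in {0, 1} as weight.
    cx, cy = xs.mean(), ys.mean()

    # Lowest point of the fruit = pixel with the largest image y coordinate.
    i = np.argmax(ys)
    lx, ly = xs[i], ys[i]

    dx, dy = cx - lx, cy - ly
    gradient = dy / dx if dx != 0 else float("inf")  # vertical stem
    return (cx, cy), (lx, ly), gradient
```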
Since a conventional robot end effector easily damages the strawberry fruit, the end effector is designed, as shown in fig. 3, as a rectangular box fitted with a blade 31. When the background processing obtains the coordinates of the corresponding fruit, the robot arm moves the end effector to that point and places the strawberry 30 inside the rectangular box; the corresponding grasping action is that the blade 31 shears the strawberry stem 32, so that the strawberry 30 is removed intact.
Preferably, determining the best matching point as the final picking point position according to the gray-scale based template matching method includes:
first, selecting a region of interest as the template and generating a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, its neighbourhood is searched with a least squares method for the best matching point, which serves as the final picking point position.
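A minimal sketch of such a coarse-to-fine gray-scale template matching search (a simplified illustration that uses a sum-of-squared-differences criterion as the least-squares measure; the stride and neighbourhood sizes are assumptions):

```python
import numpy as np

def ssd(patch: np.ndarray, template: np.ndarray) -> float:
    """Sum of squared gray-level differences (least-squares criterion)."""
    d = patch.astype(np.float32) - template.astype(np.float32)
    return float((d * d).sum())

def match_template_coarse_fine(image: np.ndarray, template: np.ndarray,
                               stride: int = 4, refine: int = 4):
    th, tw = template.shape
    H, W = image.shape

    # Coarse pass: evaluate the similarity only every `stride` pixels.
    best, coarse = np.inf, (0, 0)
    for y in range(0, H - th, stride):
        for x in range(0, W - tw, stride):
            s = ssd(image[y:y + th, x:x + tw], template)
            if s < best:
                best, coarse = s, (y, x)

    # Fine pass: exhaustive search in a small neighbourhood of the coarse point.
    cy, cx = coarse
    best, fine = np.inf, coarse
    for y in range(max(0, cy - refine), min(H - th, cy + refine) + 1):
        for x in range(max(0, cx - refine), min(W - tw, cx + refine) + 1):
            s = ssd(image[y:y + th, x:x + tw], template)
            if s < best:
                best, fine = s, (y, x)
    return fine  # top-left corner of the best match, taken as the picking point region
```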
In addition, a strawberry picking robot system based on binocular vision is also provided, wherein the robot comprises a binocular vision module and an end execution module; the binocular vision module is used for collecting strawberry images; the end execution module is used for picking strawberries;
the system further comprises:
the preprocessing module is used for preprocessing the strawberry image to obtain a preprocessed image;
the identification module is used for identifying the strawberry fruits of the preprocessed image according to a preset identification model; determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result;
the picking position determining module is used for determining the picking point position according to the lowest point and the mass center; the picking point is a position on the strawberry stem at a preset distance from the calyx.
Preferably, the lowest point of the strawberry and the centroid of the strawberry are determined according to the strawberry fruit recognition result, and the centroid of the strawberry, denoted (x̄, ȳ), is obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the centroid of the whole strawberry is thus calculated by the weighted summation above, x̄ being the weighted average of the x coordinates and ȳ the weighted average of the y coordinates;
the gradient of the line from the lowest point to the centroid is determined, a picking point is determined on this line, and the best matching point is then determined as the final picking point position according to a gray-scale based template matching method.
Preferably, determining the best matching point as the final picking point position according to the gray-scale based template matching method includes:
first, selecting a region of interest as the template and generating a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, its neighbourhood is searched with a least squares method for the best matching point, which serves as the final picking point position.
Preferably, the strawberry fruit recognition is performed on the preprocessed image according to a preset recognition model, including: the preset recognition model is based on a YOLOv5 network structure;
in the recognition model based on the YOLOv5 network structure, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the image features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results.
In the scheme of the embodiment of the invention, the provided binocular vision-based strawberry picking method and robot system acquire a strawberry image and preprocess it to obtain a preprocessed image; perform strawberry fruit recognition on the preprocessed image with a preset recognition model; determine the lowest point and the centroid of the strawberry from the recognition result; and determine the picking point position from the lowest point and the centroid, the picking point being a position on the strawberry stem at a preset distance from the calyx. By determining the picking point from the centroid and the stem of the strawberry, the most likely stem can be selected from the many stems in the image and strawberries can be picked efficiently and with little damage. In addition, training the one-stage YOLOv5 network readily achieves fast, real-time detection: the one-stage network regresses the class and position information directly through the backbone network without an RPN (Region Proposal Network), which increases the speed of the algorithm. In the YOLOv5 network, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results. The recognition accuracy of the strawberry and its localization and ranging are thus improved, the mechanical end is effectively prevented from touching the strawberry fruit, and damage to the fruit is avoided.
In addition, an embodiment of the invention provides a readable storage medium storing computer-executable instructions which, when executed by a processor, implement the binocular vision-based strawberry picking method described above.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations may occur to those skilled in the art to which this disclosure pertains. Such modifications, improvements, and adaptations are suggested in this specification and therefore remain within the spirit and scope of the exemplary embodiments of this specification.
Furthermore, those skilled in the art will appreciate that the various aspects of the specification can be illustrated and described in terms of several patentable categories or circumstances, including any novel and useful procedures, machines, products, or combinations of materials, or any novel and useful modifications thereof. Accordingly, aspects of the present description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the specification may take the form of a computer product, comprising computer-readable program code, embodied in one or more computer-readable media.
It is noted that, if the description, definition, and/or use of a term in an attached material in this specification does not conform to or conflict with what is described in this specification, the description, definition, and/or use of the term in this specification controls.
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (4)

1. A binocular vision-based strawberry picking method, comprising the steps of:
acquiring a strawberry image, and preprocessing the strawberry image to obtain a preprocessed image;
carrying out strawberry fruit identification on the preprocessed image according to a preset identification model; determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result;
determining the picking point position according to the lowest point and the mass center; the picking point position is a position on the strawberry stem at a preset distance from the calyx;
determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit recognition result, the mass center (centroid) of the strawberry, denoted (x̄, ȳ), being obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the mass center of the whole strawberry is calculated by the weighted summation of the formulas above; x̄ represents the weighted average of the x coordinates and ȳ represents the weighted average of the y coordinates;
further, determining the gradient of the line from the lowest point to the mass center, determining a picking point on this line, and then determining the best matching point as the final picking point position according to a gray-scale based template matching method;
determining the best matching point as the final picking point position according to the gray-scale based template matching method comprises the following steps:
first, selecting a region of interest as the template to generate a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, searching its neighbourhood with a least squares method for the best matching point, which serves as the final picking point position.
2. The binocular vision-based strawberry picking method of claim 1, wherein the strawberry fruit recognition of the pre-processed image according to a preset recognition model comprises: the preset recognition model is a recognition model based on a YOLOv5 network structure;
in the recognition model based on the YOLOv5 network structure, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the image features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results.
3. The strawberry picking robot system based on binocular vision is characterized in that the robot comprises a binocular vision module and an end execution module; the binocular vision module is used for collecting strawberry images; the end execution module is used for picking strawberries;
the system further comprises:
the preprocessing module is used for preprocessing the strawberry image to obtain a preprocessed image;
the identification module is used for identifying the strawberry fruits of the preprocessed image according to a preset identification model; determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit identification result;
the picking position determining module is used for determining the picking point position according to the lowest point and the mass center; the picking point position is a position on the strawberry stem at a preset distance from the calyx;
determining the lowest point of the strawberry and the mass center of the strawberry according to the strawberry fruit recognition result, the mass center (centroid) of the strawberry, denoted (x̄, ȳ), being obtained by the following formulas:
x̄ = ( Σ(x,y) x·f(x, y) ) / ( Σ(x,y) f(x, y) ),  ȳ = ( Σ(x,y) y·f(x, y) ) / ( Σ(x,y) f(x, y) ),
where the sums run over all pixel coordinates (x, y) of the strawberry fruit recognition result image and f(x, y) represents the mass assigned to each pixel treated as a particle; f(x, y) is either 0 or 1;
when the pixel at (x, y) lies in the red region, it belongs to the strawberry fruit and f(x, y) returns 1;
when the pixel at (x, y) does not lie in the red region, it does not belong to the strawberry fruit and f(x, y) returns 0;
the mass center of the whole strawberry is calculated by the weighted summation of the formulas above; x̄ represents the weighted average of the x coordinates and ȳ represents the weighted average of the y coordinates;
further, determining the gradient of the line from the lowest point to the mass center, determining a picking point on this line, and then determining the best matching point as the final picking point position according to a gray-scale based template matching method;
determining the best matching point as the final picking point position according to the gray-scale based template matching method comprises the following steps:
first, selecting a region of interest as the template to generate a gray-value template;
then performing coarse matching between the detection image and the template image: points are compared at spaced intervals and the gray-level similarity between the detection image and the template image is calculated, so that one pass of coarse matching yields a coarse correlation point;
then performing fine matching: taking the obtained coarse correlation point as the center, searching its neighbourhood with a least squares method for the best matching point, which serves as the final picking point position.
4. A binocular vision based strawberry picking robot system according to claim 3, wherein the strawberry fruit recognition of the pre-processed image according to a preset recognition model comprises: the preset recognition model is a recognition model based on a YOLOv5 network structure;
in the recognition model based on the YOLOv5 network structure, the strawberry image is first aggregated at different image granularities by the Backbone network to form image features; the image features are then mixed and combined by the Neck network and passed to the Prediction layer, which finally predicts the bounding boxes and the prediction results.
CN202210311178.2A 2022-03-28 2022-03-28 Strawberry picking method based on binocular vision and robot system Active CN114700941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210311178.2A CN114700941B (en) 2022-03-28 2022-03-28 Strawberry picking method based on binocular vision and robot system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210311178.2A CN114700941B (en) 2022-03-28 2022-03-28 Strawberry picking method based on binocular vision and robot system

Publications (2)

Publication Number Publication Date
CN114700941A CN114700941A (en) 2022-07-05
CN114700941B true CN114700941B (en) 2024-02-27

Family

ID=82170499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210311178.2A Active CN114700941B (en) 2022-03-28 2022-03-28 Strawberry picking method based on binocular vision and robot system

Country Status (1)

Country Link
CN (1) CN114700941B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116058176A (en) * 2022-11-29 2023-05-05 西北农林科技大学 Fruit and vegetable picking mechanical arm control system based on double-phase combined positioning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968785A (en) * 2012-10-18 2013-03-13 华中科技大学 High-speed parallel image matching method based on multi-core digital signal processor (DSP)
CN106600600A (en) * 2016-12-26 2017-04-26 华南理工大学 Wafer defect detection method based on characteristic matching
CN108805924A (en) * 2018-05-22 2018-11-13 湘潭大学 A kind of lily picking independent positioning method and system
CN109220225A (en) * 2018-10-22 2019-01-18 湖北理工学院 A kind of full-automatic fruit picker
CN114120037A (en) * 2021-11-25 2022-03-01 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model
CN114187590A (en) * 2021-10-21 2022-03-15 山东师范大学 Method and system for identifying target fruits under homochromatic system background

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968785A (en) * 2012-10-18 2013-03-13 华中科技大学 High-speed parallel image matching method based on multi-core digital signal processor (DSP)
CN106600600A (en) * 2016-12-26 2017-04-26 华南理工大学 Wafer defect detection method based on characteristic matching
CN108805924A (en) * 2018-05-22 2018-11-13 湘潭大学 A kind of lily picking independent positioning method and system
CN109220225A (en) * 2018-10-22 2019-01-18 湖北理工学院 A kind of full-automatic fruit picker
CN114187590A (en) * 2021-10-21 2022-03-15 山东师范大学 Method and system for identifying target fruits under homochromatic system background
CN114120037A (en) * 2021-11-25 2022-03-01 中国农业科学院农业信息研究所 Germinated potato image recognition method based on improved yolov5 model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Guohua (ed.), Machine Vision Technology, Huazhong University of Science and Technology Press, 2021, pp. 184-185. *
Research on strawberry harvesting robots: II. Image-based determination of strawberry centroid position and picking point; Zhang Tiezhong et al.; Journal of China Agricultural University (2005, No. 1); pp. 48-51 *

Also Published As

Publication number Publication date
CN114700941A (en) 2022-07-05

Similar Documents

Publication Publication Date Title
CN105718945B (en) Apple picking robot night image recognition method based on watershed and neural network
CN108596102B (en) RGB-D-based indoor scene object segmentation classifier construction method
CN108648169A (en) The method and device of high voltage power transmission tower defects of insulator automatic identification
Pereira et al. Recent advances in image processing techniques for automated harvesting purposes: A review
CN113252584B (en) Crop growth detection method and system based on 5G transmission
CN114700941B (en) Strawberry picking method based on binocular vision and robot system
CN115861686A (en) Litchi key growth period identification and detection method and system based on edge deep learning
CN111539293A (en) Fruit tree disease diagnosis method and system
CN117152735A (en) Tomato maturity grading method based on improved yolov5s
Liu et al. Ground-based cloud classification using weighted local binary patterns
CN117152544B (en) Tea-leaf picking method, equipment, storage medium and device
CN113033386A (en) High-resolution remote sensing image-based transmission line channel hidden danger identification method and system
CN107368847A (en) A kind of crop leaf diseases recognition methods and system
CN116524344A (en) Tomato string picking point detection method based on RGB-D information fusion
CN116416523A (en) Machine learning-based rice growth stage identification system and method
CN113723833B (en) Method, system, terminal equipment and storage medium for evaluating quality of forestation actual results
Choudhury Segmentation techniques and challenges in plant phenotyping
CN111738151B (en) Grape fruit stem accurate identification method based on deep learning post-optimization
CN112329697B (en) Improved YOLOv 3-based on-tree fruit identification method
CN114022782A (en) Sea fog detection method based on MODIS satellite data
CN113920469A (en) Wearing detection method for safety helmet
CN112418112A (en) Orchard disease and pest monitoring and early warning method and system
Qiongyan et al. Study on spike detection of cereal plants
CN112115885A (en) Fruit tree bearing branch shearing point positioning method for picking based on deep convolutional neural network
CN113658148B (en) Central flower identification and positioning method and system based on regional attribute extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant