CN115841668A

CN115841668A - Binocular vision apple identification and accurate positioning method

Info

Publication number: CN115841668A
Application number: CN202211219935.XA
Authority: CN
Inventors: 韩雪松; 马士豪
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2022-10-08
Filing date: 2022-10-08
Publication date: 2023-03-24

Abstract

The invention discloses a binocular vision method for apple identification and accurate positioning, comprising the following steps: step one, calibrating a binocular camera consisting of two monocular cameras; step two, establishing a model for apple identification; step three, optimizing the parameters of the SGBM stereo vision matching algorithm; step four, calculating the actual centroid position of the apple. The method accurately obtains the three-dimensional coordinates of an apple in space when the apple is not occluded, and also in the more general case where the apple is occluded by leaves, branches and the like, so that the positioning accuracy of the apple is ensured while the efficiency of the algorithm is improved.

Description

Binocular vision apple identification and accurate positioning method

Technical Field

The invention relates to a method for identifying and positioning ripe red apples using binocular vision, and in particular to a method for accurately positioning ripe red apples that are occluded by leaves and the like.

Background

In apple identification, many methods rely on the color and shape features of the apple. However, identification algorithms based on the texture and color features of the apple are severely limited: no universal identification model can be found, and robustness is low. With the rapid development of deep learning, deep learning methods now offer higher accuracy than these traditional methods. Once an apple has been identified by a deep learning method and the rectangular bounding box of its position has been obtained, obtaining the accurate three-dimensional coordinates of the apple is a prerequisite key technology for driving a mechanical arm to pick it, and is also an important factor affecting the accuracy and stability of picking. Meanwhile, because of occlusion by leaves and the like, if the position of the apple centroid on the detected image plane is simply used as the position of the apple, the three-dimensional position actually collected may be that of a leaf, so that the positioning of the apple fails. Simply using some points of the rectangular bounding box in the image coordinate system as the position of the apple is therefore very unreliable.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and to provide a method for identifying and accurately positioning ripe red apples, so that high positioning accuracy is retained despite leaf occlusion and despite the invalid matching points and random error points produced by stereo vision matching.

The invention discloses a binocular vision apple identification and accurate positioning method, which comprises the following steps:


step one, calibrating a binocular camera consisting of a left camera and a right camera to obtain an internal reference matrix and a distortion vector of the left camera and the right camera and a translation and rotation pose transformation matrix between the left camera and the right camera:

firstly, the binocular camera is connected to a computer, and a checkerboard pattern calibration plate prepared in advance is photographed with the left camera and the right camera. During shooting the calibration plate is rotated so that it appears at different positions in the field of view of the binocular camera with all checkerboard corner points fully visible. This yields a plurality of calibration plate pictures shot by the left camera and a plurality shot by the right camera; the pictures shot by the left camera and the right camera at the same moment form a pair of calibration plate pictures;

secondly, the acquired pairs of calibration plate pictures are imported into Python, and Zhang Zhengyou's calibration method in the OpenCV library of Python is used to perform monocular calibration of the left camera and the right camera and binocular calibration of the binocular camera, obtaining the internal reference matrix and distortion vector of the left camera and the right camera and the rotation matrix and translation vector between the left camera and the right camera;

step two, establishing a model for apple identification:

step three, optimizing the parameters of the SGBM stereo vision matching algorithm: when the calibration plate is clear in the disparity map of every pair of calibration plate pictures from step one, the corresponding SGBM parameters are recorded as the optimized parameters, which are used in the depth measurement of the apple to convert the distortion-corrected two-dimensional left-camera apple picture into a three-dimensional apple picture;

step four, preparing an apple without leaf occlusion as a first apple, and obtaining the actual centroid position of the first apple through calculation, wherein the specific calculation process is as follows:

the first step, a first apple is placed in the visual field of a left camera, so that the z coordinate of the estimated center of mass of the first apple in the left camera coordinate system is close to the z coordinate of the center of the calibration plate in the left camera coordinate system when the calibration plate picture is shot in the first step;

secondly, respectively shooting a first apple picture by adopting a left camera and a right camera, and then carrying out distortion correction on the first apple pictures shot by the left camera and the right camera;

thirdly, the model inference core library IECore of OpenVINO is imported into Python and used to load the apple identification model obtained in step two, yielding the network structure and weight parameters of the deep learning neural network; the distortion-corrected first apple picture shot by the left camera is then passed to the Yolov4 target detection algorithm, which identifies the apple in the picture and generates a rectangular bounding box tangent to the outline of the first apple;

fourthly, using the SGBM stereo vision matching algorithm whose parameters were optimized in step three, stereo vision matching is performed between the distortion-corrected first apple picture shot by the left camera and the distortion-corrected first apple picture shot by the right camera, expanding the two-dimensional picture of the first apple into a three-dimensional depth picture of the first apple;

fifthly, cutting the first apple picture shot by the left camera after distortion correction in the second step along the rectangular boundary frame to obtain a color image in the rectangular boundary frame in the first apple picture, and extracting a red area in the color image to obtain a binary image after the red area is extracted;

sixthly, randomly and coarsely sampling three-dimensional points in a rectangular bounding box in the first apple three-dimensional depth picture by using a randint function in a random library of python to obtain a three-dimensional coarse sample point set, wherein the three-dimensional coarse sample point set is screened and used for fitting an apple sphere;

and seventhly, fine screening is carried out on the coarse sample point set obtained in the sixth step, and the specific steps are as follows:

step 701: calculating the mean and variance of the coordinate values of the z-axis of the coarse sample points in the coarse sample point set:

step 702: for each coarse sample point $P_i(x_i, y_i, z_i)$ in the coarse sample point set $\{P_i\}$ ($i = 1, \dots, n$), a screening interval $[z_i - \sigma_z, z_i + \sigma_z]$ is set, and it is judged whether the z coordinate value of each coarse sample point in the coarse sample point set falls within the screening interval of $P_i(x_i, y_i, z_i)$; the total number of sample points falling within the interval is score[i], the score of the i-th point; where $\sigma_z$ is the variance of the z-axis coordinate values of the coarse sample points in the coarse sample point set, and $z_i$ is the z coordinate value of the i-th coarse sample point in the coarse sample point set;

step 703: the scores of all coarse sample points are compared; the point set selected by the screening interval of the coarse sample point with the highest score is the new sample point set $\{P_j\}$ ($j = 1, \dots, N$) obtained by fine screening, denoted $I_k$, where $N$ is the number of new sample points in $I_k$;

eighth step, the RANSAC algorithm is used to further screen $I_k$ to obtain the optimal inner points for spherical fitting of the apple;

and ninthly, randomly selecting 8 samples from the optimal inner point set, and performing spherical fitting on the first apple by using a matrix form of a least square method to obtain the coordinate of the centroid of the first apple in the left camera coordinate system and the radius of the spherical model of the first apple.

Compared with other prior art, the method has the following benefits:

1. Wide applicability: the method accurately obtains the three-dimensional coordinates of an apple in space when the apple is not occluded, and still does so in the more general case where the apple is occluded by leaves, branches and the like, so it is general and practical in a complex orchard environment. Moreover, the apple can be accurately positioned by this method as long as the position of its rectangular bounding box in the picture can be determined by apple identification; the method is not limited to identifying apples with the yolov4 target detection algorithm;

2. High efficiency and precision: the method does not extract all three-dimensional point cloud points on the apple to fit it. Instead, a small number of three-dimensional points on the apple are sampled, and the screening and optimization of the algorithm eliminate the error points that were not successfully matched in stereo vision matching as well as the error points caused by foreground occlusion and background interference. In addition, for apple segmentation, the rectangular box obtained from apple identification is used as a pseudo-segmentation in place of a complete segmentation of the apple, which improves the efficiency of the algorithm while ensuring the positioning accuracy of the apple.

Drawings

Fig. 1 is an image of a first apple identified in a left camera picture after distortion correction without leaf occlusion.

Fig. 2 is an image of a second apple identified with leaf occlusion in the left camera picture after distortion correction.

Fig. 3 is a binary image obtained after extracting the red region in the cropped rectangular bounding box of the first apple.

Fig. 4 is a binary image obtained after extracting the red region in the cropped rectangular bounding box of the second apple.

Fig. 5 is a flowchart of the procedure for identifying and accurately positioning apples.

Fig. 6 is a flowchart of the seventh step of the above steps.

Fig. 7 is a flowchart of the program of the eighth step of the above-mentioned steps.

Detailed Description

Specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

firstly, the binocular camera is connected to a computer, and a checkerboard pattern calibration plate prepared in advance is photographed with the left camera and the right camera. During shooting the calibration plate is rotated so that it appears at different positions in the field of view of the binocular camera with all checkerboard corner points fully visible. This yields a plurality of calibration plate pictures shot by the left camera and a plurality shot by the right camera; the pictures shot by the left camera and the right camera at the same moment form a pair of calibration plate pictures. For example, 15 pairs of calibration plate pictures may be shot;

hardware equipment used in the specific implementation of the invention comprises an industrial binocular camera, a computer and two apples for identification and positioning;

the method for manufacturing the checkerboard calibration plate prepared in advance may be as follows:

the checkerboard paper is flatly attached to an acrylic substrate to obtain a checkerboard pattern calibration plate for monocular camera calibration and binocular camera calibration:

the size of the checkerboards and the size of each check on the calibration plate can be selected according to an application scene, the selection principle is that the working distance from the optical center of the left camera to a set point on the apple stem under the camera coordinate system of the left camera is determined firstly, then the length of the X axis and the length of the Y axis of the field of view of the left camera at the distance are calculated, and the size of the calibration plate is selected to be 1/4 to 1/3 of the length of the X axis and the length of the Y axis; such as: the checkerboard size selects a checkerboard with 7 rows and 10 columns, the size of each square is 25mmx25mm, and the checkerboard is printed on A4 paper under the condition that the printing precision is 300 DPI;
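The selection principle above can be illustrated with a short calculation. This is only a sketch under a pinhole approximation: the sensor dimensions are hypothetical (the text does not state them), the 6 mm focal length and the 1000 mm working distance are taken from the embodiment, and "7 rows and 10 columns" is assumed to count squares.

```python
def field_of_view_mm(working_distance_mm, focal_length_mm, sensor_w_mm, sensor_h_mm):
    """Pinhole estimate of the field-of-view lengths (X, Y) at a given distance."""
    scale = working_distance_mm / focal_length_mm
    return sensor_w_mm * scale, sensor_h_mm * scale


def board_size_range_mm(fov_x_mm, fov_y_mm):
    """Return ((min_x, max_x), (min_y, max_y)): 1/4 to 1/3 of each field-of-view length."""
    return (fov_x_mm / 4, fov_x_mm / 3), (fov_y_mm / 4, fov_y_mm / 3)


# Hypothetical sensor size (not stated in the text); focal length and working
# distance follow the embodiment.
fov_x, fov_y = field_of_view_mm(1000.0, 6.0, sensor_w_mm=5.6, sensor_h_mm=3.2)
x_range, y_range = board_size_range_mm(fov_x, fov_y)

# Example board from the text: 7 rows x 10 columns of 25 mm squares (assumed to be squares).
board_w_mm, board_h_mm = 10 * 25, 7 * 25
fits_a4 = board_w_mm <= 297 and board_h_mm <= 210  # A4 in landscape is 297 mm x 210 mm
```

With these assumed numbers the 250 mm x 175 mm board falls inside both recommended ranges and fits on an A4 sheet; a different sensor or working distance would shift the ranges accordingly.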

secondly, the acquired pairs of calibration plate pictures are imported into Python, and Zhang Zhengyou's calibration method (see Zhang Z. A Flexible New Technique for Camera Calibration [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(11): 1330-1334) in the OpenCV library of Python is used to perform monocular calibration of the left camera and the right camera and binocular calibration of the binocular camera, obtaining the internal reference matrix and distortion vector of the left camera and the right camera and the rotation matrix and translation vector between the left camera and the right camera. OpenCV is a cross-platform computer vision and machine learning software library released under the Apache 2.0 license; it provides a Python interface and includes many general-purpose algorithms for image processing and computer vision.

The monocular camera calibration and the binocular camera calibration are implemented with the OpenCV library of Python (see pages 291-309 of "OpenCV 4 detailed description based on Python", published by Posts & Telecom Press in September 2021). OpenCV provides rich APIs for checkerboard corner detection and for monocular and binocular camera calibration, so the calibration can be implemented conveniently. For example, the following equipment may be selected: the industrial binocular camera is a camera of blue sky science and technology, model LT-USB1080P, with lens model LT-C146MM-3MP and a focal length of 6 mm; the computer has an AMD Ryzen 7 5800H CPU and an NVIDIA GeForce GTX 1650 graphics card; the apples used have a radius of about 40 mm. The software part mainly comprises an apple identification program, an image processing program and an SGBM stereo vision matching program.

Then the respective internal reference matrix and distortion vector of the two cameras and the rotation matrix and translation vector between the two cameras obtained after monocular calibration and binocular calibration are as follows:

internal reference matrix of left camera:

internal reference matrix of right camera:

distortion vector of left camera: dist1= [ -0.52 0.5974 0]

Distortion vector of right camera: dist2= [ -0.5178 0.4581 0]

Rotation matrix between two cameras:

translation vector between two cameras: t = [ -129.2953-0.6675.0062 ]

Step two, establishing a model for apple identification:

the apple identification model is an existing open model whose author is AlexeyAB; the model is published in a GitHub repository at https://github.com/AlexeyAB/darknet/releases.

Step three, optimizing the parameters of the SGBM stereo vision matching algorithm (for the SGBM algorithm see Hirschmuller H. Stereo processing by semiglobal matching and mutual information [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 30(2): 328-341): when the calibration plate is clear in the disparity map of every pair of calibration plate pictures from step one, the corresponding SGBM parameters are recorded as the optimized parameters, which are used in the depth measurement of the apple to convert the distortion-corrected two-dimensional left-camera apple picture into a three-dimensional apple picture;

as an embodiment of the present invention, the specific process of optimizing parameters in the SGBM stereo vision matching algorithm may include:

step 301: reading each pair of calibration board pictures shot in the step one;

step 302: distortion correction is performed on each pair of calibration plate pictures to obtain corrected, undistorted calibration plate pictures (for the specific method see page 298 of "OpenCV 4 detailed description based on Python", published by Posts & Telecom Press in September 2021). The correction is implemented with OpenCV: using the internal reference matrices and distortion vectors of the left camera and the right camera obtained in step one, the mapping matrices required for image correction are computed through the API provided by OpenCV, and the distortion in the original images is then removed with these mapping matrices;

step 303: initializing an SGBM algorithm, and performing parallax calculation and depth calculation on a calibration board picture shot by a left camera and a calibration board picture shot by a right camera in each pair of calibration board pictures subjected to distortion correction in the step 302 to obtain a plurality of parallax images;

step 304: and continuously changing parameters of the SGBM algorithm, observing the effect of each disparity map, and recording the parameters of the SGBM algorithm at the moment when all the disparity maps are clear for subsequent depth measurement.

Step four, the specific flow is as shown in fig. 5, an apple without leaf occlusion is prepared as the first apple, and the actual centroid position of the first apple is obtained through calculation, wherein the specific calculation process is as follows:

first, the first apple is placed in the field of view of the left camera so that the z coordinate of its estimated centroid in the left camera coordinate system is close to the z coordinate of the center of the calibration plate in the left camera coordinate system when the calibration plate pictures were shot in step one; the difference between the two z coordinates is usually within 200 mm. For example, the first apple is placed so that the coordinates of its estimated centroid in the left camera coordinate system are (0 mm, 0 mm, 1000 mm), the point at half the length, width and height of the apple being taken as the estimated position of its centroid;

thirdly, the model inference core library IECore of OpenVINO is imported into Python and used to load the apple identification model obtained in step two, yielding the network structure and weight parameters of the deep learning neural network; the distortion-corrected first apple picture shot by the left camera is then passed to the Yolov4 target detection algorithm (see Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection [J]. arXiv preprint arXiv:2004.10934, 2020), which identifies the apple in the picture and generates a rectangular bounding box tangent to the outline of the first apple; the identified first apple picture is shown in Fig. 1;

In the fourth step, the SGBM algorithm is used to perform stereo matching between the corrected left-camera picture and right-camera picture. As a semi-global matching algorithm, SGBM gives a markedly better stereo matching result than local matching algorithms and can achieve the high precision required for accurate positioning of apples. Because SGBM with untuned parameters does not give the best result, its parameters must be adjusted to obtain a sufficiently good result; these are the parameters optimized in step three. Disparity is computed from the first apple pictures shot by the left camera and the right camera and distortion-corrected in the second step of step four, from which a three-dimensional depth map is obtained; the z axis of the three-dimensional depth map carries the depth information;
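How a disparity map becomes depth information can be sketched with the standard rectified-stereo relation z = f·B/d. The following is only an illustration: the focal length in pixels is hypothetical (the internal reference matrices are not reproduced in this text), the baseline is taken as the magnitude of the x component of the translation vector from step one, and the function name is an assumption. If the disparity comes from OpenCV's StereoSGBM, it is usually a fixed-point value scaled by 16 and would need to be divided by 16 first.

```python
import numpy as np


def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Convert a disparity map (pixels) to depth z (mm) with z = f * B / d.

    Pixels with non-positive disparity (failed matches) get depth = inf.
    """
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_mm / disparity_px[valid]
    return depth


# Hypothetical focal length in pixels; baseline from the translation vector of step one.
FOCAL_PX = 1500.0
BASELINE_MM = 129.2953

disparity = np.array([[193.9, 0.0], [-1.0, 96.97]])
depth_mm = disparity_to_depth(disparity, FOCAL_PX, BASELINE_MM)
```

Unmatched pixels are mapped to an infinite depth here, which is consistent with the description of holes whose z value is infinite and lets a later range check discard them.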

fifthly, cutting the first apple picture shot by the left camera after distortion correction in the second step along the rectangular boundary frame to obtain a color image in the rectangular boundary frame in the first apple picture, and extracting a red area in the color image to obtain a binary image after extraction of the red area;

as an embodiment of the present invention, a specific process of obtaining the binarized image is as follows:

step 501: cutting out a color image in the rectangular bounding box, and converting the color image into an HSV color space;

step 502: in the HSV color space of step 501, the H component (hue) is only weakly affected by illumination, so segmenting the red region of the color image by thresholding the H component is more reliable under the complex illumination conditions of an orchard. The H component of the red region has two ranges, 0-10 and 156-180; thresholds are set for each range, and screening yields two binary images;

step 503: the two binary images obtained in step 502 are superimposed to obtain a new binary image, white (with a pixel value of 255) in the new binary image corresponds to red in the color image cropped in step 501, black (with a pixel value of 0) in the binary image corresponds to other colors in the color image cropped in step 501, and a binary image obtained by extracting a red region from the rectangular bounding box of the first apple identified in fig. 1 is shown in fig. 3.
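Steps 502 and 503 can be sketched on the hue channel alone. This is a minimal illustration, assuming the H channel is already available as an array in the 0-180 convention used above; the function name is hypothetical, and practical implementations often threshold saturation and value as well, which the text does not specify.

```python
import numpy as np


def red_mask_from_hue(hue):
    """Return a 0/255 mask that is white where the hue lies in a red range.

    `hue` is the H channel in the 0-180 convention used in the text.
    """
    hue = np.asarray(hue)
    low_red = (hue >= 0) & (hue <= 10)       # first red range
    high_red = (hue >= 156) & (hue <= 180)   # second red range
    # Superimposing the two binary images is a pixel-wise OR.
    return np.where(low_red | high_red, 255, 0).astype(np.uint8)


hue_channel = np.array([[5, 60], [170, 120]], dtype=np.uint8)
mask = red_mask_from_hue(hue_channel)
```

In this toy input the pixels with hue 5 and 170 become white (255) and the others black (0), matching the convention of step 503.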

And sixthly, randomly and coarsely sampling three-dimensional points in a rectangular bounding box in the first apple three-dimensional depth picture by using a randint function in a random library of python to obtain a three-dimensional coarse sample point set, wherein the three-dimensional coarse sample point set is used for fitting the apple sphere after being screened.

The specific steps of the random rough sampling may be as follows:

step 601: the binary image obtained after extraction of the red area, shown in Fig. 3, is randomly sampled to obtain the x coordinate and y coordinate of a sampling point in the pixel coordinate system;

step 602: the pixel value of the binary image at the coordinates of the sampling point is examined. If the pixel value is 255, the point lies in the red area, and it is further judged whether the z-axis coordinate value of the corresponding pixel in the three-dimensional depth map obtained by stereo vision matching lies within a set interval; the set interval is determined by the working distance of apple recognition. For example, if the working distance of apple recognition is 500-2000 mm, it is judged whether the z-axis coordinate value of the corresponding pixel in the three-dimensional depth map lies within 500-2000 mm; if so, the three-dimensional coordinate point is stored as a three-dimensional coarse sample point $P_i$ for apple spherical fitting; otherwise the sample point is discarded; the next step is then executed;

step 603: return to step 601 to obtain new two-dimensional coordinates in the pixel coordinate system, then continue with steps 602 to 603 to obtain three-dimensional coarse sample points $P_i(x_i, y_i, z_i)$, and stop sampling once a sufficient number of coarse sample points has been collected, giving the three-dimensional coarse sample point set $\{P_i\}$ ($i = 1, \dots, n$). Considering the accuracy and efficiency of the algorithm, in this example sampling is stopped after 100 coarse sample points, giving the coarse sample point set $\{P_i\}$ ($i = 1, \dots, 100$). Coarse sampling in this way effectively avoids sampling three-dimensional points that correspond to non-red areas within the rectangular bounding box, as well as erroneous three-dimensional points (whose z coordinate is infinite) at the holes caused by mismatches in the SGBM algorithm.
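Steps 601 to 603 can be sketched with `random.randint` as follows. The data layout (nested lists for the mask and for the per-pixel 3D coordinates), the function name and the `max_tries` guard are assumptions made for illustration; the text does not say whether repeated pixels are rejected, so this sketch samples with replacement.

```python
import math
import random


def coarse_sample(mask, depth_xyz, n_samples=100, z_range=(500.0, 2000.0), max_tries=100000):
    """Randomly pick 3D points that are red in `mask` and have a plausible depth.

    mask      -- 2D list of 0/255 values (binary image of the red region)
    depth_xyz -- 2D list of (x, y, z) tuples in mm, same shape as `mask`
    """
    height, width = len(mask), len(mask[0])
    samples = []
    for _ in range(max_tries):             # guard against an endless loop (assumption)
        if len(samples) >= n_samples:
            break
        u = random.randint(0, width - 1)   # pixel x coordinate (bounds are inclusive)
        v = random.randint(0, height - 1)  # pixel y coordinate
        if mask[v][u] != 255:
            continue                       # not a red pixel
        x, y, z = depth_xyz[v][u]
        if z_range[0] <= z <= z_range[1]:  # also rejects inf from failed matches
            samples.append((x, y, z))
    return samples


# Tiny synthetic example: a 4x4 crop whose left half is red; one red pixel is a hole.
mask = [[255, 255, 0, 0] for _ in range(4)]
depth = [[(float(u), float(v), 1000.0) for u in range(4)] for v in range(4)]
depth[0][0] = (0.0, 0.0, math.inf)

random.seed(0)
points = coarse_sample(mask, depth, n_samples=20)
```

Every returned point comes from a white mask pixel with a depth inside the working range, so neither the non-red half of the crop nor the hole can enter the coarse sample set.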

Seventhly, fine screening is carried out on the coarse sample point set obtained in the sixth step:

Fine screening of the coarse sample point set removes, by means of an algorithm, the three-dimensional points of red background areas that appear at the four corners of the cropped rectangular picture. After color segmentation such red areas also become white areas in the binary image, so the coarse sample point set may contain erroneous sample points. Compared with their x-axis and y-axis coordinate values, the z-axis coordinate values of erroneous sample points are strong outliers relative to those of correct sample points, so the erroneous sample points can be removed effectively by screening on the z-axis coordinate value. The screening is therefore performed on the z-axis coordinate value, with the following specific steps:

step 701: calculating the mean and variance of the coordinate values of the z axis of the coarse sample points in the coarse sample point set:

$\mu_z$ - mean of the z-axis coordinate values of the coarse sample points in the coarse sample point set;

$\sigma_z$ - variance of the z-axis coordinate values of the coarse sample points in the coarse sample point set;

$z_i$ - z coordinate value of the i-th coarse sample point in the coarse sample point set;

$n$ - number of coarse sample points in the coarse sample point set.

Step 702: for each coarse sample point $P_i(x_i, y_i, z_i)$ in the coarse sample point set $\{P_i\}$ ($i = 1, \dots, n$) (for example $\{P_i\}$ ($i = 1, \dots, 100$)), a screening interval $[z_i - \sigma_z, z_i + \sigma_z]$ is set, and it is judged whether the z coordinate value of each coarse sample point in the coarse sample point set falls within the screening interval of $P_i(x_i, y_i, z_i)$; the total number of sample points falling within the interval is score[i], the score of the i-th point;
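The fine screening of the seventh step, including the selection of the highest-scoring interval, can be sketched as follows. One assumption is made loudly here: $\sigma_z$ is computed as the population standard deviation, because the text reports $\sigma_z$ in millimetres and uses it as an interval half-width; the function name is hypothetical.

```python
import statistics


def fine_screen(points):
    """Keep the points whose z lies in the best-supported window [z_i - s, z_i + s].

    points -- list of (x, y, z) coarse sample points in mm
    """
    z_values = [p[2] for p in points]
    sigma_z = statistics.pstdev(z_values)  # assumption: sigma_z is the population std dev

    best_score, best_center = -1, None
    for _, _, z_i in points:
        # score[i]: number of points whose z falls inside the window around z_i
        score = sum(1 for z in z_values if z_i - sigma_z <= z <= z_i + sigma_z)
        if score > best_score:
            best_score, best_center = score, z_i

    # The point set selected by the highest-scoring window is the fine sample set.
    return [p for p in points if best_center - sigma_z <= p[2] <= best_center + sigma_z]


# Synthetic example: eight points near z = 1000 mm plus two background outliers.
coarse = [(0.0, 0.0, 1000.0 + dz) for dz in (-4, -3, -2, -1, 0, 1, 2, 3)]
coarse += [(5.0, 5.0, 1400.0), (6.0, 6.0, 1450.0)]
fine = fine_screen(coarse)
```

In the synthetic example the two background points inflate the spread of z, but the window centred on any apple point still captures all eight apple points and neither outlier.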

eighth step, the RANSAC algorithm (see Choi S, Kim T, Yu W. Performance evaluation of RANSAC family [J]. Journal of Computer Vision, 1997, 24(3): 271-300) is used to further screen $I_k$ to obtain the optimal inner points for spherical fitting of the apple;

as an embodiment of the present invention, the specific steps of acquiring the optimal interior point by using the RANSAC algorithm are as follows:

step 801: 4 sample points are randomly selected from $I_k$ obtained in the seventh step; because the model to be solved has 4 unknown parameters, the minimum number of sample points needed to initialize the model is 4. The spherical surface of the first apple is fitted using the matrix form of the least square method, giving the coordinates $(a, b, c)$ of the centroid of the first apple and the radius $r$ of the first apple;

step 802: $I_k$ is traversed and the shortest distance from each sample point to the spherical surface is calculated, giving $d_j$ ($j = 1, \dots, N$);

step 803: if $d_j \le 10$ mm, the corresponding sample point $P_j$ is marked as an inner point and stored in the sample consistent set $S$;

step 804: let $l$ be the number of inner points in the sample consistent set $S$ obtained in step 803; if $l > 0.9N$, the sample consistent set $S$ is accepted and its inner points $\{T_i\}$ ($i = 1, \dots, l$) form the best-fitting inner point set; otherwise, return to step 801, randomly select 4 sample points again, and continue iterating until a sample consistent set $S$ is obtained.
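Steps 801 to 804 can be sketched with NumPy as follows. The function names, the iteration cap `max_iter` and the use of `np.linalg.lstsq` for the 4-point fit are assumptions of this illustration; the 10 mm distance threshold and the 0.9N acceptance ratio come from the text.

```python
import numpy as np


def fit_sphere(points):
    """Least-squares sphere through >= 4 points; returns (center, radius)."""
    pts = np.asarray(points, dtype=np.float64)
    A = np.hstack([-2.0 * pts, np.ones((len(pts), 1))])
    e = -np.sum(pts ** 2, axis=1)
    beta, *_ = np.linalg.lstsq(A, e, rcond=None)
    center, d = beta[:3], beta[3]
    r_sq = center @ center - d             # r^2 = a^2 + b^2 + c^2 - d
    return center, np.sqrt(max(r_sq, 0.0))


def ransac_sphere(points, dist_mm=10.0, ratio=0.9, max_iter=1000, seed=None):
    """Return the inner point set S, or None if no model reaches the ratio."""
    pts = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    for _ in range(max_iter):              # iteration cap is an assumption
        subset = pts[rng.choice(len(pts), size=4, replace=False)]
        center, radius = fit_sphere(subset)
        # Shortest distance from each point to the sphere surface.
        d = np.abs(np.linalg.norm(pts - center, axis=1) - radius)
        inliers = pts[d <= dist_mm]
        if len(inliers) > ratio * len(pts):
            return inliers
    return None


# Synthetic data: 38 points on the camera-facing half of a 40 mm sphere centred
# at (0, 0, 1000) mm, plus two far-away background points.
gen = np.random.default_rng(0)
theta = gen.uniform(0.0, np.pi / 2, 38)
phi = gen.uniform(0.0, 2 * np.pi, 38)
surface = np.column_stack([
    40 * np.sin(theta) * np.cos(phi),
    40 * np.sin(theta) * np.sin(phi),
    1000 - 40 * np.cos(theta),
])
background = np.array([[0.0, 0.0, 1200.0], [10.0, 10.0, 1300.0]])
inner_points = ransac_sphere(np.vstack([surface, background]), seed=1)
```

The loop accepts the first model whose consensus set exceeds 90% of the points, as in step 804, rather than searching for the largest consensus set as some RANSAC variants do.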

And ninthly, considering precision and efficiency, randomly selecting 8 samples from the optimal inner point set, and performing first apple spherical fitting by using a matrix form of a least square method to obtain a coordinate of a centroid of the first apple in a left camera coordinate system and a radius of the first apple spherical model.

The specific principle of performing spherical fitting in a matrix form by using a least square method is as follows:

assuming that the first apple model is a sphere, its equation is:

$$(x-a)^2 + (y-b)^2 + (z-c)^2 = r^2$$

$(x, y, z)$ - coordinates of a point on the sphere;

$(a, b, c)$ - coordinates of the centroid of the first apple in the left camera coordinate system;

$r$ - radius of the first apple spherical model;

expanding gives:

$$-2xa - 2yb - 2zc + (a^2 + b^2 + c^2 - r^2) = -x^2 - y^2 - z^2$$

converting to matrix form, let:

$$A_1 = -2x,\quad A_2 = -2y,\quad A_3 = -2z,\quad A_4 = 1,\quad d = a^2 + b^2 + c^2 - r^2,\quad e = -x^2 - y^2 - z^2$$

when 8 samples $(x_i, y_i, z_i)$ ($i = 1, \dots, 8$) are used for spherical fitting, this becomes the form $A\beta = e$, where:

$$A = \begin{bmatrix} -2x_1 & -2y_1 & -2z_1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ -2x_8 & -2y_8 & -2z_8 & 1 \end{bmatrix},\quad \beta = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix},\quad e = \begin{bmatrix} -x_1^2 - y_1^2 - z_1^2 \\ \vdots \\ -x_8^2 - y_8^2 - z_8^2 \end{bmatrix}$$

left-multiplying both sides by $A^T$ gives:

$$A^T A \beta = A^T e$$

finally, left-multiplying both sides by $(A^T A)^{-1}$ gives:

$$\beta = (A^T A)^{-1} (A^T e)$$
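The matrix form of the least square method can be checked numerically with a short sketch. The function name and the synthetic sample points are assumptions for illustration; the normal equations are solved with `np.linalg.solve` instead of forming the explicit inverse, which gives the same $\beta$.

```python
import numpy as np


def fit_sphere_normal_equations(samples):
    """Solve A beta = e in the least-squares sense via beta = (A^T A)^-1 A^T e."""
    pts = np.asarray(samples, dtype=np.float64)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    A = np.column_stack([-2 * x, -2 * y, -2 * z, np.ones(len(pts))])
    e = -(x ** 2 + y ** 2 + z ** 2)
    # Solving the normal equations avoids computing the inverse explicitly.
    a, b, c, d = np.linalg.solve(A.T @ A, A.T @ e)
    radius = np.sqrt(a ** 2 + b ** 2 + c ** 2 - d)  # from d = a^2 + b^2 + c^2 - r^2
    return (a, b, c), radius


# Eight synthetic points on a sphere of radius 40 mm centred at (0, 0, 1000) mm.
centre_true, r_true = np.array([0.0, 0.0, 1000.0]), 40.0
directions = np.array([
    [1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0],
    [0, 0, -1], [1, 1, -1], [-1, 1, -1], [1, -1, -1],
], dtype=np.float64)
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
samples = centre_true + r_true * directions
centre, radius = fit_sphere_normal_equations(samples)
```

Because all sample points are roughly 1000 mm from the camera, the normal equations are poorly conditioned; a solver such as `np.linalg.lstsq` applied directly to $A$ is a numerically gentler alternative when the samples are noisy.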

According to the example of step 603, the mean and variance of the z coordinates of the sample points of the first apple calculated in the seventh step are $\mu_z = 964.0626$ mm and $\sigma_z = 10.8990$ mm, and 68 fine sample points remain after screening. The optimal inner point set of the first apple obtained in the eighth step contains 66 sample points. Apple spherical fitting is then performed in the ninth step: with the estimated centroid of the first apple placed at (0 mm, 0 mm, 1000 mm) in the left camera coordinate system, the calculated centroid coordinates of the first apple are (2.3428 mm, -5.7377 mm, 995.1694 mm) and the radius is r = 44.4725 mm. This positioning precision meets the requirement of apple picking, and accurate positioning of the apple centroid is achieved.

Verification of the examples:

The first apple in the first step is replaced with an apple partly occluded by leaves, recorded as the second apple, and the second to ninth steps of the fourth step are repeated. The picture of the second apple identified in the third step is shown in fig. 2, and the binary image obtained in the fifth step is shown in fig. 4. The mean and variance of the sample points of the second apple in the seventh step are μ_z = 977.8539 mm and σ_z = 19.0748 mm. The finally obtained centroid coordinates of the second apple are (6.48 mm, -1.14 mm, 997.9085 mm) and the radius is r = 44.3907 mm, so accurate positioning is achieved under leaf occlusion and the accuracy requirement of apple picking is met.

Claims

1. A binocular vision apple identification and accurate positioning method is characterized by comprising the following steps:

the second step, importing a plurality of pairs of collected calibration board pictures into python, and performing monocular calibration on the left camera and the right camera and binocular calibration on the binocular camera by using the Zhang Zhengyou calibration method in the OpenCV library of python, to obtain the internal reference matrices and distortion vectors of the left camera and the right camera and the rotation matrix and translation vector between the left camera and the right camera;

step two, establishing a model for apple identification:

the sixth step, performing random coarse sampling of the three-dimensional points within the rectangular bounding box of the three-dimensional depth picture of the first apple by using the randint function in the random library of python, to obtain a three-dimensional coarse sample point set, which is subsequently screened and used for fitting the apple sphere;

the seventh step, finely screening the coarse sample point set obtained in the sixth step, with the following specific steps:

step 702: for each coarse sample point P_i(x_i, y_i, z_i) in the coarse sample point set {P_i}, i = 1…n, setting a screening interval [z_i - σ_z, z_i + σ_z], judging whether the z coordinate value of each coarse sample point in the coarse sample point set falls within the screening interval of the coarse sample point P_i(x_i, y_i, z_i), and recording the total number of sample points falling within the interval as score[i], score[i] being the score of the i-th point; wherein σ_z is the variance of the z-axis coordinate values of the coarse sample points in the coarse sample point set, and z_i is the z coordinate value of the i-th coarse sample point in the coarse sample point set;

step 703: comparing the scores of all the coarse sample points, wherein the point set selected by the screening interval of the coarse sample point with the highest score is the new sample point set {P_j}, j = 1…N, obtained by fine screening and denoted I_k, N being the number of new sample points in I_k;
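By way of illustration only, the scoring of steps 702 and 703 can be sketched in Python as follows. The function name `fine_screen` and the sample data are hypothetical. The text calls σ_z a variance, but because it is added to a z coordinate in millimetres this sketch assumes the population standard deviation is meant.

```python
import statistics


def fine_screen(coarse_points):
    """Return the points inside the best-scoring interval [z_i - sigma_z, z_i + sigma_z]."""
    z_values = [p[2] for p in coarse_points]
    # Assumption: sigma_z is the population standard deviation of the z values.
    sigma_z = statistics.pstdev(z_values)
    best_set = []
    for z_i in z_values:
        # score[i] is the number of points whose z lies in the interval centred on z_i.
        inside = [p for p in coarse_points if z_i - sigma_z <= p[2] <= z_i + sigma_z]
        if len(inside) > len(best_set):
            best_set = inside
    return best_set


# Hypothetical data: five points near z = 1000 mm (apple) and two near 700 mm (leaf).
coarse = [(0, 0, 1000), (1, 0, 1001), (0, 1, 999), (1, 1, 1002),
          (2, 0, 998), (5, 5, 700), (6, 5, 705)]
fine = fine_screen(coarse)
```

In this example the interval centred on an apple point captures the five apple points, while an interval centred on a leaf point captures only two, so the leaf points are screened out.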

2. The binocular vision apple identification and accurate positioning method according to claim 1, wherein the specific process of optimizing the parameters in the SGBM stereo vision matching algorithm comprises:

step 301: reading each pair of calibration board pictures shot in the step one;

step 302: using the internal parameter matrices and distortion vectors of the left camera and the right camera obtained in step one, calculating the mapping matrices required for image correction through the API provided by OpenCV, and removing the distortion in each pair of calibration board pictures with the mapping matrices to obtain corrected, undistorted calibration board pictures;

step 304: repeatedly changing the parameters of the SGBM algorithm and observing each resulting disparity map, and recording the parameters of the SGBM algorithm at the point when all disparity maps are clear.

3. The binocular vision apple identification and accurate positioning method according to claim 1 or 2, wherein the specific process of obtaining the binary image is as follows:

step 502: in the HSV color space described in step 501, setting thresholds on the H component to segment the red region in the color image, wherein the H component corresponding to the red region has two ranges, 0-10 and 156-180, so that thresholding yields two binary images;

step 503: superposing the two binary images obtained in step 502 to obtain a new binary image, wherein white in the new binary image corresponds to red in the color image cut out in step 501, and black in the new binary image corresponds to the other colors in the color image cut out in step 501.
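For illustration, steps 502 and 503 can be sketched with NumPy alone. The function name `red_mask` and the 2x2 test image are hypothetical, and the image is assumed to be already in OpenCV's 8-bit HSV layout with H in the range 0 to 180. In an OpenCV pipeline the same result would typically be obtained with `cv2.inRange` on each range followed by `cv2.bitwise_or`.

```python
import numpy as np


def red_mask(hsv_image):
    """Binary mask (255 = red, 0 = other) from the H channel of an HSV image."""
    h = np.asarray(hsv_image)[..., 0]
    # Red wraps around the hue axis, hence the two ranges 0-10 and 156-180.
    low_red = (h >= 0) & (h <= 10)
    high_red = (h >= 156) & (h <= 180)
    # Superposing the two binary images is a pixel-wise OR.
    return np.where(low_red | high_red, 255, 0).astype(np.uint8)


# Hypothetical 2x2 HSV image: red, green / red (high hue), blue-ish.
hsv = np.array([[[5, 200, 200], [60, 200, 200]],
                [[170, 200, 200], [100, 50, 50]]], dtype=np.uint8)
mask = red_mask(hsv)
```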

4. The binocular vision apple identification and accurate positioning method of claim 3, wherein: the random rough sampling comprises the following specific steps:

step 601: randomly sampling from a binary image obtained after extraction of the red area to obtain an x coordinate and a y coordinate of a sampling point in a pixel coordinate system;

step 602: judging the pixel value of the binary image at the coordinates of the sampling point; if the pixel value is 255, the point lies in the red area, and it is further judged whether the z-axis coordinate value of the corresponding pixel point on the three-dimensional depth image obtained by stereoscopic vision matching lies within the set interval; if it does, the three-dimensional coordinate point is stored as a three-dimensional coarse sample point P_i for the spherical fitting of the apple, otherwise the sample point is discarded; the next step is then executed;

step 603: returning to step 601 to obtain new two-dimensional coordinates in the pixel coordinate system, and then executing steps 602 to 603 again to obtain further three-dimensional coarse sample points P_i(x_i, y_i, z_i), and stopping sampling once a sufficient number of coarse sample points has been collected, to obtain the three-dimensional coarse sample point set {P_i}, i = 1…n.
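A minimal sketch of the sampling loop of steps 601 to 603 is given below for illustration. The function name `coarse_sample`, its parameters, the `max_tries` safety limit and the tiny test arrays are all hypothetical. `mask` is assumed to be the binary image and `depth_xyz` an array of (x, y, z) points of the same height and width; points are drawn with replacement, since the text does not require uniqueness.

```python
import random


def coarse_sample(mask, depth_xyz, z_range, n_samples, max_tries=100000):
    """Randomly collect 3D points that are red in `mask` and whose z lies in `z_range`."""
    height, width = len(mask), len(mask[0])
    z_min, z_max = z_range
    samples = []
    tries = 0
    while len(samples) < n_samples and tries < max_tries:
        tries += 1
        # Step 601: random pixel coordinates (randint includes both end points).
        u = random.randint(0, width - 1)
        v = random.randint(0, height - 1)
        # Step 602: keep the point only if the pixel is red and z is in the set interval.
        if mask[v][u] != 255:
            continue
        x, y, z = depth_xyz[v][u]
        if z_min <= z <= z_max:
            samples.append((x, y, z))
    # Step 603: the loop stops once enough coarse sample points are collected.
    return samples


# Hypothetical 2x2 inputs: one non-red pixel and one red pixel with an implausible depth.
mask = [[255, 0], [255, 255]]
depth_xyz = [[(0, 0, 1000), (1, 0, 1000)], [(0, 1, 5000), (1, 1, 990)]]
random.seed(0)
coarse = coarse_sample(mask, depth_xyz, z_range=(500, 1500), n_samples=10)
```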

5. The binocular vision apple identification and accurate positioning method of claim 4, wherein: the method for acquiring the optimal interior point by adopting the RANSAC algorithm comprises the following specific steps:

step 801: i obtained from the seventh step of the fourth step _k Randomly selecting 4 sample points, and fitting the spherical surface of the first apple by using a matrix form of a least square method to obtain the centroid coordinates (a, b and c) of the first apple and the radius r of the first apple;

step 802: traverse I _k Calculating the nearest distance from each sample point to the spherical surface to obtain d _j ,j＝1…N；

Step 803: make a judgment if d _j Less than or equal to 10 (mm), the corresponding sample point P is determined _j As an interior point, storing in a sample consistent set S;

step 804: the number of interior points in the sample consensus set S obtained in step 803 is l; if l>0.9N, a sample consensus set S is obtained, the interior points of which { T } _i I =1 \ 8230j, i is the best fitting internal point set, otherwise, the step 801 is returned to, 4 sample points are randomly selected again to continue iteration until a sample consistent set S is obtained;

the ninth step, randomly selecting 8 samples from the optimal interior point set, and performing the first apple spherical fitting by using the matrix form of the least squares method.
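Steps 801 to 804 and the ninth step can be illustrated with the following Python sketch, assuming NumPy. The names `fit_sphere` and `ransac_sphere`, the `max_iter` limit, the seeds and the synthetic data are hypothetical. The fit uses `np.linalg.lstsq`, which gives the same least-squares solution as the normal equations but tolerates a degenerate 4-point sample.

```python
import numpy as np


def fit_sphere(pts):
    """Least-squares sphere through pts; returns (centre, radius)."""
    A = np.hstack([-2.0 * pts, np.ones((len(pts), 1))])
    e = -np.sum(pts ** 2, axis=1)
    a, b, c, d = np.linalg.lstsq(A, e, rcond=None)[0]
    # Guard against a negative r^2 from a degenerate sample.
    return np.array([a, b, c]), np.sqrt(max(a * a + b * b + c * c - d, 0.0))


def ransac_sphere(points, threshold=10.0, ratio=0.9, max_iter=1000, seed=None):
    """Return the interior point set once more than `ratio` of the points fit."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    for _ in range(max_iter):
        # Step 801: fit a sphere to 4 randomly chosen sample points.
        subset = pts[rng.choice(len(pts), size=4, replace=False)]
        center, radius = fit_sphere(subset)
        # Step 802: shortest distance from every point to the sphere surface.
        dist = np.abs(np.linalg.norm(pts - center, axis=1) - radius)
        # Steps 803-804: accept when the interior points exceed 0.9 N.
        inliers = pts[dist <= threshold]
        if len(inliers) > ratio * len(pts):
            return inliers
    return None  # assumption: give up after max_iter iterations


# Hypothetical data: 60 points on a sphere (r = 45 mm) plus 2 outliers.
rng = np.random.default_rng(1)
dirs = rng.normal(size=(60, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
surface = np.array([0.0, 0.0, 1000.0]) + 45.0 * dirs
outliers = np.array([[0.0, 0.0, 900.0], [30.0, 30.0, 1100.0]])
inliers = ransac_sphere(np.vstack([surface, outliers]), seed=0)

# Ninth step: refit on 8 randomly selected interior points.
chosen = inliers[np.random.default_rng(0).choice(len(inliers), size=8, replace=False)]
center, radius = fit_sphere(chosen)
```

The 10 mm threshold and the 0.9 N acceptance ratio are the values stated in steps 803 and 804; everything else in the sketch is an assumption chosen for the demonstration.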