CN114049557A - Garbage sorting robot visual identification method based on deep learning - Google Patents

Garbage sorting robot visual identification method based on deep learning Download PDF

Info

Publication number
CN114049557A
CN114049557A (application CN202111323743.9A)
Authority
CN
China
Prior art keywords
target
camera
data
values
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111323743.9A
Other languages
Chinese (zh)
Inventor
严圣军
刘德峰
梅文豪
倪玮玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhiying Robot Technology Co ltd
Jiangsu Tianying Environmental Protection Energy Equipment Co Ltd
China Tianying Inc
Original Assignee
Shanghai Zhiying Robot Technology Co ltd
Jiangsu Tianying Environmental Protection Energy Equipment Co Ltd
China Tianying Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhiying Robot Technology Co ltd, Jiangsu Tianying Environmental Protection Energy Equipment Co Ltd, China Tianying Inc filed Critical Shanghai Zhiying Robot Technology Co ltd
Priority to CN202111323743.9A
Publication of CN114049557A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85: Stereo camera calibration

Abstract

The invention discloses a garbage sorting robot visual identification method based on deep learning. A 2D area-array camera and a 3D line-scan camera are synchronously calibrated in the same world coordinate system; a YOLOv4 target detection model is established; multithreading is started so that the encoder pulse value is acquired in real time while the 2D camera and the 3D camera acquire image data simultaneously; the current contour image data are read from memory to obtain the actual height and width information of the target object, which is sent, together with the rectangular frame center coordinates, the rectangular frame width, the rectangular frame height and the target type obtained by the YOLOv4 target detection model, to the garbage sorting robot, so that the robot performs real-time online grabbing and classification. Because the images of the 2D camera and the 3D camera are acquired synchronously, the encoder pulse value associates the target image detected by the YOLOv4 detection model with the 3D scan data, from which the target height and width information is obtained.

Description

Garbage sorting robot visual identification method based on deep learning
Technical Field
The invention relates to a visual identification method, in particular to a visual identification method of a garbage sorting robot based on deep learning, and belongs to the field of visual identification of garbage sorting robots.
Background
Robot picking of materials relies mainly on visual identification from front-end images. For flat objects, the target material can be captured with a 2D camera, but three-dimensional materials, whose shapes and sizes vary, must be handled by a 3D camera. In traditional 3D setups, object height information is obtained by installing a laser displacement sensor at a fixed position. Although a laser displacement sensor has excellent linearity and high accuracy, there are many sources of instability while the object is moving during data transmission, which often makes the robot's grabbing inefficient and can even cause empty grabs or collisions. On this basis, a 2D camera and a 3D camera are generally fused for recognition, but the two cameras must see the same field of view as the object passes them; if the two cameras are not aligned, the robot cannot be positioned accurately. Existing technology cannot fuse the 2D camera and the 3D camera well enough to achieve complete synchronization of their data.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a garbage sorting robot visual identification method based on deep learning that achieves synchronous fusion of the 2D camera data and the 3D camera data.
In order to solve the above technical problem, the technical solution adopted by the invention is as follows:
a garbage sorting robot visual identification method based on deep learning is characterized by comprising the following steps:
step one: the 2D area-array camera and the 3D line-scan camera are synchronously calibrated in the same world coordinate system;
step two: establishing a YOLOv4 target detection model;
step three: starting multithreading, acquiring the encoder pulse value in real time while the 2D camera and the 3D camera acquire image data simultaneously; the 2D camera acquires the RGB image to be identified and detected, and the 3D camera acquires the contour and height information of the target image;
step four: inputting the RGB image into a YOLO v4 target detection model to obtain a rectangular frame center coordinate, a rectangular frame width, a rectangular frame height and a target type;
step five: the 3D camera circularly acquires single-contour images, and data are sequentially stored in a memory with a specified size, wherein the memory occupation size of the single-contour images is determined by the pulse number of the encoder;
step six: reading the current contour image data from the memory to obtain the actual height and width information of the target object, and sending them, together with the rectangular frame center coordinates, the rectangular frame width, the rectangular frame height and the target type obtained by the YOLOv4 target detection model, to the garbage sorting robot, so that the robot performs real-time online grabbing and classification.
Further, the calibration method of the 2D area-array camera in the first step includes:
capturing images of a chessboard calibration board at different positions and rotation angles with the camera;
performing intrinsic calibration with Matlab to obtain the intrinsic parameter matrix and the distortion coefficients;
determining a world coordinate system on the chessboard calibration board, determining, by a 4-point PnP calibration method, the pixel coordinates of 4 corner points on the calibration board and the world coordinates corresponding to these 4 corner points in the determined world coordinate system, and performing extrinsic calibration on the 4 corner points with the solvePnP operator to obtain the extrinsic rotation matrix and translation matrix, thereby completing the extrinsic calibration.
Further, the calculation formula for converting the pixel coordinates to the world coordinates is as follows:
s\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & u_0\\ 0 & f_y & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}R & T\end{bmatrix}\begin{bmatrix}x_w\\ y_w\\ z_w\\ 1\end{bmatrix}
wherein: u and v are respectively the pixel abscissa and the pixel ordinate in the pixel coordinate system; x_w, y_w and z_w are respectively the abscissa, the ordinate and the vertical coordinate in the world coordinate system; R is the rotation matrix; T is the translation matrix; u_0, v_0, f_x and f_y are the camera intrinsic parameters, i.e. u_0 and v_0 are respectively the image-center abscissa and ordinate, and f_x and f_y are respectively the horizontal and vertical equivalent focal lengths; s is a scale factor, namely the depth of the point along the optical axis in the camera coordinate system.
Further, the calibration method of the 3D line-scan camera comprises:
projecting the 3D laser onto the calibration board so that the laser line is parallel to the x axis of the fixed world coordinate system;
turning off the laser, increasing the exposure time, and capturing an image with clearly visible corner points;
after the image has been captured, turning the laser back on and advancing the conveyor belt a certain distance so that the laser line falls on the other checkerboard and remains parallel to the x axis;
turning off the laser, increasing the exposure time, and capturing another image with clearly visible corner points;
finally performing the calibration; the calibration of the 3D camera is complete after the data have been saved.
Further, the second step is specifically: RGB images are collected with the 2D camera, the acquired images are annotated by labeling personnel, and model training is carried out with the YOLOv4 target detection model to generate the final YOLOv4 target detection model.
Further, the fourth step is specifically:
inputting the collected RGB image into the YOLOv4 target detection model with an input size of 608×608 to obtain a list of Bounding Boxes for all positions in the image that contain target material, and filtering them with the non-maximum suppression (NMS) algorithm to obtain the coordinate position information of the target garbage points that are finally retained;
the non-maximum suppression (NMS) algorithm is as follows:
S_i=\begin{cases}S_i, & \mathrm{IoU}(M,b_i)<N_t\\ 0, & \mathrm{IoU}(M,b_i)\ge N_t\end{cases}
wherein S_i represents the score of each box, M represents the box with the highest current score, b_i represents one of the remaining boxes, N_t is the set NMS threshold, and IoU is the ratio of the overlap area to the union area of the two detection boxes;
when the YOLOv4 target detection model detects a target, the current encoder pulse value is acquired in real time; this pulse value corresponds to the center coordinate of the target material; the height of the rectangular frame detected by the YOLOv4 target detection model is taken along the moving direction of the conveyor belt, and the current encoder pulse value, the detected target center coordinates, and the width and height information of the rectangular frame are sent together to the data analysis processing module.
Further, the fifth step is specifically:
the data analysis processing module calculates the initial position of the target material in the 3D storage memory, and the specific calculation formula is as follows:
start position = ((A - B) / 5) × 1216 × 3 × 4 (bytes)
wherein A represents the current encoder pulse value; b represents an initial starting pulse value; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x, y and z; 4 represents that the x, y and z values of each scanning point respectively occupy 4 bytes;
circularly acquiring single-contour images of the 3D line-scan camera and sequentially storing the data in a memory block of specified size, wherein the number of bytes stored is calculated by the following formula:
stored byte size = (target extent along the belt direction in mm ÷ 1.6 mm) × 1216 × 3 × 4 (bytes)
wherein 1.6mm represents the world coordinate distance between the two scanned contours; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x/y/z; 4 indicates that the x/y/z values of each scanning point each occupy 4 bytes;
and finally, the final end position of the target material in the 3D storage memory is obtained from its start position in the memory and its stored byte size, according to the following formula:
final end position = memory start position + stored byte size
the corresponding target 3D scan memory data are then read in the data analysis processing module, which completes the reading of the x, y and z values of the target material in the three-dimensional world from the 3D storage memory data.
Further, the world-coordinate distance of 1.6 mm between two adjacent contours is derived as follows:
the 3D camera scans one contour every 5 pulses, and one contour has 1216 points; each point contains x/y/z values, and each value occupies 4 bytes. The encoder outputs 1000 pulses per revolution, and the belt travels 320 mm per revolution, i.e. one pulse corresponds to 0.32 mm; since one contour is scanned every 5 pulses, 5 × 0.32 mm = 1.6 mm, so the distance between two adjacent contours corresponds to 1.6 mm in world coordinates.
Further, when 50000 contours have been acquired, the 3D data are stored again starting from the initial position of the memory space, the previously stored data are overwritten, and storage then continues sequentially, so that the data are stored in an endless loop; when the program stops running, the allocated memory space is released, preventing memory overflow or leakage.
Further, the sixth step is specifically:
the 3D memory data are read, the x and y values are used as the row and column pixel coordinates of a Mat in OpenCV, the z values are normalized to the range 0 to 255, and the normalized values are used as the gray values at the corresponding x, y pixel coordinates in the Mat; for the height, the gray value in the Mat corresponding to the pixel coordinates of the target center is extracted and de-normalized to obtain the actual height information of the target object;
the Mat image is then processed with OpenCV algorithms such as threshold segmentation, opening and closing operations and minimum-contour processing to extract the width information of the target object;
finally, the center coordinate position and the target type of the target object obtained from the 2D data, together with the target height and width information obtained from the 3D data, are sent to the robot as a whole, so that the robot can perform real-time online grabbing and classification.
Compared with the prior art, the invention has the following advantages and effects:
1. The method acquires the images of the 2D camera and the 3D camera synchronously, and the encoder pulse value associates the target image detected by the YOLOv4 detection model with the 3D scan data, from which the target height and width information is solved, so the detection speed is fast and the recognition accuracy is high;
2. A cyclic 3D memory data storage scheme is formed in the data reading process, so that the memory does not overflow and the data in the corresponding memory space can be read in real time by combining the memory space with the encoder;
3. The deep-learning YOLO target detection model is applied to the fusion of the 2D camera and the 3D camera, and OpenCV is used to process the memory data, so the robot's real-time grabbing is more accurate and more robust.
Drawings
Fig. 1 is a flowchart of a visual recognition method of a garbage sorting robot based on deep learning according to the present invention.
Fig. 2 is a schematic flow chart of the camera synchronization calibration of the present invention.
Fig. 3 is a schematic diagram of a calibration board before calibration of a 3D camera according to the present invention.
Fig. 4 is a schematic diagram of the calibration board after calibration of the 3D camera according to the present invention.
Fig. 5 is a flowchart of the OpenCV data processing and of sending the processing result to the robot to implement grabbing according to the present invention.
Fig. 6 is a schematic diagram of the target width AB obtained by OpenCV processing of the memory data corresponding to the target according to the present invention.
Detailed Description
To elaborate the technical solutions adopted by the present invention to achieve the intended technical objects, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, rather than all, of the embodiments of the present invention, and technical means or technical features in the embodiments may be replaced without creative effort. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
As shown in fig. 1, the visual recognition method of the garbage sorting robot based on deep learning of the present invention includes the following steps:
the method comprises the following steps: and the 2D area-array camera and the 3D line-array camera complete synchronous calibration under the same world coordinate system.
As shown in fig. 2, the calibration method of the 2D area-array camera includes:
capturing images of a chessboard calibration board at different positions and rotation angles with the camera;
performing intrinsic calibration with Matlab to obtain the intrinsic parameter matrix and the distortion coefficients;
determining a world coordinate system on the chessboard calibration board, determining, by a 4-point PnP calibration method, the pixel coordinates of 4 corner points on the calibration board and the world coordinates corresponding to these 4 corner points in the determined world coordinate system, and performing extrinsic calibration on the 4 corner points with the solvePnP operator to obtain the extrinsic rotation matrix and translation matrix, thereby completing the extrinsic calibration.
Further, the calculation formula for converting the pixel coordinates to the world coordinates is as follows:
s\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & u_0\\ 0 & f_y & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}R & T\end{bmatrix}\begin{bmatrix}x_w\\ y_w\\ z_w\\ 1\end{bmatrix}
wherein: u and v are respectively the pixel abscissa and the pixel ordinate in the pixel coordinate system; x_w, y_w and z_w are respectively the abscissa, the ordinate and the vertical coordinate in the world coordinate system; R is the rotation matrix; T is the translation matrix; u_0, v_0, f_x and f_y are the camera intrinsic parameters, i.e. u_0 and v_0 are respectively the image-center abscissa and ordinate, and f_x and f_y are respectively the horizontal and vertical equivalent focal lengths; s is a scale factor, namely the depth of the point along the optical axis in the camera coordinate system.
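As an illustrative sketch only, the extrinsic calibration with the solvePnP operator and the subsequent pixel-to-world conversion could be implemented with OpenCV roughly as follows; the intrinsic values, the four corner correspondences and the assumption that targets lie on the belt plane z_w = 0 are placeholders chosen for the example, not values from the patent.

```cpp
// Sketch: extrinsic calibration from 4 corner points and pixel-to-world conversion
// on the belt plane. All numeric values below are placeholders.
#include <cstdio>
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Intrinsic matrix and distortion coefficients from the intrinsic calibration.
    cv::Mat K = (cv::Mat_<double>(3, 3) << 1200.0, 0.0, 640.0,
                                              0.0, 1200.0, 512.0,
                                              0.0,    0.0,   1.0);
    cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

    // Four chessboard corners: pixel coordinates and world coordinates (mm, z_w = 0).
    std::vector<cv::Point2f> pixelPts = {{321.f, 240.f}, {998.f, 243.f},
                                         {995.f, 760.f}, {318.f, 757.f}};
    std::vector<cv::Point3f> worldPts = {{0.f, 0.f, 0.f}, {200.f, 0.f, 0.f},
                                         {200.f, 150.f, 0.f}, {0.f, 150.f, 0.f}};

    // Extrinsic calibration with solvePnP: rotation and translation of the camera.
    cv::Mat rvec, tvec;
    cv::solvePnP(worldPts, pixelPts, K, dist, rvec, tvec);
    cv::Mat R;
    cv::Rodrigues(rvec, R);

    // Pixel (u, v) -> world (x_w, y_w) on the belt plane z_w = 0:
    // s*K^-1*[u v 1]^T = R*[x_w y_w 0]^T + T, i.e. r1*x_w + r2*y_w - s*ray = -T.
    auto pixelToWorld = [&](double u, double v) {
        cv::Mat uv1 = (cv::Mat_<double>(3, 1) << u, v, 1.0);
        cv::Mat ray = K.inv() * uv1;
        cv::Mat A(3, 3, CV_64F);
        R.col(0).copyTo(A.col(0));
        R.col(1).copyTo(A.col(1));
        ray.copyTo(A.col(2));
        cv::Mat sol = A.inv() * (-tvec);          // sol = [x_w, y_w, -s]
        return cv::Point2d(sol.at<double>(0), sol.at<double>(1));
    };

    cv::Point2d w = pixelToWorld(660.0, 500.0);
    std::printf("world coordinates on the belt plane: (%.1f, %.1f) mm\n", w.x, w.y);
    return 0;
}
```

With the extrinsics fixed once, any detected pixel center can be mapped to belt-plane world coordinates for the robot in this way.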
The calibration method of the 3D line-scan camera is as follows:
the 3D laser is projected onto the calibration board so that the laser line is parallel to the x axis of the fixed world coordinate system;
as shown in fig. 3, the laser is turned off, the exposure time is increased, and an image with clearly visible corner points is captured;
after the image has been captured, the laser is turned back on and the conveyor belt is advanced a certain distance so that the laser line falls on the other checkerboard and remains parallel to the x axis;
as shown in fig. 4, the laser is turned off, the exposure time is increased, and another image with clearly visible corner points is captured;
finally, the 3D camera is calibrated after the data have been saved; the calibration procedure for the 4 corner points is the same as in the 2D calibration.
Step two: a YOLOv4 target detection model is established. RGB images are collected with the 2D camera, the acquired images are annotated by labeling personnel, and model training is carried out with the YOLOv4 target detection model to generate the final YOLOv4 target detection model.
Step three: multithreading is started, the encoder pulse value is acquired in real time, and the 2D camera and the 3D camera acquire image data simultaneously; the 2D camera acquires the RGB image to be identified and detected, and the 3D camera acquires the contour and height information of the target image.
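A minimal sketch of one possible thread layout for this step is given below; readEncoderPulse, grab2DFrame and grab3DProfile are hypothetical stand-ins for the encoder and camera SDK calls, which are not specified here.

```cpp
// Sketch: one encoder thread keeps a shared pulse value up to date while the 2D and
// 3D acquisition threads tag each frame/contour with the pulse value at capture time.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

std::atomic<long> g_pulse{0};       // latest encoder pulse value shared by all threads
std::atomic<bool> g_running{true};

// Hypothetical stand-ins for the encoder and camera SDK calls.
long readEncoderPulse() { static long p = 0; return ++p; }
void grab2DFrame(long pulse)   { std::printf("2D frame at pulse %ld\n", pulse); }
void grab3DProfile(long pulse) { std::printf("3D contour at pulse %ld\n", pulse); }

int main() {
    std::thread encoderThread([] {          // real-time pulse acquisition
        while (g_running) {
            g_pulse = readEncoderPulse();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
    std::thread cam2dThread([] {            // RGB images for YOLOv4 detection
        while (g_running) {
            grab2DFrame(g_pulse.load());
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
        }
    });
    std::thread cam3dThread([] {            // contour/height data from the 3D camera
        while (g_running) {
            grab3DProfile(g_pulse.load());
            std::this_thread::sleep_for(std::chrono::milliseconds(5));
        }
    });

    std::this_thread::sleep_for(std::chrono::seconds(1));
    g_running = false;
    encoderThread.join();
    cam2dThread.join();
    cam3dThread.join();
    return 0;
}
```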
step four: inputting the RGB image into a YOLO v4 target detection model to obtain the center coordinates of a rectangular frame, the width of the rectangular frame, the height of the rectangular frame and the type of a target.
Inputting the collected RGB image into a YOLOv4 target detection model with an input size of 608X 608 to obtain a list of all position frames Bounding Box with target materials in the image, and filtering by a non-maximum suppression NMS algorithm to obtain the coordinate position information of the target garbage points which need to be reserved finally;
the non-maximum suppression (NMS) algorithm is as follows:
S_i=\begin{cases}S_i, & \mathrm{IoU}(M,b_i)<N_t\\ 0, & \mathrm{IoU}(M,b_i)\ge N_t\end{cases}
wherein S_i represents the score of each box, M represents the box with the highest current score, b_i represents one of the remaining boxes, N_t is the set NMS threshold, and IoU is the ratio of the overlap area to the union area of the two detection boxes;
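The greedy hard-NMS rule above can be sketched as follows; the Detection structure and the example values are illustrative assumptions rather than the exact implementation used here.

```cpp
// Sketch: greedy hard NMS matching the formula above (S_i kept if IoU < N_t, else 0).
// In practice NMS is usually applied per target class.
#include <algorithm>
#include <cstdio>
#include <opencv2/core.hpp>
#include <vector>

struct Detection {
    cv::Rect2f box;   // rectangular frame
    float score;      // S_i
    int classId;      // target type
};

static float iou(const cv::Rect2f& a, const cv::Rect2f& b) {
    float inter = (a & b).area();                        // overlap area
    float uni = a.area() + b.area() - inter;             // union area
    return uni > 0.f ? inter / uni : 0.f;
}

std::vector<Detection> nms(std::vector<Detection> dets, float Nt) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const auto& d : dets) {
        bool suppressed = false;
        for (const auto& m : kept)                       // M: boxes already kept
            if (iou(m.box, d.box) >= Nt) { suppressed = true; break; }  // score set to 0
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}

int main() {
    std::vector<Detection> dets = {{{100.f, 100.f, 80.f, 60.f}, 0.9f, 0},
                                   {{105.f, 102.f, 80.f, 60.f}, 0.8f, 0},
                                   {{400.f, 200.f, 50.f, 40.f}, 0.7f, 1}};
    std::vector<Detection> kept = nms(dets, 0.45f);
    std::printf("%zu boxes kept\n", kept.size());        // expect 2
    return 0;
}
```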
when the YOLOv4 target detection model detects a target, the current encoder pulse value is acquired in real time; this pulse value corresponds to the center coordinate of the target material. The height of the rectangular frame detected by the YOLOv4 target detection model is taken along the moving direction of the conveyor belt, and the current encoder pulse value, the detected target center coordinates, and the width and height information of the rectangular frame are sent together to the data analysis processing module.
Step five: the 3D camera circularly acquires single-contour images, data are sequentially stored in a memory with a specified size, and the memory occupation size of the single-contour images is determined by the pulse number of the encoder.
The data analysis processing module calculates the initial position of the target material in the 3D storage memory, and the specific calculation formula is as follows:
start position = ((A - B) / 5) × 1216 × 3 × 4 (bytes)
wherein A represents the current encoder pulse value; b represents an initial starting pulse value; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x, y and z; 4 represents that the x, y and z values of each scanning point respectively occupy 4 bytes;
circularly acquiring single-contour images of the 3D line-scan camera and sequentially storing the data in a memory block of specified size, wherein the number of bytes stored is calculated by the following formula:
stored byte size = (target extent along the belt direction in mm ÷ 1.6 mm) × 1216 × 3 × 4 (bytes)
wherein 1.6mm represents the world coordinate distance between the two scanned contours; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x/y/z; 4 indicates that the x/y/z values of each scanning point each occupy 4 bytes;
and finally, the final end position of the target material in the 3D storage memory is obtained from its start position in the memory and its stored byte size, according to the following formula:
final end position = memory start position + stored byte size
the corresponding target 3D scan memory data are then read in the data analysis processing module, which completes the reading of the x, y and z values of the target material in the three-dimensional world from the 3D storage memory data.
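A sketch of how the start offset, stored byte size and end position could be computed and used to read the target's x/y/z values from the memory block is shown below. It assumes one contour per 5 pulses, 1216 points of three 4-byte floats per contour and 1.6 mm between contours as stated above; converting the rectangular-frame height to millimeters is an assumption made for illustration, and all numbers in main are placeholders.

```cpp
// Sketch: locating and reading a target's contour data in the 3D memory block.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

constexpr std::size_t kPointsPerContour = 1216;
constexpr std::size_t kBytesPerContour  = kPointsPerContour * 3 * 4;  // x, y, z floats
constexpr double      kMmPerContour     = 1.6;   // world spacing between contours
constexpr long        kPulsesPerContour = 5;

// Byte offset of the first contour belonging to the target
// (A: pulse value at the target center, B: initial starting pulse value).
std::size_t startOffset(long pulseAtTarget, long pulseAtStart) {
    long contours = (pulseAtTarget - pulseAtStart) / kPulsesPerContour;
    return static_cast<std::size_t>(contours) * kBytesPerContour;
}

// Number of bytes the target occupies, from its extent along the belt in millimeters
// (assumed here to come from the rectangular-frame height after unit conversion).
std::size_t byteSize(double targetLengthMm) {
    auto contours = static_cast<std::size_t>(targetLengthMm / kMmPerContour + 0.5);
    return contours * kBytesPerContour;
}

// Read the target's x/y/z float values out of the 3D memory block.
std::vector<float> readTarget(const std::vector<std::uint8_t>& memory,
                              std::size_t start, std::size_t size) {
    if (start >= memory.size()) return {};
    std::size_t end = std::min(start + size, memory.size()); // end = start + byte size
    std::vector<float> values((end - start) / sizeof(float));
    std::memcpy(values.data(), memory.data() + start, values.size() * sizeof(float));
    return values;                                            // x0, y0, z0, x1, y1, z1, ...
}

int main() {
    std::vector<std::uint8_t> memory(3000 * kBytesPerContour);    // demo-sized 3D block
    std::size_t start = startOffset(/*A=*/105000, /*B=*/100000);  // 1000 contours in
    std::size_t size  = byteSize(/*targetLengthMm=*/80.0);        // 50 contours
    std::vector<float> xyz = readTarget(memory, start, size);
    std::printf("read %zu float values\n", xyz.size());
    return 0;
}
```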
The world-coordinate distance of 1.6 mm between two adjacent contours is derived as follows:
the 3D camera scans one contour every 5 pulses, and one contour has 1216 points; each point contains x/y/z values, and each value occupies 4 bytes. The encoder outputs 1000 pulses per revolution, and the belt travels 320 mm per revolution, i.e. one pulse corresponds to 0.32 mm; since one contour is scanned every 5 pulses, 5 × 0.32 mm = 1.6 mm, so the distance between two adjacent contours corresponds to 1.6 mm in world coordinates.
According to the invention, a cyclic 3D memory data storage scheme is formed in the data reading process: when 50000 contours have been acquired, the 3D data are stored again starting from the initial position of the memory space, the previously stored data are overwritten, and storage then continues sequentially, so that the data are stored in an endless loop; when the program stops running, the allocated memory space is released, preventing memory overflow or leakage.
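The circular storage described above might look like the following sketch, sized for 50000 contours; the class name and interface are illustrative assumptions only.

```cpp
// Sketch: circular 3D memory block for 50000 contours; writes wrap to the start of the
// block and overwrite the oldest data. Note the full block is roughly 730 MB.
#include <cstdint>
#include <cstring>
#include <vector>

class ContourRing {
public:
    static constexpr std::size_t kMaxContours     = 50000;
    static constexpr std::size_t kBytesPerContour = 1216 * 3 * 4;

    ContourRing() : buf_(kMaxContours * kBytesPerContour), next_(0) {}

    // Store one scanned contour; when the block is full, start again from the beginning.
    void push(const std::uint8_t* contour) {
        std::memcpy(buf_.data() + next_ * kBytesPerContour, contour, kBytesPerContour);
        next_ = (next_ + 1) % kMaxContours;      // wrap around, overwrite oldest data
    }

    // Access contour data by its slot index in the block.
    const std::uint8_t* at(std::size_t i) const { return buf_.data() + i * kBytesPerContour; }

private:
    std::vector<std::uint8_t> buf_;   // released automatically when the program stops
    std::size_t next_;
};

int main() {
    ContourRing ring;
    std::vector<std::uint8_t> contour(ContourRing::kBytesPerContour, 0);
    for (int i = 0; i < 100; ++i) ring.push(contour.data());
    return 0;
}
```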
Step six: the current contour image data are read from the memory to obtain the actual height and width information of the target object, which is sent, together with the rectangular frame center coordinates, the rectangular frame width, the rectangular frame height and the target type obtained by the YOLOv4 target detection model, to the garbage sorting robot, so that the robot performs real-time online grabbing and classification.
As shown in fig. 5, the 3D memory data are acquired, the x and y values are used as the row and column pixel coordinates of a Mat in OpenCV, the z values are normalized to the range 0 to 255, and the normalized values are used as the gray values at the corresponding x, y pixel coordinates in the Mat; for the height, the gray value in the Mat corresponding to the pixel coordinates of the target center is extracted and de-normalized to obtain the actual height information of the target object.
The Mat image is then processed with OpenCV algorithms such as threshold segmentation, opening and closing operations and minimum-contour processing to extract the width information of the target object; as shown in fig. 6, the contour information of the corresponding garbage object is extracted, and the length of the segment AB is the width information of this object.
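An illustrative OpenCV sketch of this height and width extraction is given below; the Mat size, the fake object region and the normalization range are assumptions made so the example runs on its own.

```cpp
// Sketch: height from a de-normalized gray value, width from contour analysis.
#include <algorithm>
#include <cstdio>
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Assume the target occupies 120 contours of 1216 points; z values are in mm.
    const int rows = 120, cols = 1216;
    cv::Mat z(rows, cols, CV_32F, cv::Scalar(0));
    z(cv::Rect(500, 40, 200, 50)).setTo(cv::Scalar(35.0f));   // fake 35 mm high object

    // Normalize z to 0-255 gray values (the Mat image described above).
    double zMin, zMax;
    cv::minMaxLoc(z, &zMin, &zMax);
    cv::Mat gray;
    z.convertTo(gray, CV_8U, 255.0 / (zMax - zMin), -255.0 * zMin / (zMax - zMin));

    // Height: gray value at the target center pixel, de-normalized back to mm.
    cv::Point center(600, 60);      // target center in the Mat, located via the encoder
    double g = gray.at<uchar>(center);
    double heightMm = g / 255.0 * (zMax - zMin) + zMin;

    // Width: threshold segmentation, opening/closing, minimum-area rectangle.
    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::morphologyEx(bin, bin, cv::MORPH_OPEN, kernel);
    cv::morphologyEx(bin, bin, cv::MORPH_CLOSE, kernel);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bin, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    double widthPx = 0.0;
    for (const auto& c : contours) {
        cv::RotatedRect r = cv::minAreaRect(c);
        widthPx = std::max(widthPx, static_cast<double>(std::min(r.size.width, r.size.height)));
    }
    // Converting widthPx to millimeters still needs the point pitch across the contour
    // and the 1.6 mm spacing between contours.
    std::printf("height: %.1f mm, width: %.1f px\n", heightMm, widthPx);
    return 0;
}
```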
Finally, the center coordinate position and the target type of the target object obtained from the 2D data, together with the target height and width information obtained from the 3D data, are sent to the robot as a whole, so that the robot can perform real-time online grabbing and classification.
The method acquires the images of the 2D camera and the 3D camera synchronously, and the encoder pulse value associates the target image detected by the YOLOv4 detection model with the 3D scan data, from which the target height and width information is solved, so the detection speed is fast and the recognition accuracy is high. A cyclic 3D memory data storage scheme is formed in the data reading process, so that the memory does not overflow and the data in the corresponding memory space can be read in real time by combining the memory space with the encoder. The deep-learning YOLO target detection model is applied to the fusion of the 2D camera and the 3D camera, and OpenCV is used to process the memory data, so the robot's real-time grabbing is more accurate and more robust.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A garbage sorting robot visual identification method based on deep learning is characterized by comprising the following steps:
step one: the 2D area-array camera and the 3D line-scan camera are synchronously calibrated in the same world coordinate system;
step two: establishing a YOLOv4 target detection model;
step three: starting multithreading, acquiring the encoder pulse value in real time while the 2D camera and the 3D camera acquire image data simultaneously; the 2D camera acquires the RGB image to be identified and detected, and the 3D camera acquires the contour and height information of the target image;
step four: inputting the RGB image into a YOLO v4 target detection model to obtain a rectangular frame center coordinate, a rectangular frame width, a rectangular frame height and a target type;
step five: the 3D camera circularly acquires single-contour images, and data are sequentially stored in a memory with a specified size, wherein the memory occupation size of the single-contour images is determined by the pulse number of the encoder;
step six: reading the current contour image data from the memory to obtain the actual height and width information of the target object, and sending them, together with the rectangular frame center coordinates, the rectangular frame width, the rectangular frame height and the target type obtained by the YOLOv4 target detection model, to the garbage sorting robot, so that the robot performs real-time online grabbing and classification.
2. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the calibration method of the 2D area-array camera in the first step comprises the following steps:
capturing images of a chessboard calibration board at different positions and rotation angles with the camera;
performing intrinsic calibration with Matlab to obtain the intrinsic parameter matrix and the distortion coefficients;
determining a world coordinate system on the chessboard calibration board, determining, by a 4-point PnP calibration method, the pixel coordinates of 4 corner points on the calibration board and the world coordinates corresponding to these 4 corner points in the determined world coordinate system, and performing extrinsic calibration on the 4 corner points with the solvePnP operator to obtain the extrinsic rotation matrix and translation matrix, thereby completing the extrinsic calibration.
3. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the calculation formula for converting pixel coordinates to world coordinates is as follows:
s\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & u_0\\ 0 & f_y & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}R & T\end{bmatrix}\begin{bmatrix}x_w\\ y_w\\ z_w\\ 1\end{bmatrix}
wherein: u and v are respectively the pixel abscissa and the pixel ordinate in the pixel coordinate system; x_w, y_w and z_w are respectively the abscissa, the ordinate and the vertical coordinate in the world coordinate system; R is the rotation matrix; T is the translation matrix; u_0, v_0, f_x and f_y are the camera intrinsic parameters, i.e. u_0 and v_0 are respectively the image-center abscissa and ordinate, and f_x and f_y are respectively the horizontal and vertical equivalent focal lengths; s is a scale factor, namely the depth of the point along the optical axis in the camera coordinate system.
4. The visual recognition method of the garbage sorting robot based on deep learning of claim 2, wherein: the calibration method of the 3D line-scan camera comprises the following steps:
projecting the 3D laser onto the calibration board so that the laser line is parallel to the x axis of the fixed world coordinate system;
turning off the laser, increasing the exposure time, and capturing an image with clearly visible corner points;
after the image has been captured, turning the laser back on and advancing the conveyor belt a certain distance so that the laser line falls on the other checkerboard and remains parallel to the x axis;
turning off the laser, increasing the exposure time, and capturing another image with clearly visible corner points;
finally performing the calibration; the calibration of the 3D camera is complete after the data have been saved.
5. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the second step is specifically: RGB images are collected with the 2D camera, the acquired images are annotated by labeling personnel, and model training is carried out with the YOLOv4 target detection model to generate the final YOLOv4 target detection model.
6. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the fourth step is specifically as follows:
inputting the collected RGB image into the YOLOv4 target detection model with an input size of 608×608 to obtain a list of Bounding Boxes for all positions in the image that contain target material, and filtering them with the non-maximum suppression (NMS) algorithm to obtain the coordinate position information of the target garbage points that are finally retained;
the non-maximum suppression (NMS) algorithm is as follows:
S_i=\begin{cases}S_i, & \mathrm{IoU}(M,b_i)<N_t\\ 0, & \mathrm{IoU}(M,b_i)\ge N_t\end{cases}
wherein S_i represents the score of each box, M represents the box with the highest current score, b_i represents one of the remaining boxes, N_t is the set NMS threshold, and IoU is the ratio of the overlap area to the union area of the two detection boxes;
when the YOLOv4 target detection model detects a target, the current encoder pulse value is acquired in real time; this pulse value corresponds to the center coordinate of the target material; the height of the rectangular frame detected by the YOLOv4 target detection model is taken along the moving direction of the conveyor belt, and the current encoder pulse value, the detected target center coordinates, and the width and height information of the rectangular frame are sent together to the data analysis processing module.
7. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the fifth step is specifically as follows:
the data analysis processing module calculates the initial position of the target material in the 3D storage memory, and the specific calculation formula is as follows:
start position = ((A - B) / 5) × 1216 × 3 × 4 (bytes)
wherein A represents the current encoder pulse value; b represents an initial starting pulse value; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x, y and z; 4 represents that the x, y and z values of each scanning point respectively occupy 4 bytes;
circularly acquiring single-contour images of the 3D line-scan camera and sequentially storing the data in a memory block of specified size, wherein the number of bytes stored is calculated by the following formula:
stored byte size = (target extent along the belt direction in mm ÷ 1.6 mm) × 1216 × 3 × 4 (bytes)
wherein 1.6mm represents the world coordinate distance between the two scanned contours; 1216 denotes 1216 scan points on one contour; 3 represents 3 coordinate values of x/y/z; 4 indicates that the x/y/z values of each scanning point each occupy 4 bytes;
and finally, the final end position of the target material in the 3D storage memory is obtained from its start position in the memory and its stored byte size, according to the following formula:
final end position = memory start position + stored byte size
the corresponding target 3D scan memory data are then read in the data analysis processing module, which completes the reading of the x, y and z values of the target material in the three-dimensional world from the 3D storage memory data.
8. The visual recognition method of the garbage sorting robot based on deep learning of claim 7, wherein: the world-coordinate distance of 1.6 mm between two adjacent contours is derived as follows:
the 3D camera scans one contour every 5 pulses, and one contour has 1216 points; each point contains x/y/z values, and each value occupies 4 bytes; the encoder outputs 1000 pulses per revolution, and the belt travels 320 mm per revolution, i.e. one pulse corresponds to 0.32 mm; since one contour is scanned every 5 pulses, 5 × 0.32 mm = 1.6 mm, so the distance between two adjacent contours corresponds to 1.6 mm in world coordinates.
9. The visual recognition method of the garbage sorting robot based on deep learning of claim 7, wherein: when 50000 contours have been acquired, the 3D data are stored again starting from the initial position of the memory space, the previously stored data are overwritten, and storage then continues sequentially, so that the data are stored in an endless loop; when the program stops running, the allocated memory space is released, preventing memory overflow or leakage.
10. The visual recognition method of the garbage sorting robot based on deep learning of claim 1, wherein: the sixth step is specifically as follows:
the 3D memory data are read, the x and y values are used as the row and column pixel coordinates of a Mat in OpenCV, the z values are normalized to the range 0 to 255, and the normalized values are used as the gray values at the corresponding x, y pixel coordinates in the Mat; for the height, the gray value in the Mat corresponding to the pixel coordinates of the target center is extracted and de-normalized to obtain the actual height information of the target object;
the Mat image is then processed with OpenCV algorithms such as threshold segmentation, opening and closing operations and minimum-contour processing to extract the width information of the target object;
finally, the center coordinate position and the target type of the target object obtained from the 2D data, together with the target height and width information obtained from the 3D data, are sent to the robot as a whole, so that the robot can perform real-time online grabbing and classification.
CN202111323743.9A 2021-11-10 2021-11-10 Garbage sorting robot visual identification method based on deep learning Pending CN114049557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111323743.9A CN114049557A (en) 2021-11-10 2021-11-10 Garbage sorting robot visual identification method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111323743.9A CN114049557A (en) 2021-11-10 2021-11-10 Garbage sorting robot visual identification method based on deep learning

Publications (1)

Publication Number Publication Date
CN114049557A true CN114049557A (en) 2022-02-15

Family

ID=80207948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111323743.9A Pending CN114049557A (en) 2021-11-10 2021-11-10 Garbage sorting robot visual identification method based on deep learning

Country Status (1)

Country Link
CN (1) CN114049557A (en)


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821283A (en) * 2022-06-29 2022-07-29 山东施卫普环保科技有限公司 Sweeper garbage sweeping method and system based on visual perception
CN115296738A (en) * 2022-07-28 2022-11-04 吉林大学 Unmanned aerial vehicle visible light camera communication method and system based on deep learning
CN115296738B (en) * 2022-07-28 2024-04-16 吉林大学 Deep learning-based unmanned aerial vehicle visible light camera communication method and system
CN115100492A (en) * 2022-08-26 2022-09-23 摩尔线程智能科技(北京)有限责任公司 Yolov3 network training and PCB surface defect detection method and device
CN115100492B (en) * 2022-08-26 2023-04-07 摩尔线程智能科技(北京)有限责任公司 Yolov3 network training and PCB surface defect detection method and device
CN115546566A (en) * 2022-11-24 2022-12-30 杭州心识宇宙科技有限公司 Intelligent body interaction method, device, equipment and storage medium based on article identification
CN116921247A (en) * 2023-09-15 2023-10-24 北京安麒智能科技有限公司 Control method of intelligent garbage sorting system
CN116921247B (en) * 2023-09-15 2023-12-12 北京安麒智能科技有限公司 Control method of intelligent garbage sorting system
CN117124302A (en) * 2023-10-24 2023-11-28 季华实验室 Part sorting method and device, electronic equipment and storage medium
CN117124302B (en) * 2023-10-24 2024-02-13 季华实验室 Part sorting method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114049557A (en) Garbage sorting robot visual identification method based on deep learning
CN111951237B (en) Visual appearance detection method
CN108555908B (en) Stacked workpiece posture recognition and pickup method based on RGBD camera
CN110580725A (en) Box sorting method and system based on RGB-D camera
CN111612765B (en) Method for identifying and positioning round transparent lens
CN112067233B (en) Six-degree-of-freedom motion capture method for wind tunnel model
CN110084243B (en) File identification and positioning method based on two-dimensional code and monocular camera
CN113177565B (en) Binocular vision position measuring system and method based on deep learning
CN105678682B (en) A kind of bianry image connected region information fast acquiring system and method based on FPGA
CN112816418B (en) Mobile phone metal middle frame defect imaging system and detection method
CN113870267B (en) Defect detection method, defect detection device, computer equipment and readable storage medium
CN106651849A (en) Area-array camera-based PCB bare board defect detection method
US9245375B2 (en) Active lighting for stereo reconstruction of edges
CN113554757A (en) Three-dimensional reconstruction method and system for workpiece track based on digital twinning
CN115131268A (en) Automatic welding system based on image feature extraction and three-dimensional model matching
CN111626241A (en) Face detection method and device
CN104614372B (en) Detection method of solar silicon wafer
CN108182700B (en) Image registration method based on two-time feature detection
CN111724432B (en) Object three-dimensional detection method and device
CN113256568A (en) Machine vision plate counting general system and method based on deep learning
CN110136248B (en) Transmission shell three-dimensional reconstruction device and method based on binocular stereoscopic vision
CN116188763A (en) Method for measuring carton identification positioning and placement angle based on YOLOv5
CN112347904B (en) Living body detection method, device and medium based on binocular depth and picture structure
KR100276445B1 (en) Property recognition apparatus
Li et al. Structured light based high precision 3D measurement and workpiece pose estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination