CN114155301A - Robot target positioning and grabbing method based on Mask R-CNN and binocular camera - Google Patents

Robot target positioning and grabbing method based on Mask R-CNN and binocular camera

Info

Publication number
CN114155301A
Authority
CN
China
Prior art keywords
target
mask
target object
image
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111401496.XA
Other languages
Chinese (zh)
Inventor
周登科
史凯特
汤鹏
于傲
郑开元
张亚平
李哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges Corp
Original Assignee
China Three Gorges Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges Corp filed Critical China Three Gorges Corp
Priority to CN202111401496.XA
Publication of CN114155301A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/12 Edge-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras

Abstract

A robot target positioning and grabbing method based on Mask R-CNN and a binocular camera comprises the following steps: step 1: calibrating the camera; step 2: identifying and segmenting the target; step 3: positioning the target; step 4: calculating the pose of the target; step 5: grabbing the target. In step 2, the RGB image collected by the binocular camera is fed into a pre-trained convolutional neural network Mask R-CNN model, which outputs the position and the Mask of the target object in the image. In step 3, the RGB image and the DEPTH image collected by the camera are aligned, and the average distance from the pixel points in the Mask area of the target object to the binocular camera is taken as the distance from the target object to the camera. In step 4, the pixel coordinates and depth information are converted into the robot base coordinate system through robot hand-eye calibration, the angle of each joint is solved through robot inverse kinematics, and the robot is driven to move to complete the grabbing and carrying task.

Description

Robot target positioning and grabbing method based on Mask R-CNN and binocular camera
Technical Field
The invention belongs to the technical field of robots, and particularly relates to a robot target positioning and grabbing method based on Mask R-CNN and a binocular camera.
Background
A robot is a bionic intelligent machine that generally has moving, sensing, acting and coordinating capabilities; equipped with a perception camera and a mechanical arm, it can complete specific recognition and action tasks and can recognize and grab specified target objects. The common target grabbing method at present is to search for the target with a camera carried by the robot, extract the target object by template matching or another image processing method once the target is found, calculate the distance between the target object and the camera with a binocular camera or a lidar carried by the robot, and finally control the mechanical arm to grab the target object.
In this method, different sensors are used separately for detecting and ranging the target object, which increases the equipment requirements and the robot's load. In target detection, traditional image processing methods have large errors and cannot accurately determine the edge information of the target object; their processing speed is also low, so the action of the mechanical arm is delayed and the requirement of grabbing the target in real time is difficult to meet. In target ranging, the background information is complex and noise points are not completely filtered out, so the distance measurement precision is poor.
Disclosure of Invention
The invention aims to solve the technical problems of large target detection error and low distance measurement precision of existing robots in complex environments, and provides a method that enables the robot to accurately position and grab a target object and reduces the error rate of the robot's mechanical arm in grabbing the target object.
A robot target positioning and grabbing method based on Mask R-CNN and a binocular camera comprises the following steps:
step 1: calibrating the camera;
step 2: identifying and segmenting the target;
step 3: positioning a target;
step 4: calculating the pose of the target;
step 5: grabbing the target;
in step 2, the RGB image collected by the binocular camera is used as an input image and sent into a pre-trained convolutional neural network Mask R-CNN model, a detection frame and a Mask of a target object in the image are output through the model, a target area is extracted by carrying out pixel point segmentation on the target object, and background interference information is filtered.
In step 2, identifying and segmenting the target object by using a Mask R-CNN model, wherein the model construction and identification steps are as follows:
2-1) acquiring a target object data set, and acquiring images of the target object from different environments, different angles, different brightness and different postures according to the type of the target object to be captured;
2-2) data enhancement: because few data set samples are available, the data set is expanded by combining traditional image geometric transformation data enhancement with generative data enhancement using a GAN; for the traditional geometric transformation method, the acquired data set is expanded through operations such as brightness transformation, noise addition, random cropping, horizontal flipping, image tilting, rotation and image scaling;
2-3) labeling the target data set, and labeling the acquired image by using an image labeling tool;
2-4) optimizing the Mask R-CNN network model;
2-5) carrying out model transfer learning training: using a transfer learning method, the self-made data set is loaded into the optimized network model together with a model pre-trained on the COCO data set so as to speed up model convergence, and the model is iteratively trained with parameter optimization to generate the target detection and segmentation model;
2-6) detecting and segmenting the target, intercepting an RGB image from a video stream shot by a binocular camera, transmitting the RGB image into a Mask R-CNN model, identifying the type and the position of a target object to be captured through the model, segmenting the target, and outputting a Mask region of the target.
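For illustration, a minimal inference sketch in Python is given below; it uses torchvision's stock maskrcnn_resnet50_fpn as a stand-in for the optimized network described in step 2-4) (the patent's model uses a trimmed backbone and modified anchors), and the weight file name, class count and score threshold are assumptions rather than values from the patent.

```python
import torch
import torchvision
from torchvision.transforms import functional as F

def detect_targets(rgb_image, weights_path="target_maskrcnn.pth", score_thr=0.7):
    """Return boxes, binary masks and labels for target objects in one RGB frame."""
    # num_classes=2: background + a single grabbed-object class (an assumption).
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()

    tensor = F.to_tensor(rgb_image)            # HxWx3 uint8 -> 3xHxW float in [0, 1]
    with torch.no_grad():
        output = model([tensor])[0]

    keep = output["scores"] > score_thr        # drop low-confidence detections
    boxes = output["boxes"][keep]
    masks = output["masks"][keep, 0] > 0.5     # binarize per-pixel mask probabilities
    return boxes, masks, output["labels"][keep]
```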
In the step 2-4), when optimizing the Mask R-CNN network model, the following steps are adopted:
(1) modifying the feature extraction network, and improving the speed of target identification by reducing network layers;
(2) modifying the RPN region proposal network: the anchor sizes are modified so that the model concentrates its computation on specified proportions, anchor boxes exceeding the size of the original image are eliminated, and the remainder are further screened by non-maximum suppression (NMS) to obtain the regions of interest;
(3) modifying the loss function. The Mask R-CNN loss function is

L = L_cls + L_box + L_mask

where L_cls is the classification loss function, L_box is the detection loss function and L_mask is the segmentation loss function. A boundary loss function is added to L_mask, using a distance loss to regularize the position, shape and continuity of the segmentation so that it lies closer to the target boundary. The optimized loss function L_mask-edge is

L_mask-edge = L_mask + α · L_edge

where L_edge is the boundary loss function, y denotes the annotated target edge, ŷ denotes the predicted boundary, α is a weight coefficient, B is the boundary of the segmentation result, and M_dist is the distance transform of the ground-truth segmentation boundary.
In step 1, calibrating a camera to acquire three-dimensional spatial position information through two-dimensional image information; the method specifically comprises the following steps:
1-1) making a calibration plate;
1-2) collecting images, changing the position and the angle of a calibration plate relative to a camera, and shooting a plurality of pictures of the calibration plate from different angles, different positions and different postures by using the camera to be calibrated;
1-3) detecting the calibration board corner points to obtain their pixel coordinate values, and calculating their physical coordinate values from the known checkerboard square size and the origin of the world coordinate system;
1-4) solving the internal parameters and the external parameters of the camera.
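A rough sketch of steps 1-1) to 1-4) using OpenCV's Zhang-style checkerboard calibration is shown below; the board dimensions, square size and image folder are illustrative assumptions.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)         # inner corners per row/column of the checkerboard (assumed)
square_size = 0.025      # checkerboard square edge in metres (assumed)

# Physical corner coordinates in the board's own plane (Z = 0).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):         # assumed image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Intrinsics (camera matrix, distortion) and per-view extrinsics (R, t).
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```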
In step 3, when positioning the target, the DEPTH image acquired by the binocular camera is aligned with the RGB image, and the average distance from the pixel points of the target Mask area in the RGB image to the infrared lens of the binocular camera is calculated to obtain the distance from the target object to the camera. Meanwhile, the width of the target is measured using the binocular camera ranging principle, and it is judged whether the opening width of the mechanical arm's gripper can grasp the target object.
In step 3, the method specifically comprises the following steps:
3-1) acquiring an RGB image and a DEPTH image;
3-2) aligning the DEPTH image and the RGB image, so that pixel points in the RGB image correspond to the DEPTH image target points one by one;
3-3) filtering the background information of the target image, and filtering the background area of the image according to the Mask area of the target object output in the step 2-4);
3-4) calculating the target distance: the distance from each pixel point in the Mask area of the target object to the binocular camera is calculated, and the average of these pixel distances is taken as the distance from the target object to the camera;
3-5) calculating the target width: the maximum width of the target object edge in the DEPTH image, corresponding to the Mask region edge pixels, is taken as the width of the target object, and the calculated width value is used to judge whether the target is within the grabbing range of the mechanical arm.
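A minimal sketch of steps 3-4) and 3-5) follows, assuming the DEPTH image is already aligned to the RGB image and stored in millimetres; the focal length argument and the gripper opening limit are illustrative assumptions.

```python
import numpy as np

def target_distance_and_width(depth_mm, mask, fx, max_gripper_width_mm=85.0):
    """Average depth over the Mask area and estimate target width from its pixel extent."""
    valid = mask & (depth_mm > 0)                  # ignore holes / invalid depth pixels
    distance = float(depth_mm[valid].mean())       # mean distance of Mask pixels, in mm

    cols = np.where(valid.any(axis=0))[0]          # leftmost and rightmost Mask columns
    pixel_width = cols[-1] - cols[0] + 1
    width = pixel_width * distance / fx            # pinhole model: width = w_px * Z / fx

    # max_gripper_width_mm is an assumed gripper opening, not a value from the patent.
    return distance, width, width <= max_gripper_width_mm
```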
In step 4, to grab the target object, the pose of the target object in the camera coordinate system must be converted into its pose in the robot arm base coordinate system. This conversion is realized through hand-eye calibration: a nine-point calibration method determines the coordinate conversion relation between the camera coordinate system and the robot base coordinate system, so that the position of the target workpiece relative to the robot base coordinate system can be calculated and the robot can be guided to grab it.
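The coordinate conversion can be sketched as follows, assuming the 4x4 camera-to-base homogeneous transform produced by the hand-eye calibration is already available; the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def pixel_to_base(u, v, depth_m, K, T_base_cam):
    """Back-project a pixel with depth into the camera frame, then map it into the robot base frame."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole back-projection: pixel coordinates + depth -> 3D point in camera coordinates.
    p_cam = np.array([(u - cx) * depth_m / fx,
                      (v - cy) * depth_m / fy,
                      depth_m,
                      1.0])
    # T_base_cam is the assumed 4x4 homogeneous transform from hand-eye calibration.
    return (T_base_cam @ p_cam)[:3]
```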
In step 5, the robot is driven to the area where the target is located, and the distance between the robot and the target object is adjusted so that the target object can be captured by the gripper at the end of the mechanical arm; the angle of each joint is solved through robot inverse kinematics, and finally the target object is grabbed by controlling the rotation angles of the mechanical arm joints.
Compared with the prior art, the invention has the following technical effects:
1) the robot simultaneously acquires the RGB image and the DEPTH image of the target object through the binocular camera, and the positioning and grabbing of the robot mechanical arm on the target object are realized by simultaneously processing the RGB image and the DEPTH image;
2) in target object identification and segmentation, an improved Mask R-CNN-based target identification and segmentation method is provided, and the speed and precision of target identification are improved by modifying the model's feature extraction network, region proposal network and loss function.
3) in target positioning, an algorithm fusing the binocular camera with deep learning is provided: automatic identification and pixel-level segmentation of the target object are achieved, background noise information is filtered out through the Mask segmentation, and the distance information of the pixel points in the Mask area is calculated in combination with the DEPTH image, which improves target positioning precision and supports accurate grabbing by the subsequent mechanical arm.
Drawings
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flowchart of a method for constructing a Mask R-CNN model according to the present invention;
FIG. 3 is a flowchart of a depth camera-based target ranging method according to the present invention.
Detailed Description
As shown in the flowchart of fig. 1, the robot target positioning and grabbing method based on Mask R-CNN and a binocular camera specifically includes the following steps:
Step 1, calibrating the camera. Three-dimensional spatial position information can be recovered from two-dimensional image information only when the camera's internal and external parameters are known, so these parameters are obtained by calibrating the camera. The internal and external parameters of the camera can be obtained directly with Zhang Zhengyou's calibration method, as shown in the following formula:
s [u v 1]^T = A [R T] [X_w Y_w Z_w 1]^T

A =
| α  γ  u_0 |
| 0  β  v_0 |
| 0  0   1  |

where A is the internal parameter matrix of the camera, R is the rotation matrix and T the translation vector of the external parameters, α = f/dx, β = f/dy, f is the focal length, dx and dy are the width and height of a pixel respectively, γ represents the skew of the pixel axes in the x and y directions, and (u_0, v_0) is the reference point (principal point).
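As a small numeric illustration of this pinhole model (not calibration values from the patent), the following sketch projects a 3D point in the camera frame to pixel coordinates with an assumed intrinsic matrix A.

```python
import numpy as np

alpha, beta, gamma = 615.0, 615.0, 0.0   # f/dx, f/dy and skew (assumed values)
u0, v0 = 320.0, 240.0                    # principal point (assumed values)
A = np.array([[alpha, gamma, u0],
              [0.0,   beta,  v0],
              [0.0,   0.0,   1.0]])

P_cam = np.array([0.10, -0.05, 0.80])    # 3D point in camera coordinates, in metres (assumed)
uvw = A @ P_cam                          # homogeneous image coordinates
u, v = uvw[:2] / uvw[2]                  # perspective division gives the pixel coordinates
```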
Step 2, identifying and segmenting the target. The method uses a RealSense D435i structured light camera as the robot's visual perception camera. The RealSense D435i simultaneously acquires RGB images and DEPTH images; the RGB image is input into the pre-trained convolutional neural network Mask R-CNN model, which outputs the position of the target object in the image, and pixel-point segmentation of the target object extracts the target area. The flow of the Mask R-CNN model construction method provided by the invention is shown in figure 2.
Step 3, positioning the target. The DEPTH image is aligned with the RGB image, and the average distance from the pixel points of the target Mask area in the RGB image to the infrared lens of the binocular camera is calculated to obtain the distance of the target object. The flow of the target positioning method based on the RealSense binocular camera is shown in fig. 3.
Step 4, calculating the pose of the target. To know the coordinate position of the target object relative to the robot arm, the conversion between the robot coordinate system and the target object coordinate system must first be established. A spatial coordinate system is established with the shoulder of the robot mechanical arm as the coordinate origin, and the coordinate values of the target object in the camera-centred coordinate system are converted into coordinate values in this coordinate system. Hand-eye calibration is performed with a nine-point calibration method to determine the coordinate conversion relation between the camera coordinate system and the robot base coordinate system, so that the position of the target workpiece relative to the robot base coordinate system can be calculated.
Step 5, grabbing the target. The mobile robot reaches the area where the target is located and the distance between the robot and the target object is adjusted so that the target object lies within the capture range of the gripper at the end of the mechanical arm; the relation between every two links, or their pose relative to the base reference coordinate system, is described by the homogeneous transformation matrices of the link coordinate systems. In this implementation an eye-in-hand system is adopted: the transformation matrix from the camera coordinate system to the manipulator end-effector coordinate system is obtained by the corresponding hand-eye calibration, the angles of all joints are solved through robot inverse kinematics, and finally the target object is grabbed by controlling the rotation angles of the joints of the UR manipulator.
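A hedged outline of this grabbing step is sketched below; solve_ik and robot are hypothetical stand-ins for the UR arm's inverse-kinematics solver and driver interface, not APIs named in the patent.

```python
import numpy as np

def grasp_target(target_base_xyz, solve_ik, robot, approach_height=0.10):
    """Move above the target in the base frame, descend to it, then close the gripper."""
    target = np.asarray(target_base_xyz, dtype=float)
    pre_grasp = target + np.array([0.0, 0.0, approach_height])   # approach from above (assumed strategy)
    for point in (pre_grasp, target):
        joint_angles = solve_ik(point)     # hypothetical inverse-kinematics call
        robot.move_joints(joint_angles)    # hypothetical UR driver call
    robot.close_gripper()                  # hypothetical gripper command
```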
In step 2, the Mask R-CNN model construction and detection specifically comprise the following steps:
Step 2.1, acquiring the target object data set: images of the target object are collected in different environments and at different angles, brightness levels and postures according to the type of target object to be grabbed.
Step 2.2, data enhancement: because few data set samples are available, the data set is expanded by combining traditional image geometric transformation with generative data enhancement using a GAN; for the traditional geometric transformations, the acquired data set is expanded through operations such as brightness transformation, noise addition, random cropping, horizontal flipping, image tilting, rotation and image scaling.
Step 2.3, labeling the target data set: the acquired images are annotated with the image labeling tool LabelMe; the purpose of labeling is to improve the model's detection precision on the target object through supervised training.
Step 2.4, optimizing the Mask R-CNN network model:
(1) The feature extraction network is changed to ResNet50. Because the binocular camera carried by the robot is close to the target object and the object is clear, a deeper feature extraction network is not needed, and reducing the number of network layers increases the speed of target identification.
(2) The RPN region proposal network is modified. The anchor box sizes are set to 32 × 32, 64 × 64, 128 × 128, 256 × 256 and 512 × 512 so that the network can adapt to target items of more shapes, and the aspect ratios are modified to 1:1, 1:2 and 1:3. Considering the opening width of the mechanical gripper and the fact that most of the grabbed targets are upright articles, the anchor sizes are modified to better fit the target proportions; concentrating the model on these three ratios reduces excess computation and saves training and testing memory. Anchor boxes exceeding the size of the original image are then eliminated, and the remainder are further screened by non-maximum suppression (NMS) to obtain the regions of interest.
(3) The loss function is modified. To further improve the accuracy of the mask segmentation, an edge loss is added to the mask branch so that the edges of the segmentation result are more accurate. The Mask R-CNN loss function is

L = L_cls + L_box + L_mask

In the Mask R-CNN segmentation task, L_cls is the classification loss function, L_box is the detection loss function and L_mask is the average binary cross-entropy loss function. Because the segmentation task depends only on region information, the prediction of the boundary is neglected, the boundary accuracy of the final segmentation result is low, and the distance precision obtained later from the target's Mask region suffers. The method therefore adds a boundary loss function to L_mask, which regularizes the position, shape and continuity of the segmentation with a distance loss so that it lies closer to the target boundary. The optimized loss function L_mask-edge is

L_mask-edge = L_mask + α · L_edge

where L_edge is the boundary loss function, y denotes the annotated target edge, ŷ denotes the predicted boundary, α is a weight coefficient, B is the boundary of the segmentation result, and M_dist is the distance transform of the ground-truth segmentation boundary.
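A rough PyTorch sketch of an edge-regularized mask loss of this kind is given below; it assumes per-instance mask probability maps and a precomputed distance transform of the ground-truth boundary, and the exact formulation used in the patent may differ.

```python
import torch
import torch.nn.functional as F

def mask_edge_loss(pred_mask, gt_mask, gt_dist, alpha=0.5):
    """L_mask-edge = L_mask + alpha * L_edge (distance-weighted boundary penalty, assumed form).

    pred_mask: predicted mask probabilities in [0, 1], gt_mask: binary ground truth (float),
    gt_dist: distance transform of the ground-truth boundary, all with the same spatial shape.
    """
    l_mask = F.binary_cross_entropy(pred_mask, gt_mask)   # average binary cross-entropy

    # Soft boundary of the prediction: gradient magnitude of the mask map.
    dx = pred_mask[..., :, 1:] - pred_mask[..., :, :-1]
    dy = pred_mask[..., 1:, :] - pred_mask[..., :-1, :]
    boundary = F.pad(dx.abs(), (0, 1)) + F.pad(dy.abs(), (0, 0, 0, 1))

    # Penalize predicted boundary pixels by their distance to the ground-truth edge.
    l_edge = (boundary * gt_dist).mean()
    return l_mask + alpha * l_edge
```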
Step 2.5, performing model transfer learning training, loading the self-made data set into the optimized network model by using a transfer learning method, and simultaneously loading a model pre-trained by using a COCO data set, so as to improve model convergence, and performing iterative training on the model through parameter optimization to generate a target detection and segmentation model;
and 2.6, detecting and segmenting the target, intercepting an RGB image from a video stream shot by a RealSense D435i binocular camera, transmitting the RGB image into a Mask R-CNN model, identifying the type and the position of a target object to be captured through the model, segmenting the target, and outputting a Mask region of the target.
In step 3, the target positioning based on the RealSense D435i binocular depth camera specifically comprises the following steps:
Step 3.1, acquiring the RGB image and the DEPTH image from the RealSense D435i camera.
Step 3.2, aligning the DEPTH image with the RGB image so that the pixel points in the RGB image correspond one-to-one to the DEPTH image target points.
Step 3.3, filtering the background information of the target image: the background area of the image is filtered out according to the Mask area of the target object output in step 2.6.
Step 3.4, calculating the target distance: the distance from each pixel point in the Mask area of the target object to the binocular camera is calculated, and the average of these pixel distances is taken as the distance from the target object to the camera.
Step 3.5, calculating the target width: the maximum width of the target object edge in the DEPTH image, corresponding to the Mask region edge pixels, is taken as the width of the target object, and the calculated width value is used to judge whether the target is within the grabbing range of the mechanical arm.
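Steps 3.1 and 3.2 can be sketched with the pyrealsense2 SDK as follows; the stream resolutions and frame rate are illustrative assumptions, and the depth values are returned in the device's depth units (about 1 mm per unit for the D435i by default).

```python
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)   # assumed resolution/FPS
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)          # map depth pixels onto the colour image

try:
    frames = align.process(pipeline.wait_for_frames())
    color = np.asanyarray(frames.get_color_frame().get_data())   # H x W x 3, BGR
    depth = np.asanyarray(frames.get_depth_frame().get_data())   # H x W, uint16 depth units
finally:
    pipeline.stop()
```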

Claims (8)

1. A robot target positioning and grabbing method based on Mask R-CNN and a binocular camera is characterized by comprising the following steps:
step 1: calibrating the camera;
step 2: identifying and segmenting the target;
step 3: positioning a target;
step 4: calculating the pose of the target;
step 5: grabbing the target;
in step 2, the RGB image collected by the binocular structured light camera is used as an input image and sent into a pre-trained convolutional neural network Mask R-CNN model, a detection frame and a Mask of a target object in the image are output through the model, a target area is extracted by carrying out pixel point segmentation on the target object, and background interference information is filtered.
2. The method of claim 1, wherein in step 2, the target object is identified and segmented by using Mask R-CNN model, and the model construction and identification steps are as follows:
2-1) acquiring a target object data set, and acquiring the data set of the target object from different environments, different angles, different brightness and different postures according to the type of the target object to be captured;
2-2) data enhancement, namely expanding the data set by combining traditional image geometric transformation data enhancement with generative data enhancement using a GAN; for the traditional image geometric transformation method, the acquired data set is expanded through operations such as brightness transformation, noise addition, random cropping, horizontal flipping, image tilting, rotation and image scaling;
2-3) labeling the target data set, and labeling the acquired image by using an image labeling tool;
2-4) optimizing a Mask R-CNN network model;
2-5) carrying out model transfer learning training, loading the self-made data set into the optimized network model by using a transfer learning method, and simultaneously loading a model pre-trained by using a COCO data set, so as to improve model convergence, and carrying out iterative training on the model by parameter optimization to generate a target recognition and segmentation model;
2-6) identifying and segmenting the target, intercepting an RGB image from a video stream shot by a binocular camera, transmitting the RGB image into a Mask R-CNN model, identifying the type and the position of a target object to be captured through the model, segmenting the target, and outputting a Mask region of the target.
3. The method according to claim 2, wherein in step 2-4), when performing Mask R-CNN network model optimization, the following steps are adopted:
(1) modifying the feature extraction network, and improving the speed of target identification by reducing network layers;
(2) modifying the RPN region proposal network: the anchor sizes are modified so that the model concentrates its computation on specified proportions, anchor boxes exceeding the size of the original image are eliminated, and the remainder are further screened by non-maximum suppression (NMS) to obtain the regions of interest;
(3) modifying the loss function; the Mask R-CNN loss function is

L = L_cls + L_box + L_mask

where L_cls is the classification loss function, L_box is the detection loss function and L_mask is the segmentation loss function; a boundary loss function is added to L_mask, using a distance loss to regularize the position, shape and continuity of the segmentation so that it lies closer to the target boundary; the optimized loss function L_mask-edge is

L_mask-edge = L_mask + α · L_edge

where L_edge is the boundary loss function, y denotes the annotated target edge, ŷ denotes the predicted boundary, α is a weight coefficient, B is the boundary of the segmentation result, and M_dist is the distance transform of the ground-truth segmentation boundary.
4. The method according to claim 1, wherein in step 1, calibration of the camera is performed to obtain three-dimensional spatial position information from two-dimensional image information, and the method specifically comprises the following steps:
1-1) making a calibration plate;
1-2) collecting images, changing the position and the angle of a calibration plate relative to a camera, and shooting a plurality of pictures of the calibration plate from different angles, different positions and different postures by using the camera to be calibrated;
1-3) detecting the calibration board corner points to obtain their pixel coordinate values, and calculating their physical coordinate values from the known checkerboard square size and the origin of the world coordinate system;
1-4) solving the internal parameters and the external parameters of the camera.
5. The method of claim 1, wherein in step 3, the DEPTH image acquired by the binocular camera is aligned with the RGB image, the average distance between a target Mask area pixel point in the RGB image and an infrared lens of the binocular camera is calculated to obtain the distance of the target object, the maximum width of the edge of the target object in the DEPTH image corresponding to the Mask area edge pixel is calculated as the width of the target object, and the calculated width value is used for judging whether the target is in a mechanical arm clamping range.
6. The method according to claim 5, characterized in that in step 3, it comprises in particular the steps of:
4-1) acquiring an RGB image and a DEPTH image;
4-2) aligning the DEPTH image and the RGB image, so that pixel points in the RGB image correspond to the DEPTH image target points one by one;
4-3) filtering the background information of the target image, and filtering the background area of the image according to the Mask area of the target object output in the step 2-4);
4-4) calculating a target distance, calculating the distance between a pixel point in a Mask area of a target object and an infrared lens of a binocular camera, and solving the average value of the distances as the distance between the target object and the camera lens;
4-5) calculating the target width, calculating the maximum width of the edge of the target object in the DEPTH image corresponding to the Mask region edge pixels as the width of the target object, and judging whether the target is in the grabbing range of the mechanical arm according to the calculated width value.
7. The method according to claim 1, wherein in step 4, when calculating the target pose, a space coordinate system is established with the arm shoulder of the robot arm as the coordinate origin by using a hand-eye calibration method, and coordinate values of the target object in a coordinate system with the camera as the origin are converted into coordinate values in the coordinate system, wherein the hand-eye calibration method is used for calibration, and the conversion relationship between the camera coordinate system and the robot end coordinate system can be determined.
8. The method according to claim 1, wherein in step 5, the robot is driven to reach the area where the target is located, the distance between the robot and the target object is adjusted so that the target object is within a capturable range of a gripper at the end of the robot arm, the angular postures of the joints are solved through inverse kinematics of the robot, and finally the target object is grabbed by controlling the rotation angles of the joints of the robot arm.
CN202111401496.XA 2021-11-19 2021-11-19 Robot target positioning and grabbing method based on Mask R-CNN and binocular camera Pending CN114155301A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111401496.XA CN114155301A (en) 2021-11-19 2021-11-19 Robot target positioning and grabbing method based on Mask R-CNN and binocular camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111401496.XA CN114155301A (en) 2021-11-19 2021-11-19 Robot target positioning and grabbing method based on Mask R-CNN and binocular camera

Publications (1)

Publication Number Publication Date
CN114155301A true CN114155301A (en) 2022-03-08

Family

ID=80457291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111401496.XA Pending CN114155301A (en) 2021-11-19 2021-11-19 Robot target positioning and grabbing method based on Mask R-CNN and binocular camera

Country Status (1)

Country Link
CN (1) CN114155301A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114359394A (en) * 2022-03-17 2022-04-15 季华实验室 Binocular vision positioning method and device, electronic equipment and storage medium
CN114387268A (en) * 2022-03-22 2022-04-22 中国长江三峡集团有限公司 Bolt looseness detection method and device
CN114683301A (en) * 2022-03-22 2022-07-01 四川正狐智慧科技有限公司 Inspection robot for pig farm, robot system and working method of system
CN115816460A (en) * 2022-12-21 2023-03-21 苏州科技大学 Manipulator grabbing method based on deep learning target detection and image segmentation
CN116758136A (en) * 2023-08-21 2023-09-15 杭州蓝芯科技有限公司 Real-time online identification method, system, equipment and medium for cargo volume
CN116758136B (en) * 2023-08-21 2023-11-10 杭州蓝芯科技有限公司 Real-time online identification method, system, equipment and medium for cargo volume

Similar Documents

Publication Publication Date Title
CN114155301A (en) Robot target positioning and grabbing method based on Mask R-CNN and binocular camera
CN108555908B (en) Stacked workpiece posture recognition and pickup method based on RGBD camera
CN107767423B (en) mechanical arm target positioning and grabbing method based on binocular vision
CN112070818B (en) Robot disordered grabbing method and system based on machine vision and storage medium
CN107471218B (en) Binocular vision-based hand-eye coordination method for double-arm robot
CN110580725A (en) Box sorting method and system based on RGB-D camera
CN110211180A (en) A kind of autonomous grasping means of mechanical arm based on deep learning
CN108416428B (en) Robot vision positioning method based on convolutional neural network
CN111199556B (en) Indoor pedestrian detection and tracking method based on camera
CN111862201A (en) Deep learning-based spatial non-cooperative target relative pose estimation method
CN110425996A (en) Workpiece size measurement method based on binocular stereo vision
CN110378325A (en) A kind of object pose recognition methods during robot crawl
CN112560704B (en) Visual identification method and system for multi-feature fusion
CN110733039A (en) Automatic robot driving method based on VFH + and vision auxiliary decision
CN113643280A (en) Plate sorting system and method based on computer vision
CN115816460A (en) Manipulator grabbing method based on deep learning target detection and image segmentation
CN113822810A (en) Method for positioning workpiece in three-dimensional space based on machine vision
CN114750154A (en) Dynamic target identification, positioning and grabbing method for distribution network live working robot
CN114882109A (en) Robot grabbing detection method and system for sheltering and disordered scenes
CN114494463A (en) Robot sorting method and device based on binocular stereoscopic vision technology
Li et al. A mobile robotic arm grasping system with autonomous navigation and object detection
CN115861780B (en) Robot arm detection grabbing method based on YOLO-GGCNN
Gao et al. An automatic assembling system for sealing rings based on machine vision
CN114187312A (en) Target object grabbing method, device, system, storage medium and equipment
CN117021099A (en) Human-computer interaction method oriented to any object and based on deep learning and image processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination