CN114952809B - Workpiece identification and pose detection method, system and mechanical arm grabbing control method - Google Patents

Workpiece identification and pose detection method, system and mechanical arm grabbing control method

Info

Publication number
CN114952809B
CN114952809B (application CN202210732860.9A)
Authority
CN
China
Prior art keywords
workpiece
image
pose
scene
target
Prior art date
Legal status
Active
Application number
CN202210732860.9A
Other languages
Chinese (zh)
Other versions
CN114952809A (en)
Inventor
徐刚
赵有港
崔玥
周翔
许允款
曾晶
肖江剑
Current Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Original Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Priority date
Filing date
Publication date
Application filed by Ningbo Institute of Material Technology and Engineering of CAS filed Critical Ningbo Institute of Material Technology and Engineering of CAS
Priority to CN202210732860.9A priority Critical patent/CN114952809B/en
Publication of CN114952809A publication Critical patent/CN114952809A/en
Application granted granted Critical
Publication of CN114952809B publication Critical patent/CN114952809B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/08Programme-controlled manipulators characterised by modular constructions
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J13/00Controls for manipulators
    • B25J13/08Controls for manipulators by means of sensing devices, e.g. viewing or touching devices
    • B25J13/087Controls for manipulators by means of sensing devices, e.g. viewing or touching devices for sensing other physical parameters, e.g. electrical or chemical properties
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J15/00Gripping heads and other end effectors
    • B25J15/08Gripping heads and other end effectors having finger members
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1602Programme controls characterised by the control system, structure, architecture
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1661Programme controls characterised by programming, planning systems for manipulators characterised by task planning, object-oriented languages
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a workpiece identification and pose detection method and system, and a grabbing control method for a mechanical arm. The workpiece identification and pose detection method comprises the following steps: acquiring a 2D image and a 3D point cloud image of a scene to be identified; identifying a target workpiece in the scene to be identified based on the 2D image, and performing instance segmentation based on a mapping relation to obtain a point cloud area corresponding to the target workpiece; and performing pose detection in the point cloud area based on a deep learning algorithm to obtain pose information of the target workpiece. In a scene where small workpieces are scattered and stacked for grabbing, the workpiece recognition and pose detection method provided by the invention avoids the difficult problems of cross-modal data feature extraction and matching as well as excessively complex data processing, and, by combining the 2D image and the 3D point cloud image, provides an optimized solution that effectively improves recognition and grabbing efficiency for workpiece-stacking recognition and grabbing application scenes.

Description

Workpiece identification and pose detection method, system and mechanical arm grabbing control method
Technical Field
The invention relates to the technical field of image recognition and mechanical control, in particular to a workpiece recognition and pose detection method and system and a grabbing control method of a mechanical arm.
Background
Robotic grasping techniques based on two-dimensional/three-dimensional vision have been widely used in simple scenarios such as logistics and express delivery, warehouse handling and depalletizing, and vision guidance has strengthened the robot's ability to perceive complex environments. In an industrial grabbing scene, a two-dimensional image can provide compact and rich texture information, and the position (two-dimensional coordinates) of the workpiece to be grabbed can be obtained through image processing and recognition, but depth information cannot be obtained; a three-dimensional image can provide distance information of the grabbing scene, but cannot provide abundant detail information, which reduces grabbing precision. The two types of data are strongly complementary, and fusing the data of the two modalities enables a more comprehensive perception of the workpiece grabbing scene. In recent years, as research on 6D pose estimation algorithms for workpieces has increased and the computing capability of equipment has improved, robot grabbing systems have made breakthroughs in related fields such as disordered scattered stacking of workpieces, disordered workpiece assembly, and flexible grabbing.
The recognition and pose detection of the target object are key preconditions of the robot grabbing task. Since the early days of computer vision, 6D pose detection and estimation of a target, described by a translation vector t ∈ R³ and a rotation matrix R ∈ SO(3) with respect to a fixed coordinate system of a given reference frame, has been a long-standing challenge and an open research field.
Because of the diversity of objects in the real world, potential object symmetries, clutter and occlusion in the scene, and varying illumination conditions, the task of target 6D pose detection and estimation is challenging. Its core steps are to first obtain the centroid position coordinates (x, y, z) of the target object in the camera coordinate system by various algorithms, then match the model to the centroid position to obtain the rotational pose (Rx, Ry, Rz), convert it into the position of the target object under the base coordinates of the mechanical arm based on the hand-eye calibration matrix, and finally control the mechanical arm to move and perform the grabbing operation. From a technical point of view, how to skillfully fuse three-dimensional point cloud data and two-dimensional images, which belong to data of different modalities, how to extract reliable geometric features given the scattered stacking of workpieces in the grabbing scene, and how to finally identify the graspable workpieces in the scene and obtain their pose information, is a research direction that researchers at home and abroad are actively exploring.
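For illustration of the coordinate-frame chain just described (camera frame → hand-eye calibration → robot base frame), a minimal Python sketch is given below. It is not part of the claimed method; the numeric poses and the matrix names T_cam_obj and T_base_cam are hypothetical placeholders.

```python
import numpy as np

def pose_to_matrix(t, rpy):
    """Build a 4x4 homogeneous transform from a translation (x, y, z)
    and rotation angles (Rx, Ry, Rz) in radians."""
    rx, ry, rz = rpy
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx),  np.cos(rx)]])
    Ry = np.array([[ np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz),  np.cos(rz), 0],
                   [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx          # rotation matrix R ∈ SO(3)
    T[:3, 3] = t                      # translation vector t ∈ R³
    return T

# Hypothetical values: workpiece pose in the camera frame and the hand-eye
# calibration matrix (camera frame expressed in the robot base frame).
T_cam_obj = pose_to_matrix(t=[0.12, -0.03, 0.45], rpy=[0.0, 0.1, 1.2])
T_base_cam = pose_to_matrix(t=[0.50, 0.00, 0.80], rpy=[np.pi, 0.0, 0.0])

# Chain the transforms: pose of the workpiece under the robot base coordinates.
T_base_obj = T_base_cam @ T_cam_obj
print(T_base_obj[:3, 3])              # grasp position in the base frame
```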
The existing workpiece recognition and pose detection methods can be divided into the following two types according to the type of input image data: methods based on 2D visual data (taking RGB or RGB-D data as input) and methods based on 3D visual data (taking point cloud data as input). Simple recognition methods based on 2D data lack scene depth information, so they can often only grab planar objects and cannot handle stacked scenes; therefore, workpiece recognition and pose detection methods based on 3D visual data are becoming the mainstream. Detection methods based on 3D visual data can be roughly classified into the following two types according to their implementation principles: template matching methods and deep learning methods. The first type, template matching, is generally based on the PPF (Point Pair Feature) algorithm (for example, Drost B, Ulrich M, Navab N, et al. Model globally, match locally: Efficient and robust 3D object recognition [C]. IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2010: 998-1005.). This algorithm is a description method based on point pair features: it extracts point pair features and trains a model of the target from the target's 3D model data, detects 3D feature points in the target scene and matches them using the PPF feature descriptors, obtains an initial pose estimate and performs iterative voting, and finally applies a refinement operation with the ICP algorithm to obtain and output a more accurate pose. The greatest defect of the template matching method is mismatching: when the workpiece is too simple and its features are not distinctive, the method often produces a wrong identification result. The second type, the deep learning method, generates a simulation dataset in a simulation scene, learns data features in a network, and finally obtains pose detection results on a test dataset. For example, the literature (Dong Z, Liu S, Zhou T, et al. PPR-Net: Point-wise pose regression network for instance segmentation and 6D pose estimation in bin-picking scenarios [C]. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE Press, 2019: 1773-1780.) proposes a novel point-wise pose regression network, PPR-Net (Point-wise Pose Regression Network). This method, the winner of the IROS 2019 "bin-picking pose estimation challenge", takes PointNet++ as the backbone network, performs 6D pose estimation for each point of the point cloud of the object instance to which it belongs, and then spatially averages the predicted poses of each identified instance based on a clustering method to obtain the final pose hypothesis. However, this method has the disadvantage that processing the 3D point cloud image of the whole workpiece scene is inefficient, and analysis and detection take a long time.
Therefore, how to provide a more efficient target object recognition and pose detection method suitable for scenes in which small workpieces are stacked and occluded is an urgent problem to be solved.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a workpiece identification and pose detection method, a system and a grabbing control method of a mechanical arm.
In order to achieve the purpose of the invention, the technical scheme adopted by the invention comprises the following steps:
in a first aspect, the present invention provides a method for workpiece recognition and pose detection, including:
s1, acquiring a 2D image and a 3D point cloud image in a scene to be identified;
s2, identifying a target workpiece in the scene to be identified based on the 2D image, and performing instance segmentation on an area where the target workpiece is located in the 3D point cloud image based on a mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud area corresponding to the target workpiece;
and S3, based on a deep learning algorithm, performing pose detection in the point cloud area, and acquiring pose information of the target workpiece.
In a second aspect, the present invention also provides a workpiece recognition and pose detection system, including:
the image acquisition module is used for acquiring a 2D image and a 3D point cloud image in a scene to be identified;
the region acquisition module is used for identifying a target workpiece in the scene to be identified based on the 2D image, and carrying out instance segmentation on a region where the target workpiece is located in the 3D point cloud image based on a mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud region corresponding to the target workpiece;
and the pose acquisition module is used for carrying out pose detection in the point cloud area based on a deep learning algorithm to acquire pose information of the target workpiece.
In a third aspect, the present invention further provides a method for controlling gripping of a mechanical arm, including:
acquiring target workpieces and pose information of the target workpieces in a scene to be identified based on the workpiece identification and pose detection methods;
and selecting the target workpiece to be grabbed, and controlling the mechanical arm to carry out grabbing action based on the pose information.
Based on the technical scheme, compared with the prior art, the invention has the beneficial effects that:
the workpiece recognition and pose detection method provided by the invention avoids the difficult problems of cross-mode data feature extraction and matching in the small workpiece scattered stacking grabbing scene, avoids excessively complex data processing calculation, and provides an optimized solution for the workpiece stacking recognition and grabbing application scene in the direction of effectively improving the recognition efficiency and grabbing efficiency by combining the 2D image and the 3D point cloud image.
The above description is only an overview of the technical solutions of the present invention. In order to enable those skilled in the art to understand the technical means of the present application more clearly and to implement the invention according to the content of the specification, preferred embodiments of the present invention are described below with reference to the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;
FIG. 2 is a partial flow diagram of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;
FIG. 3 is a partial flow diagram of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;
FIG. 4 is a partial flow diagram of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;
FIG. 5 is a schematic diagram of a workpiece recognition and pose detection system according to an exemplary embodiment of the present invention;
FIG. 6 is a schematic diagram of a simulation data set generating system according to an exemplary embodiment of the present invention;
FIG. 7 is a schematic diagram of a 2D/3D deep learning network according to an exemplary embodiment of the present invention;
fig. 8 is an exemplary diagram of the recognition and detection effect of the workpiece recognition and pose detection method according to an exemplary embodiment of the present invention.
Detailed Description
In view of the shortcomings in the prior art, the inventor of the present invention has long studied and practiced in a large number of ways to propose the technical scheme of the present invention. The technical scheme, the implementation process, the principle and the like are further explained as follows.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
Referring to fig. 1 to fig. 4, an embodiment of the present invention provides a method for workpiece recognition and pose detection, which specifically includes steps S1 to S3 as follows:
s1, acquiring a 2D image and a 3D point cloud image in a scene to be identified.
Specifically, in this embodiment, the image information includes a 2D image and a 3D image: a 2D visible-light camera is used to photograph the workpieces in the stacked scene to obtain the 2D image, and a 3D camera is used to photograph the workpieces in the stacked scene to obtain the 3D image.
Thus, in some embodiments, step S1 may specifically include: acquiring the 2D image of the scene to be identified by using a 2D camera, and acquiring the 3D image information of the scene to be identified by using a 3D camera.
In some embodiments, the scene to be identified may preferably include a workpiece stacking scene, and more preferably a scene in which workpieces are stacked randomly. The method is particularly suitable for application scenes in which small workpieces are scattered and stacked, for example small workpieces scattered irregularly on a tray or in a container, such as common semi-finished products stacked out of order on a tray in industrial production, or material-handling scenes in which workpieces such as spanners are scattered and stacked in a material frame.
S2, identifying a target workpiece in the scene to be identified based on the 2D image, and performing instance segmentation on the region where the target workpiece is located in the 3D point cloud image based on the mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud region corresponding to the target workpiece. That is, the target workpiece in the scene is identified based on the 2D image, and instance segmentation of the identified target workpiece region is performed by utilizing the mapping relation between the 2D image and the 3D point cloud image.
Specifically, as shown in fig. 2, the step S2 may include steps S21 to S24:
s21, collecting a plurality of 2D images, marking the outline sealing area of the workpiece, and manufacturing a training data set.
Specifically, in this embodiment, the contour data of the workpieces may be obtained from the 2D image, the contours of the same upper surface of a workpiece are marked as the same class, the positions of the marked workpiece regions in the global image are recorded, and a number of images are collected and marked to produce the training dataset. The labeling can be performed manually, or by combining manual labeling with machine labeling, for example in adversarial learning.
S22, building a deep learning 2D image target segmentation network model, and training the network model based on the training data set.
Specifically, in this embodiment, a 2D image target segmentation network model based on a Mask R-CNN convolutional neural network may be built, and training learning may be performed on the network model based on the fabricated training data set.
S23, importing the 2D image acquired in real time into the trained network model for recognition, and acquiring the region position of the recognition workpiece.
Specifically, in this embodiment, the trained 2D image target segmentation network model may be used: the 2D image acquired in real time is imported into the network model, and Mask R-CNN performs pixel-level segmentation of the objects identified in the 2D image to obtain the region positions in the segmented 2D image.
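As a minimal illustration of this inference step, the following sketch assumes a torchvision Mask R-CNN fine-tuned on the 2D workpiece training set; the checkpoint file name, class count and score threshold are assumptions for illustration and are not specified by the patent.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Assumed: a Mask R-CNN fine-tuned on the 2D workpiece training set (hypothetical checkpoint).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)  # background + workpiece
model.load_state_dict(torch.load("maskrcnn_workpiece.pth", map_location="cpu"))
model.eval()

def segment_workpieces(image_rgb, score_thresh=0.7):
    """Run instance segmentation on one RGB frame and return binary masks
    and bounding boxes of the detected workpieces."""
    with torch.no_grad():
        pred = model([to_tensor(image_rgb)])[0]
    keep = pred["scores"] > score_thresh
    masks = (pred["masks"][keep, 0] > 0.5).numpy()   # (K, H, W) pixel-level instance masks
    boxes = pred["boxes"][keep].numpy()              # (K, 4) region positions in the image
    return masks, boxes
```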
And S24, mapping the identified 2D workpiece area position to a 3D point cloud scene area, and performing instance segmentation on a target workpiece area in the 3D point cloud scene.
Specifically, in this embodiment, the mapping relationship between the 2D image and the 3D point cloud may be used to perform point cloud instance segmentation on the same location area in the 3D point cloud data scene based on the target workpiece area after the instance segmentation, so as to obtain a point cloud set for identifying the workpiece.
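A minimal sketch of this mapping is given below, under the assumption that the 3D camera returns an organized point cloud aligned pixel-for-pixel with the 2D image; if the two sensors have different resolutions, a registration/projection step would be required instead.

```python
import numpy as np

def mask_to_point_cloud(cloud_xyz, mask):
    """Extract the point cloud region of one workpiece instance.

    cloud_xyz: (H, W, 3) organized point cloud aligned with the 2D image,
               with NaN for pixels without a valid depth measurement.
    mask:      (H, W) boolean instance mask from the 2D segmentation.
    Returns an (M, 3) array of points belonging to the workpiece.
    """
    points = cloud_xyz[mask]                      # select pixels inside the mask
    valid = ~np.isnan(points).any(axis=1)         # drop pixels without depth
    return points[valid]

# Usage: one point cloud region per detected instance.
# instance_clouds = [mask_to_point_cloud(cloud_xyz, m) for m in masks]
```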
Thus, in some embodiments, step S2 specifically comprises the steps of:
and importing the 2D image into a target segmentation model for recognition, and acquiring the region position of the target workpiece.
Mapping the region position to a corresponding region in a 3D point cloud image, and performing instance segmentation on the region where the target workpiece is located in the 3D point cloud image.
In some embodiments, the training method of the target segmentation model specifically includes the following steps:
a 2D training dataset is provided, the 2D training dataset comprising a plurality of 2D images for training and corresponding marker information thereof, the marker information being indicative of at least a contour closed region of a workpiece in the 2D images.
And constructing a target segmentation initial model, and training the target segmentation initial model based on the 2D training data set to obtain the target segmentation model.
And S3, based on a deep learning algorithm, performing pose detection in the point cloud area, and acquiring pose information of the target workpiece.
Specifically, in this embodiment, the 2D image may be segmented and the result mapped to the 3D point cloud to obtain the point cloud set of the identified workpiece, and the pose of the workpiece may be detected based on a PPR-Net deep learning network that has been built and trained, so as to obtain the pose information of the workpiece.
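The post-processing of such a point-wise regression network can be illustrated by the following sketch, which is an assumption for illustration rather than the patented implementation: every point of the segmented cloud votes a candidate pose, the votes are clustered spatially, and each cluster is averaged into one pose hypothesis. Rotation averaging is simplified here to a per-cluster mean of Euler angles; a production system would average on SO(3).

```python
import numpy as np
from sklearn.cluster import MeanShift

def aggregate_pointwise_poses(pred_translations, pred_eulers, bandwidth=0.01):
    """Cluster per-point pose predictions and average them per cluster.

    pred_translations: (N, 3) translation predicted for each input point [m].
    pred_eulers:       (N, 3) rotation (Rx, Ry, Rz) predicted for each point [rad].
    Returns a list of (translation, euler) pose hypotheses, one per cluster.
    """
    labels = MeanShift(bandwidth=bandwidth).fit_predict(pred_translations)
    hypotheses = []
    for k in np.unique(labels):
        idx = labels == k
        t = pred_translations[idx].mean(axis=0)      # spatial average of the cluster
        r = pred_eulers[idx].mean(axis=0)            # simplified rotation average
        hypotheses.append((t, r))
    return hypotheses
```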
With continued reference to fig. 3, step S3 may specifically include the following steps S31-S34:
s31, constructing a deep learning training simulation data set generation system based on V-REP.
Specifically, in this embodiment, a deep learning training simulation dataset generating system may be built based on the V-REP simulation software; the built simulation system is shown in fig. 6. Building the simulation dataset generating system includes building a Kinect simulated vision sensor, importing the workpiece 3D model, importing the material frame 3D model, writing the workpiece dropping and image data acquisition programs, and the like.
S32, manufacturing and generating a simulation 3D training data set.
And S33, building a deep learning 3D pose detection neural network model, and training the network model based on the training data set.
Specifically, in this embodiment, a 3D pose detection neural network model based on a PPR-Net deep learning network may be built, and training and learning may be performed on the network model based on the fabricated simulation training data set.
S34, importing the 3D point cloud image after the example segmentation into the trained network model to detect the pose of the workpiece, and acquiring the pose information of the workpiece.
Specifically, as shown in fig. 4, in the present embodiment, step S32 may include the following steps S321 to S326:
s321, setting that n workpieces exist in a scene.
In a preferred embodiment, the number of workpieces may be, for example, n=27.
S322, based on the domain randomization idea, workpieces are dropped randomly from a certain position in the working area (an integer counter i, initially set to i = 0, records the number of dropped workpieces), and different workpieces are given different color information.
S323, acquiring and storing depth images and rgb images in the scene based on the simulation vision sensor.
In a preferred embodiment, the simulated vision sensor may be a Kinect depth camera.
S324, recording and storing pose information of each workpiece falling acquired in the V-REP.
S325, carrying out visualization degree analysis on each workpiece based on the collected rgb image information, and recording the visualization degree data.
In a preferred embodiment, the visualization degree analysis method may be as follows. A visibility degree v ∈ [0, 1] is introduced for each workpiece; this parameter reflects the degree of occlusion of the predicted object: the workpiece is completely invisible when v = 0, completely unoccluded when v = 1, and so on. The visibility degree of a workpiece in the scene is:
v = N / N_max
where N is the color-region area value of the workpiece instance and N_max is the largest color-region area value among the workpieces in the whole image.
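For illustration only, the visibility computation can be sketched as a count of color-coded pixels per instance in the rendered rgb image; the color map and image arrays are hypothetical, and the patent itself only specifies the ratio v = N/N_max.

```python
import numpy as np

def visibility_degrees(rgb, instance_colors):
    """Compute v = N / N_max for each workpiece from a color-coded render.

    rgb:             (H, W, 3) simulated image in which every workpiece
                     instance was rendered with its own unique color.
    instance_colors: list of (r, g, b) tuples, one per dropped workpiece.
    """
    areas = np.array([
        np.all(rgb == np.array(color), axis=-1).sum()   # N: visible pixel area of this instance
        for color in instance_colors
    ])
    n_max = areas.max()                                  # N_max: largest visible area in the scene
    return areas / n_max if n_max > 0 else areas.astype(float)
```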
S326, if the integer i is less than n, steps S322 to S326 are repeated; if the integer i = n, dropping of workpieces is stopped and the dataset label file is produced.
More specifically, in this embodiment, the 2D image is segmented and the result is mapped to the 3D point cloud to obtain the point cloud set of the identified workpiece, and the pose of the workpiece is detected based on the PPR-Net deep learning network that has been built and trained, so as to obtain the pose information of the workpiece. A schematic diagram of the 2D/3D deep learning network structure based on the Mask R-CNN convolutional neural network and the PPR-Net deep learning network is shown in fig. 7, and schematic diagrams of the 2D image detection and instance segmentation effect and of the 3D point cloud pose detection effect are shown in fig. 8.
As shown in fig. 5, a deep learning training simulation data set generating system disclosed in the present invention may include:
and the image acquisition device is used for acquiring the image information of the workpieces in the stacking scene.
The image acquisition device comprises a 2D image acquisition unit and a 3D image acquisition unit, wherein the 2D image acquisition unit is used for taking a picture of a workpiece in a stacked scene by adopting a 2D camera to obtain a 2D image. The 3D image acquisition unit is used for photographing the workpieces in the stacked scene by adopting a 3D camera to obtain a 3D image.
And the area limiting device is used for physically limiting the falling range of the workpiece according to the field of view of the image acquisition device.
The region limiting device mainly comprises a material frame setting unit and a camera setting unit. The material frame setting unit is used for drawing and importing the material frame 3D model and adjusting it to a suitable position so as to physically limit the falling range of the workpieces; the camera setting unit is used for adjusting the intrinsic and extrinsic parameters of the camera to ensure consistency with the camera parameters of the real scene, so that an effective data set is generated.
And the workpiece pose acquisition device is used for recording the current workpiece pose information after the random falling action of the workpiece is completed.
And the workpiece visual degree analysis device is used for calculating and analyzing the visual degree of each workpiece in the simulation scene.
The workpiece visual degree analysis device mainly comprises a color pixel acquisition unit and a visual degree calculation unit, wherein the color pixel acquisition unit is used for counting the area value of each divided example color pixel in a scene, and the visual degree calculation unit is used for calculating the visual degree by using the statistics value provided by the color pixel acquisition unit and outputting the visual degree value.
The data set label integrating device is used for integrating information such as the image information, the workpiece pose information and the visualization degree, and producing the dataset labels.
Thus, in some embodiments, step S3 may specifically comprise the steps of:
and importing the point cloud area into a pose detection model to detect the pose of the target workpiece, and acquiring pose information of the target workpiece.
In some embodiments, the training method of the pose detection model may include the steps of:
the method comprises the steps of providing a 3D training data set, wherein the 3D training data set at least comprises a 3D training image, a corresponding workpiece pose label and a visualization degree label.
And constructing a pose detection initial model, and training the pose detection initial model based on the 3D training data set to obtain the pose detection model.
In some embodiments, the 3D training data set is generated by simulation of a simulation data set generation system.
In some embodiments, the simulation generation may specifically include the steps of:
and constructing a simulation scene, and setting n virtual workpieces in the simulation scene.
Based on a domain randomization method, i virtual workpieces are randomly dropped from a selected position in the working area of the simulation scene, and different virtual workpieces are given different color information, where i is incremented iteratively starting from zero (for example, increasing by 1 each time).
Depth images and rgb images in the scene are acquired and saved based on the simulated vision sensor.
And recording and storing pose information of each virtual workpiece falling in the simulation scene as the workpiece pose label.
And carrying out visualization degree analysis on each virtual workpiece based on the acquired depth image and/or the rgb image, and recording the visualization degree data as the visualization degree label.
And stopping dropping the virtual workpiece when iterating to the integer i not smaller than n, and generating the 3D training data set based on the workpiece pose label and the visualization degree label.
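The generation loop above can be summarized by the following sketch. The simulator calls (drop_workpiece, random_drop_pose, capture_depth_rgb, get_workpiece_poses, save_image) are hypothetical wrappers around the V-REP remote API, named here only for illustration, and the helper visibility_degrees is the one sketched earlier.

```python
import json

def generate_dataset(sim, n_workpieces, colors, out_path="labels.json"):
    """Drop workpieces one by one in the simulation scene and record
    images, ground-truth poses and visibility degrees as dataset labels."""
    samples = []
    i = 0
    while i < n_workpieces:
        sim.drop_workpiece(position=sim.random_drop_pose(), color=colors[i])  # domain randomization
        depth, rgb = sim.capture_depth_rgb()            # simulated Kinect sensor
        poses = sim.get_workpiece_poses()               # ground-truth 6D poses from the simulator
        vis = visibility_degrees(rgb, colors[: i + 1])  # v = N / N_max per instance
        samples.append({
            "depth_file": sim.save_image(depth, f"depth_{i:04d}.png"),
            "rgb_file": sim.save_image(rgb, f"rgb_{i:04d}.png"),
            "poses": [p.tolist() for p in poses],
            "visibility": vis.tolist(),
        })
        i += 1
    with open(out_path, "w") as f:
        json.dump(samples, f, indent=2)                 # dataset label file
```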
In some embodiments, the simulation dataset generation system may include:
and the image acquisition device is used for acquiring the 3D training image of the virtual workpiece in the simulation scene.
And the area limiting device is used for limiting the falling range of the virtual workpiece according to the visual field size acquired by the 3D training image.
And the workpiece pose acquisition device is used for recording pose information of the target workpiece at the moment after the random dropping action of the virtual workpiece is completed.
And the workpiece visual degree analysis device is used for calculating and analyzing the visual degree of each virtual workpiece in the simulation scene.
The data set label integrating device is used for integrating the 3D training image, the workpiece pose label and the visualization degree label to generate the 3D training data set.
Based on the above method, another embodiment of the present invention further provides a workpiece recognition and pose detection system, including:
and the image acquisition module is used for acquiring the 2D image and the 3D point cloud image in the scene to be identified.
The region acquisition module is used for identifying a target workpiece in the scene to be identified based on the 2D image, and carrying out instance segmentation on a region where the target workpiece is located in the 3D point cloud image based on a mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud region corresponding to the target workpiece.
And the pose acquisition module is used for carrying out pose detection in the point cloud area based on a deep learning algorithm to acquire pose information of the target workpiece.
Similarly, the embodiment of the invention also provides an electronic device which can be applied to the system, and the electronic device comprises a processor and a memory, wherein the memory stores a computer program, and the computer program executes the steps of the workpiece identification and pose detection method.
Meanwhile, the embodiment of the invention also provides a readable storage medium, in which a computer program is stored, the computer program executing the steps of the workpiece identification and pose detection method.
The above embodiment provides a method and a system for workpiece recognition and pose detection, and a simulation data set generation system applied to the method and the system, and as a further application of the method and the system, another embodiment of the invention also provides a method for controlling grabbing of a mechanical arm, which includes the following steps:
and acquiring the target workpiece and the pose information thereof in the scene to be identified based on the workpiece identification and pose detection method in any embodiment.
And selecting the target workpiece to be grabbed, and controlling the mechanical arm to carry out grabbing action based on the pose information.
That is: the pose information of the workpiece is obtained, and the mechanical arm is controlled to carry out the grabbing operation.
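A minimal control-loop sketch under stated assumptions is given below: the candidate-ranking rule (highest visibility first), the 10 cm approach offset and the robot-driver functions move_to and close_gripper are hypothetical and are not specified by the patent.

```python
import numpy as np

def grasp_best_candidate(candidates, T_base_cam, robot):
    """Pick the most exposed workpiece and command the arm to grasp it.

    candidates: list of dicts {"T_cam_obj": 4x4 pose in the camera frame,
                               "visibility": v in [0, 1]}.
    T_base_cam: 4x4 hand-eye calibration matrix.
    robot:      hypothetical driver exposing move_to(pose) and close_gripper().
    """
    # Rank by visibility so the least occluded workpiece is grasped first.
    target = max(candidates, key=lambda c: c["visibility"])
    T_base_obj = T_base_cam @ target["T_cam_obj"]       # pose in robot base coordinates

    approach = T_base_obj.copy()
    approach[2, 3] += 0.10                               # approach 10 cm above the grasp point
    robot.move_to(approach)
    robot.move_to(T_base_obj)
    robot.close_gripper()
    robot.move_to(approach)                              # retreat after grasping
```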
It should be noted that how to efficiently and accurately obtain the pose information of a workpiece, and how to plan the path and/or actions of the mechanical arm according to that pose information, are not the focus of the present invention; the related technical solutions are already known from numerous existing technologies, so that a person skilled in the art can combine or adapt them without difficulty. It can be understood that any specific mechanical arm control method combined with the workpiece recognition and pose detection method provided by the present invention falls within the protection scope of the present invention.
According to the workpiece identification and pose detection method and the deep learning training simulation dataset generation system disclosed by the invention, in a scene where small workpieces are scattered and stacked for grabbing, the difficult problems of cross-modal data feature extraction and matching are avoided, and excessively complex data processing is avoided at the same time. By combining the 2D image and the 3D point cloud data, an optimized solution that effectively improves recognition and grabbing efficiency is provided for workpiece-stacking recognition and grabbing application scenes; meanwhile, conventional manual labeling of samples is avoided because the training dataset is produced and generated automatically, which greatly improves working efficiency.
While the invention has been described with reference to an illustrative embodiment, it will be understood by those skilled in the art that various other changes, omissions and/or additions may be made and substantial equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims.
It should be understood that the above embodiments are merely for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and implement the same according to the present invention without limiting the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.

Claims (5)

1. The workpiece recognition and pose detection method is characterized by comprising the following steps of:
s1, acquiring a 2D image in a scene to be identified by using a 2D camera, and acquiring a 3D point cloud image in the scene to be identified by using a 3D camera; the scene to be identified comprises a workpiece unordered stacking scene;
s2, importing the 2D image into a target segmentation model for recognition, and acquiring the region position of the target workpiece; mapping the region position to a corresponding region in a 3D point cloud image, and performing instance segmentation on a region where a target workpiece is located in the 3D point cloud image to obtain a point cloud region corresponding to the target workpiece;
s3, based on a deep learning algorithm, the point cloud area is imported into a pose detection model to detect the pose of the target workpiece, and pose information of the target workpiece is obtained;
the training method of the pose detection model comprises the following steps:
providing a 3D training data set, wherein the 3D training data set at least comprises a 3D training image, a corresponding workpiece pose label and a visualization degree label;
constructing a pose detection initial model, and training the pose detection initial model based on the 3D training data set to obtain the pose detection model;
wherein the 3D training data set is generated by simulation of a simulation data set generation system;
the simulation generation specifically comprises the following steps:
constructing a simulation scene, and setting n virtual workpieces in the simulation scene;
based on a domain randomization method, randomly dropping i virtual workpieces from a selected position in a working area of the simulation scene, and endowing different virtual workpieces with different color information;
acquiring and saving a depth image and an rgb image in a scene based on a simulation vision sensor;
recording and storing pose information of each virtual workpiece falling in the simulation scene as the workpiece pose label;
based on the depth image and the rgb image, carrying out visualization degree analysis on each virtual workpiece, and recording visualization degree data as the visualization degree label, wherein the visualization degree data v ∈ [0,1] reflects the shielding degree of the virtual workpiece, the workpiece being completely invisible when v = 0 and completely unoccluded when v = 1;
and stopping dropping the virtual workpiece when iterating to the integer i not smaller than n, and generating the 3D training data set based on the workpiece pose label and the visualization degree label.
2. The method for workpiece recognition and pose detection according to claim 1, wherein the training method of the target segmentation model comprises:
providing a 2D training dataset comprising a plurality of 2D images for training and corresponding marker information thereof, the marker information being indicative of at least a silhouette-enclosed region of a workpiece in the 2D images;
and constructing a target segmentation initial model, and training the target segmentation initial model based on the 2D training data set to obtain the target segmentation model.
3. The method of workpiece recognition and pose detection according to claim 1, wherein said simulation dataset generation system comprises:
the image acquisition device is used for acquiring a 3D training image of the virtual workpiece in the simulation scene;
the region limiting device is used for obtaining the size of the visual field according to the 3D training image and limiting the falling range of the virtual workpiece;
the workpiece pose acquisition device is used for recording pose information of the target workpiece at the moment after the random dropping action of the virtual workpiece is completed;
the workpiece visual degree analysis device is used for calculating and analyzing the visual degree of each virtual workpiece in the simulation scene;
the data set label integrating device is used for integrating the 3D training image, the workpiece pose label and the visualization degree label to generate the 3D training data set.
4. A workpiece recognition and pose detection system for implementing the workpiece recognition and pose detection method according to any of claims 1-3, characterized by comprising:
the image acquisition module is used for acquiring the 2D image in a scene to be identified by using a 2D camera, and acquiring the 3D image information in the scene to be identified by using a 3D camera; the scene to be identified comprises a workpiece unordered stacking scene;
the region acquisition module is used for importing the 2D image into a target segmentation model for recognition and acquiring the region position of the target workpiece; mapping the region position to a corresponding region in a 3D point cloud image, and performing instance segmentation on a region where a target workpiece is located in the 3D point cloud image to obtain a point cloud region corresponding to the target workpiece;
and the pose acquisition module is used for carrying out pose detection in the point cloud area based on a deep learning algorithm to acquire pose information of the target workpiece.
5. The grabbing control method of the mechanical arm is characterized by comprising the following steps of:
acquiring a target workpiece and pose information thereof in a scene to be identified based on the workpiece identification and pose detection method according to any one of claims 1-3;
and selecting the target workpiece to be grabbed, and controlling the mechanical arm to carry out grabbing action based on the pose information.
CN202210732860.9A 2022-06-24 2022-06-24 Workpiece identification and pose detection method, system and mechanical arm grabbing control method Active CN114952809B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210732860.9A CN114952809B (en) 2022-06-24 2022-06-24 Workpiece identification and pose detection method, system and mechanical arm grabbing control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210732860.9A CN114952809B (en) 2022-06-24 2022-06-24 Workpiece identification and pose detection method, system and mechanical arm grabbing control method

Publications (2)

Publication Number Publication Date
CN114952809A CN114952809A (en) 2022-08-30
CN114952809B true CN114952809B (en) 2023-08-01

Family

ID=82965231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210732860.9A Active CN114952809B (en) 2022-06-24 2022-06-24 Workpiece identification and pose detection method, system and mechanical arm grabbing control method

Country Status (1)

Country Link
CN (1) CN114952809B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359112B (en) * 2022-10-24 2023-01-03 爱夫迪(沈阳)自动化科技有限公司 Stacking control method of high-level material warehouse robot
CN116363085B (en) * 2023-03-21 2024-01-12 江苏共知自动化科技有限公司 Industrial part target detection method based on small sample learning and virtual synthesized data
CN116320357A (en) * 2023-05-17 2023-06-23 浙江视觉智能创新中心有限公司 3D structured light camera system, method, electronic device and readable storage medium
CN116330306B (en) * 2023-05-31 2023-08-15 之江实验室 Object grabbing method and device, storage medium and electronic equipment
CN117095002B (en) * 2023-10-19 2024-02-06 深圳市信润富联数字科技有限公司 Hub defect detection method and device and storage medium
CN117697768B (en) * 2024-02-05 2024-05-07 季华实验室 Target grabbing method, robot, electronic equipment and storage medium
CN118003340B (en) * 2024-04-08 2024-06-18 厦门熠明机器人自动化有限公司 Visual mechanical arm material grabbing control method and system based on deep learning

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3300682B2 (en) * 1999-04-08 2002-07-08 ファナック株式会社 Robot device with image processing function
JP6695843B2 (en) * 2017-09-25 2020-05-20 ファナック株式会社 Device and robot system
CN108171748B (en) * 2018-01-23 2021-12-07 哈工大机器人(合肥)国际创新研究院 Visual identification and positioning method for intelligent robot grabbing application
CN108555908B (en) * 2018-04-12 2020-07-28 同济大学 Stacked workpiece posture recognition and pickup method based on RGBD camera
CN109102547A (en) * 2018-07-20 2018-12-28 上海节卡机器人科技有限公司 Robot based on object identification deep learning model grabs position and orientation estimation method
CN111507334B (en) * 2019-01-30 2024-03-12 中国科学院宁波材料技术与工程研究所 Instance segmentation method based on key points
CN112116551A (en) * 2019-06-20 2020-12-22 腾讯科技(深圳)有限公司 Camera shielding detection method and device, electronic equipment and storage medium
CN110472534A (en) * 2019-07-31 2019-11-19 厦门理工学院 3D object detection method, device, equipment and storage medium based on RGB-D data
CN111445524B (en) * 2020-03-31 2021-04-27 清华大学 Scene understanding-based construction site worker unsafe behavior identification method
CN112347887B (en) * 2020-10-28 2023-11-24 深圳市优必选科技股份有限公司 Object detection method, object detection device and electronic equipment
CN112837371B (en) * 2021-02-26 2024-05-24 梅卡曼德(北京)机器人科技有限公司 Object grabbing method and device based on 3D matching and computing equipment
CN113538486B (en) * 2021-07-13 2023-02-10 长春工业大学 Method for improving identification and positioning accuracy of automobile sheet metal workpiece
CN114140526A (en) * 2021-11-19 2022-03-04 浙江大学 Disordered workpiece three-dimensional visual pose estimation method based on deep learning
CN114155265A (en) * 2021-12-01 2022-03-08 南京林业大学 Three-dimensional laser radar road point cloud segmentation method based on YOLACT
CN114494276A (en) * 2022-04-18 2022-05-13 成都理工大学 Two-stage multi-modal three-dimensional instance segmentation method

Also Published As

Publication number Publication date
CN114952809A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN114952809B (en) Workpiece identification and pose detection method, system and mechanical arm grabbing control method
CN109255813B (en) Man-machine cooperation oriented hand-held object pose real-time detection method
CN112070818B (en) Robot disordered grabbing method and system based on machine vision and storage medium
CN108656107B (en) Mechanical arm grabbing system and method based on image processing
Azad et al. Stereo-based 6d object localization for grasping with humanoid robot systems
CN112476434A (en) Visual 3D pick-and-place method and system based on cooperative robot
CN109102547A (en) Robot based on object identification deep learning model grabs position and orientation estimation method
Zhang et al. Robotic grasp detection based on image processing and random forest
JPWO2009028489A1 (en) Object detection method, object detection apparatus, and robot system
CN110751097B (en) Semi-supervised three-dimensional point cloud gesture key point detection method
Xu et al. GraspCNN: Real-time grasp detection using a new oriented diameter circle representation
JP2021163503A (en) Three-dimensional pose estimation by two-dimensional camera
Thalhammer et al. SyDPose: Object detection and pose estimation in cluttered real-world depth images trained using only synthetic data
Zhuang et al. Instance segmentation based 6D pose estimation of industrial objects using point clouds for robotic bin-picking
CN117124302B (en) Part sorting method and device, electronic equipment and storage medium
Liu et al. Robotic picking in dense clutter via domain invariant learning from synthetic dense cluttered rendering
CN112102342A (en) Plane contour recognition method and device, computer equipment and storage medium
CN116416444B (en) Object grabbing point estimation, model training and data generation method, device and system
CN111080685A (en) Airplane sheet metal part three-dimensional reconstruction method and system based on multi-view stereoscopic vision
Wang et al. GraspFusionNet: a two-stage multi-parameter grasp detection network based on RGB–XYZ fusion in dense clutter
Shi et al. A fast workpiece detection method based on multi-feature fused SSD
Nakano Stereo vision based single-shot 6d object pose estimation for bin-picking by a robot manipulator
Vysocky et al. Generating synthetic depth image dataset for industrial applications of hand localization
Hussain et al. Real-time robot-human interaction by tracking hand movement & orientation based on morphology
Foo Design and develop of an automated tennis ball collector and launcher robot for both able-bodied and wheelchair tennis players-ball recognition systems.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant