CN112489117B - Robot grabbing pose detection method based on domain migration under single-view-point cloud - Google Patents

Robot grabbing pose detection method based on domain migration under single-view-point cloud

Info

Publication number
CN112489117B
Authority
CN
China
Prior art keywords
point cloud
pose
grabbing
domain
view
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011418811.5A
Other languages
Chinese (zh)
Other versions
CN112489117A (en)
Inventor
钱堃
景星烁
柏纪伸
赵永强
施克勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202011418811.5A
Publication of CN112489117A
Application granted
Publication of CN112489117B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 7/00 Image analysis
                    • G06T 7/70 Determining position or orientation of objects or cameras
                • G06T 2207/00 Indexing scheme for image analysis or image enhancement
                    • G06T 2207/10 Image acquisition modality
                        • G06T 2207/10028 Range image; Depth image; 3D point clouds
                    • G06T 2207/20 Special algorithmic details
                        • G06T 2207/20024 Filtering details
                        • G06T 2207/20081 Training; Learning
                        • G06T 2207/20112 Image segmentation details
                        • G06T 2207/20132 Image cropping

Abstract

The invention discloses a robot grabbing pose detection method based on domain migration under single-view-point cloud, which comprises the following steps: 1) acquiring the single-view point cloud of the robot grabbing scene with a depth camera; 2) preprocessing the collected point cloud; 3) sampling uniformly and randomly on the target point cloud, computing local frames, and obtaining candidate grabbing poses; 4) defining a new coordinate system at the gripper center and encoding each grabbing pose into a multi-channel projection image; 5) constructing a grabbing pose evaluation model that takes the multi-channel grabbing images as input and, based on a generative adversarial network, realizes unsupervised domain-adaptive migration from the simulation domain to the physical domain; 6) constructing a large-scale simulation object data set and a real object data set, automatically labeling them with a force-closure-based grabbing detection method, and forming a training set and a test set. The method relieves the cost of data acquisition and labeling through unsupervised domain migration and generalizes to unknown and irregular objects.

Description

Robot grabbing pose detection method based on domain migration under single-view-point cloud
Technical Field
The invention relates to the field of grabbing detection in robot operation skill learning, and in particular to a robot grabbing pose detection method based on domain migration under single-view-point cloud.
Background
With the development of artificial intelligence, dramatic progress has also occurred in the field of robotics. At the present stage an intelligent robot is expected to perceive its environment and interact with it; in robot operation skill learning, environment perception is an indispensable part, and giving robots this perception capability is a long-term goal of computer vision and robotics. Among robot operation skills, grabbing can bring great benefit to society, for example completing pick-and-place tasks that involve heavy human labor, or helping disabled or elderly people with daily grabbing tasks, so it is the most basic and most important skill. The key technology is detection of the robot grabbing pose. For 6-DoF grabbing pose detection, the traditional pose estimation approach needs to compute the pose of the object in the scene in advance and determine the grabbing pose from a known CAD model in a model library, yet objects in real scenes are usually unknown and accurate CAD models are not easy to obtain. The current mainstream approach is two-stage grabbing detection: candidate grabbing poses are first generated and then evaluated, mainly with point-cloud-based deep learning methods. Such methods still face difficulties and challenges: the point cloud acquired by the sensor is a single-view cloud that is noisy and largely incomplete, so grabbing detection algorithms are difficult to generalize; the data sets used for training are obtained through cumbersome reconstruction, some scenes are difficult to acquire, the amount of data is limited, large-scale construction and labeling are difficult, and large-scale data sets generated in simulation environments are not fully utilized.
A typical example of the conventional model-matching-based methods is the ROS grasping framework proposed by Chitta et al. (see "Chitta S, Jones E G, Ciocarlie M and Hsiao K, Perception, planning, and execution for mobile manipulation in unstructured environments, IEEE Robotics and Automation Magazine 2012"), which registers the CAD model of a known object onto the point cloud and then plans a feasible grasping route. The most classical two-stage algorithm was proposed by ten Pas et al. (see "ten Pas A and Platt R, Using geometry to detect grasp poses in 3D point clouds, Proceedings of the International Symposium on Robotics Research 2015"): a series of grasp candidates is first sampled under geometric constraints, the projected images are then encoded with HOG features, and a support vector machine evaluates the grasps. Subsequently, ten Pas et al. optimized this method (see "ten Pas A, Gualtieri M, Saenko K and Platt R, Grasp pose detection in point clouds, IJRR 2017"), replacing the classifier applied to the pose projection images with a deep learning method and improving grabbing detection performance. Many later studies build on this method, but sample labeling and the domain migration problem in point-cloud-based algorithms are not considered.
At present, robot grabbing pose detection schemes that address domain migration are still lacking; the few migration methods used in robotics live inside two-dimensional image detection frameworks, and some rely on domain augmentation strategies. In computer vision, transfer learning has developed rapidly, and in the current wave of deep learning it focuses mainly on domain adaptation of features. Solving the migration problem in grabbing detection with such advanced domain adaptation techniques can therefore greatly reduce the cost of sample collection and labeling, make full use of cheap training data from simulation environments, and improve the generalization of the model.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the above problems in the prior art, the invention provides a robot grabbing pose detection method based on domain migration under single-view-point cloud, which realizes unsupervised feature adaptation from the simulation domain to the real-object domain, thereby reducing the cost of sample labeling, making full use of simulation data and improving the generalization of the model to new objects.
The technical scheme is as follows: the invention adopts the following technical scheme:
a robot grabbing pose detection method based on domain migration under single-view-point cloud comprises the following steps:
step 1, acquiring an original point cloud of a desktop grabbing scene through a depth camera;
step 2, preprocessing the original scene point cloud, where the processing flow comprises spatial cropping, plane extraction and outlier filtering, to obtain the target point cloud set of the object to be grabbed;
step 3, calculating candidate grabbing poses from the target point cloud: uniform random sampling is first performed in the target point cloud set, a Darboux local frame is computed in the neighborhood of each sampling point to construct a basic pose, the poses are then expanded by searching a two-dimensional grid of rotations and translations, the deepest grabbing pose is computed and its validity judged, and the final candidate grabbing poses are obtained;
step 4, encoding the candidate grabbing poses: the point cloud coordinate system is converted to the center of the gripper and each grabbing pose is encoded into a multi-channel image by projection;
step 5, constructing a domain-migration grabbing pose evaluation model with a deep learning framework: the simulation data set and the real-object data set are labeled by force closure and collision detection, and the model is trained to predict a probability for each multi-channel image;
step 6, ranking the predicted probabilities of all candidate grabbing poses and selecting the pose with the highest probability as the pose for the robot to grab.
The scene data in the step 1 is point cloud data acquired under a single visual angle fixed by a depth camera, and the scene type is the grabbing of a single object on a desktop by a robot.
The spatial cropping operation cuts away everything except the desktop and the object point cloud, the plane extraction operation removes the desktop points, and the outlier filtering removes the outliers left after cropping and plane extraction.
The uniform random sampling in step 3 generates random numbers with a congruential method to obtain point cloud indices.
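As a concrete illustration of this sampling step, the following minimal Python sketch draws point cloud indices with a linear congruential generator; the modulus, multiplier, increment and seed shown are illustrative values, not taken from the patent:

```python
def lcg_indices(num_points, num_samples, seed=1, a=1103515245, c=12345, m=2 ** 31):
    # Linear congruential generator: x_{k+1} = (a * x_k + c) mod m.
    # Each draw is mapped onto [0, num_points) to index the target point cloud.
    x, indices = seed, []
    for _ in range(num_samples):
        x = (a * x + c) % m
        indices.append(x % num_points)
    return indices

sample_idx = lcg_indices(num_points=20000, num_samples=100)  # fixed number of samples
```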
In step 3, the neighborhood used to compute the Darboux local frame centered at each sampling point when constructing the basic pose is the region inside a sphere of radius r.
The candidate grabbing pose acquiring process in the step 3 comprises the following steps:
(1) Randomly and uniformly sampling on the target point cloud to obtain a sampling point set;
(2) Calculating a basic frame in the neighborhood of each sampling point to obtain a basic pose;
(3) Carrying out pose expansion by rotating and translating a basic pose;
(4) Performing collision detection on each expanded pose, and calculating the deepest grabbing pose;
(5) Eliminating invalid grabbing candidates by judging whether the gripper closing region corresponding to the deepest grabbing pose contains points or not.
In step 5, the simulation data set is the 3D-Net data set, and the real-object data set is obtained by three-dimensional reconstruction with the KinectFusion algorithm.
The domain-migration grabbing pose evaluation model in step 5 comprises a weight-adaptive basic convolutional network, lightweight fully-connected layers and a generative adversarial network; model training adopts an alternating optimization mode, and the training steps are as follows:
(1) The input data comprises a simulation data set and a real object data set, and forward calculation is carried out on all the networks;
(2) Updating a discriminator D, calculating the cross entropy loss of the simulation data set, the domain discrimination loss of the simulation data, the domain discrimination loss of the simulation generated data and the domain discrimination loss of the real object generated data, and adding a gradient penalty item to inhibit mode collapse;
(3) Updating a generator G, and using domain discrimination loss and label classification cross entropy loss after the simulation data set passes through the generator and the discriminator;
(4) Updating a classifier C, and only using the classified cross entropy loss of the simulation data set after passing through the F network and the C network;
(5) Updating the feature extractor F, using the simulation classification loss, the label classification cross-entropy loss of the simulation-generated data and the domain discrimination loss of the real-object-generated data.
The loss function used to optimize the discriminator D is:

    L_D = (1/N) Σ_n [ L_ce(D_c(s_n), y_n) + L_dom(D_rf(s_n)) + L_dom(D_rf(G(F_csn))) + L_dom(D_rf(G(F_ctn))) ] + L_gp

where N represents the number of samples in the batch, D_c and D_rf represent the label classification branch and the domain discrimination branch of the discriminator, s_n is a simulation sample with grabbing label y_n, F_csn and F_ctn represent the merged simulation-feature input and the merged real-object-feature input of the generator, L_ce is the cross-entropy classification loss, L_dom is the domain discrimination loss and L_gp is the gradient penalty term;

the loss function used to optimize the generator G is:

    L_G = (1/N) Σ_n [ L_dom(D_rf(G(F_csn))) + L_ce(D_c(G(F_csn)), y_n) ]

the loss function used to optimize the classifier C is:

    L_C = (1/N) Σ_n L_ce(C(f_sn), y_n)

where f_sn represents the simulation-feature input of the classifier;

the loss function used to optimize the feature extractor F is:

    L_F = L_C + α · (1/N) Σ_n L_ce(D_c(G(F_csn)), y_n) + γ · (1/N) Σ_n L_dom(D_rf(G(F_ctn)))

where α and γ represent the balance weights of the two loss parts, respectively.
In the feature extraction network, an SE (squeeze-and-excitation) structure is added so that channel weights adapt automatically and model performance improves; multi-layer convolution operations reduce the number of fully-connected layers, which effectively suppresses over-fitting.
Beneficial effects: compared with the prior art, the robot grabbing pose detection method based on domain migration under single-view-point cloud has the following advantages:
1. The single-view point cloud is acquired with a fixed-viewpoint RGB-D sensor, which simplifies sensor installation and reduces cost; the algorithm detects 6-DoF grabbing poses, meets the requirements of three-dimensional robot grabbing tasks, and is more practical than planar grabbing.
2. The only required input is the single-view point cloud; the algorithm copes with incomplete, noisy data, candidate grasp generation is simple and effective, and encoding the grabbing pose as a multi-channel image allows a deep learning model to be used effectively for grabbing stability prediction.
3. To address the difficulty of collecting real-object-domain samples, the method adopts domain adaptation techniques from transfer learning: the model is trained simultaneously on the simulation-domain data set and an unlabeled real-object-domain data set, adversarial learning with a generative adversarial network extracts domain-adaptive features, and grabbing stability prediction on real-object-domain data is realized. Massive simulation data can thus be used effectively, the physical domain needs no labeling, complex scanning-and-reconstruction tasks are avoided, generalization improves, and the method is economical and practical.
4. An adaptive-weight method is used in the domain-adaptive grabbing detection model to recalibrate the convolutional features, the number of fully-connected layers is reduced, and over-fitting during model training is effectively suppressed.
Drawings
FIG. 1 is an overall flow chart of the disclosed method;
FIG. 2 is a schematic diagram of point cloud pre-processing;
FIG. 3 is a schematic diagram of candidate grabbing pose encoding;
FIG. 4 is an architecture diagram of the domain-migration grabbing detection model;
FIG. 5 shows a grabbing pose and the corresponding three-channel grabbing image;
FIG. 6 shows simulation-like images generated by the generator in the domain-migration grabbing detection model.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described below with reference to the accompanying drawings.
As shown in fig. 1, the overall flowchart of the disclosed method, the invention discloses a robot grabbing pose detection method based on domain migration under single-view-point cloud, which mainly comprises six steps: step 1, acquiring a point cloud image of the grabbing scene from a fixed, single viewpoint; step 2, point cloud preprocessing, extracting the target-object point cloud by spatial cropping, plane extraction and outlier filtering; step 3, candidate grabbing pose generation: sampling uniformly and randomly on the target point cloud set, computing a basic frame in a spherical neighborhood of each sampling point, expanding the basic poses by rotation and translation, and computing the deepest grabbing pose of each expanded pose to obtain the candidate grabbing poses; step 4, encoding each candidate pose into a multi-channel image by projection, with the coordinate system changed from the original point cloud frame to the gripper center; step 5, constructing a domain-migration grabbing evaluation model, training simultaneously on the labeled simulation domain and the unlabeled real-object-domain data set, and applying the trained model directly to real-object-domain multi-channel images to obtain grabbing confidence predictions; step 6, selecting the pose corresponding to the image with the highest confidence as the final pose executed by the robot.
Implementing the invention requires an RGB-D depth sensor and a GPU; the specific implementation uses a desktop computer with a GeForce 2080 GPU and a Kinect V1 depth camera.
The method disclosed by the invention specifically comprises the following steps:
step 1, point cloud data under a single object desktop grabbing scene are obtained;
and (3) acquiring point clouds of the captured scene by using an RGB-D depth camera, and setting the point clouds to be acquired under a single fixed visual angle, wherein only point cloud information is required to be utilized in the method.
Step 2, point cloud preprocessing;
As shown in fig. 2, the whole scene point cloud is first cropped to remove the surrounding clutter, the desktop points are then removed by plane extraction, and finally outlier filtering removes the remaining outliers.
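A minimal sketch of this preprocessing chain, written with the Open3D library, is shown below; the file name, crop-box bounds, RANSAC parameters and outlier-filter settings are illustrative assumptions rather than the patent's values:

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.pcd")     # single-view scene cloud from the depth camera

# 1) Spatial cropping: keep only the tabletop workspace in front of the camera.
box = o3d.geometry.AxisAlignedBoundingBox(np.array([-0.4, -0.4, 0.3]),
                                          np.array([0.4, 0.4, 1.2]))
pcd = pcd.crop(box)

# 2) Plane extraction: fit the table plane with RANSAC and discard its inliers.
_, table_idx = pcd.segment_plane(distance_threshold=0.01, ransac_n=3, num_iterations=1000)
objects = pcd.select_by_index(table_idx, invert=True)

# 3) Outlier filtering: drop sparse points left over from cropping and plane removal.
objects, _ = objects.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
```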
Step 3, generating candidate grabbing poses;
a schematic diagram of the gripper coordinate system is shown in fig. 3. And uniformly and randomly sampling on the target point cloud set, estimating a basic frame by taking a sampling point as a center, expanding by rotating and translating, and calculating the deepest grabbing so as to obtain the final candidate grabbing pose.
The step 3 specifically comprises the following 5 sub-steps, and the specific implementation method is as follows:
(311) Uniformly and randomly sampling by randomly generating point cloud index numbers on the target point cloud, wherein the sampling quantity is fixed;
(312) For each sampling point a spherical neighborhood of radius r is established, all points inside the sphere are collected, and the basic frame (Darboux frame) is computed from the matrix

    M(p) = Σ_{q ∈ B_r(p) ∩ C} n(q) n(q)^T

where p denotes the sampling point, n(q) and n(q)^T denote the surface normal vector of a neighboring point q and its transpose, C denotes the target point cloud, and B_r(p) denotes the sphere of radius r centered on the sampling point. The eigenvector corresponding to the minimum eigenvalue of M(p) estimates the normal vector v_1(p) of the sampling point, the eigenvector corresponding to the largest eigenvalue estimates the minimum principal curvature direction v_3(p), and the maximum principal curvature direction v_2(p) is obtained by orthogonality. This yields the basic frame F(p) = [v_1(p), v_2(p), v_3(p)], i.e. the initial orientation of the gripper, while the initial position of the gripper is the coordinate of the sampling point p, so the initial gripper pose is h(p) = [p_x, p_y, p_z, v_1(p), v_2(p), v_3(p)];
(313) Each initial gripper pose is expanded by rotation about the z-axis and translation along the y-axis; a two-dimensional grid (φ, y) is constructed and searched, and every expanded gripper pose h_{φ,y}(p) is obtained by right-multiplying the initial gripper pose by the corresponding transformation matrix;
(314) Each expanded pose is moved along the positive x direction to find the minimum x* satisfying the collision-free condition, which gives the new, deepest grabbing pose;
(315) Invalid poses are eliminated by checking whether the gripper closing region of each new pose contains points, yielding all candidate poses (a schematic sketch of the basic-frame computation in sub-step (312) follows this list).
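The basic-frame computation referenced in sub-step (312) can be sketched in NumPy as follows; the neighborhood radius is an assumed value, the surface normals are assumed to be precomputed, and the axis assignment follows the patent text:

```python
import numpy as np

def basic_frame(points, normals, p, radius=0.02):
    # Neighborhood B_r(p): points of the target cloud inside a sphere of radius r around p.
    nbr = normals[np.linalg.norm(points - p, axis=1) < radius]
    M = nbr.T @ nbr                        # M(p) = sum_q n(q) n(q)^T, a 3x3 matrix
    w, V = np.linalg.eigh(M)               # eigenvalues in ascending order
    v1 = V[:, 0]                           # minimum eigenvalue -> normal axis v1(p) (per the patent text)
    v3 = V[:, 2]                           # maximum eigenvalue -> minimum principal curvature axis v3(p)
    v2 = np.cross(v3, v1)                  # remaining axis obtained by orthogonality
    return np.column_stack([v1, v2, v3])   # F(p) = [v1(p), v2(p), v3(p)]

# Initial gripper pose at a sampled point p: position p plus orientation F(p),
# i.e. h(p) = [p_x, p_y, p_z, v1(p), v2(p), v3(p)].
```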
Step 4, as shown in fig. 4, each candidate grabbing pose is encoded as follows: the point cloud inside the gripper closing region of the pose is extracted, the coordinate system is transformed to take the current grabbing pose as reference, and the points are projected along several axial directions to encode a multi-channel image.
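A simplified sketch of such an encoding is given below for a single projection axis (the patent projects along several axes); the image size, region extent and channel definitions are illustrative assumptions:

```python
import numpy as np

def encode_projection(points_gripper, size=60, extent=0.10):
    """Project the points inside the gripper closing region, already expressed in the
    gripper-centered frame, onto one image plane with three illustrative channels."""
    img = np.zeros((size, size, 3), dtype=np.float32)
    uv = ((points_gripper[:, :2] / extent + 0.5) * (size - 1)).astype(int)   # x, y -> pixels
    uv = np.clip(uv, 0, size - 1)
    for (u, v), z in zip(uv, points_gripper[:, 2]):
        img[v, u, 0] = 1.0                       # channel 0: occupancy of the projection
        img[v, u, 1] = max(img[v, u, 1], z)      # channel 1: height along the projection axis
        img[v, u, 2] += 1.0                      # channel 2: point density
    img[..., 2] /= max(img[..., 2].max(), 1.0)   # normalize the density channel
    return img
```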
Step 5, as shown in fig. 5, a grabbing pose evaluation model based on domain migration is constructed. In the training stage, domain-adaptive features are extracted from simulation-domain and physical-domain data jointly with a generative adversarial network: the features extracted from the two domains are constrained so that the images produced by the generator from these features (shown in fig. 6) align with the simulation domain, mapping the two domains to the same distribution. In the testing stage only the F network and the C network are needed: multi-channel grabbing images from the physical domain are taken as input, probability prediction is performed by the domain-migration grabbing pose evaluation model, and the confidence that each candidate pose can be grasped successfully is obtained.
In step 5 there are two training data sets: one is an object set constructed from the 3D-Net model library, used as the simulation domain, and the other is an object set obtained by three-dimensional reconstruction with the KinectFusion algorithm, used as the real domain. Candidate sampling, encoding and the other operations above are applied to both, and grabbing labels are produced by force-closure detection.
In addition, the optimization mode during model training adopts an alternate optimization mode, and the training process comprises the following substeps:
(511) Inputting data comprising a simulation data set and a real object data set, and then carrying out forward calculation on the network;
(512) The discriminator D is updated: the cross-entropy loss and the domain discrimination loss of the simulation data set and the domain adversarial losses of the generated data are computed, and a gradient penalty term is added to suppress mode collapse; the optimized loss is

    L_D = (1/N) Σ_n [ L_ce(D_c(s_n), y_n) + L_dom(D_rf(s_n)) + L_dom(D_rf(G(F_csn))) + L_dom(D_rf(G(F_ctn))) ] + L_gp

where N represents the number of samples in the batch, D_c and D_rf represent the discriminator label classification branch and domain discrimination branch, s_n is a simulation sample with grabbing label y_n, F_csn and F_ctn represent the merged simulation-feature input and the merged real-domain-feature input of the generator, L_ce is the cross-entropy classification loss, L_dom is the domain discrimination loss and L_gp is the gradient penalty term;
(513) The generator G is updated using the domain discrimination loss and the label classification loss obtained after the simulation data pass through the generator and the discriminator; the optimized loss is

    L_G = (1/N) Σ_n [ L_dom(D_rf(G(F_csn))) + L_ce(D_c(G(F_csn)), y_n) ]

(514) The classifier C is updated using only the classification loss of the simulation data after the F network and the C network; the optimized loss is

    L_C = (1/N) Σ_n L_ce(C(f_sn), y_n)

where f_sn represents the simulation-feature input of the classifier;
(515) The feature extractor F is updated using the simulation classification loss and the domain adversarial loss of the data generated from the real domain, with multi-loss weighting; the loss is

    L_F = L_C + α · (1/N) Σ_n L_ce(D_c(G(F_csn)), y_n) + γ · (1/N) Σ_n L_dom(D_rf(G(F_ctn)))

where α and γ represent the balance weights of the two loss parts, respectively (a schematic sketch of one such alternating update follows this list).
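The following PyTorch sketch illustrates one alternating update under several simplifying assumptions: F_net, G, C and D are assumed nn.Module instances (D returning a pair of label logits and a real/fake logit), the domain discrimination terms use binary cross-entropy, the gradient penalty term is omitted, the generator is fed features directly rather than the merged feature inputs of the patent, and alpha and gamma are illustrative weights:

```python
import torch
import torch.nn.functional as F_nn

def train_step(F_net, G, C, D, opt_D, opt_G, opt_C, opt_F,
               sim_x, sim_y, real_x, alpha=0.1, gamma=0.1):
    f_sim, f_real = F_net(sim_x), F_net(real_x)            # forward pass on both domains

    # (512) update discriminator D: classify simulation labels and tell real simulation
    # images apart from images generated from simulation and real-object features.
    opt_D.zero_grad()
    cls_s, rf_s = D(sim_x)
    _, rf_gs = D(G(f_sim.detach()))
    _, rf_gt = D(G(f_real.detach()))
    loss_D = (F_nn.cross_entropy(cls_s, sim_y)
              + F_nn.binary_cross_entropy_with_logits(rf_s, torch.ones_like(rf_s))
              + F_nn.binary_cross_entropy_with_logits(rf_gs, torch.zeros_like(rf_gs))
              + F_nn.binary_cross_entropy_with_logits(rf_gt, torch.zeros_like(rf_gt)))
    loss_D.backward(); opt_D.step()                        # gradient penalty omitted in this sketch

    # (513) update generator G: generated simulation images should fool D and keep their labels.
    opt_G.zero_grad()
    cls_gs, rf_gs = D(G(f_sim.detach()))
    loss_G = (F_nn.binary_cross_entropy_with_logits(rf_gs, torch.ones_like(rf_gs))
              + F_nn.cross_entropy(cls_gs, sim_y))
    loss_G.backward(); opt_G.step()

    # (514) update classifier C with the simulation classification loss only.
    opt_C.zero_grad()
    loss_C = F_nn.cross_entropy(C(f_sim.detach()), sim_y)
    loss_C.backward(); opt_C.step()

    # (515) update feature extractor F: simulation classification loss plus the weighted
    # classification loss of simulation-generated data and domain loss of real-generated data.
    opt_F.zero_grad()
    cls_gs, _ = D(G(f_sim))
    _, rf_gt = D(G(f_real))
    loss_F = (F_nn.cross_entropy(C(f_sim), sim_y)
              + alpha * F_nn.cross_entropy(cls_gs, sim_y)
              + gamma * F_nn.binary_cross_entropy_with_logits(rf_gt, torch.ones_like(rf_gt)))
    loss_F.backward(); opt_F.step()
```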
In addition, an SE (squeeze-and-excitation) structure is added to the feature extraction network so that channel weights adapt automatically and model performance improves; multi-layer convolution operations reduce the number of fully-connected layers, effectively suppressing over-fitting.
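The SE structure referred to here is the standard squeeze-and-excitation block; a PyTorch sketch is given below, where the reduction ratio is an assumed value:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: pool the feature map globally, learn per-channel weights,
    and rescale the channels of the convolutional output (adaptive channel weighting)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                     # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())   # excitation

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                            # reweight each channel
```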
Step 6, the candidates are ranked by the probabilities predicted by the model of step 5, and the pose corresponding to the grabbing image with the highest probability of stably grabbing the object is selected as the grabbing pose executed by the robot.
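Selecting the execution pose from the predicted probabilities then reduces to a ranking step; a minimal sketch with illustrative variable names:

```python
import numpy as np

def select_best_grasp(poses, probs):
    """Return the candidate pose with the highest predicted grasp-success probability."""
    order = np.argsort(probs)[::-1]        # rank candidates from most to least confident
    return poses[order[0]], float(probs[order[0]])
```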

Claims (9)

1. A robot grabbing pose detection method based on domain migration under single-view-point cloud is characterized by comprising the following steps:
step 1, acquiring an original point cloud of a desktop grabbing scene through a depth camera;
step 2, preprocessing the original scene point cloud, where the processing flow comprises spatial cropping, plane extraction and outlier filtering, to obtain the target point cloud set of the object to be grabbed;
step 3, calculating candidate grabbing poses from the target point cloud: uniform random sampling is first performed in the target point cloud set, a Darboux local frame is computed in the neighborhood of each sampling point to construct a basic pose, the poses are then expanded by searching a two-dimensional grid of rotations and translations, the deepest grabbing pose is computed and its validity judged, and the final candidate grabbing poses are obtained;
step 4, encoding the candidate grabbing poses: the point cloud coordinate system is converted to the center of the gripper and each grabbing pose is encoded into a multi-channel image by projection;
step 5, constructing a domain-migration grabbing pose evaluation model with a deep learning framework: the simulation data set and the real-object data set are labeled by force closure and collision detection, and the model is trained to predict a probability for each multi-channel image;
step 6, ranking the predicted probabilities of all candidate grabbing poses and selecting the pose with the highest probability as the pose for the robot to grab.
2. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud according to claim 1, wherein the scene data in the step 1 is point cloud data acquired under a single view angle fixed by a depth camera, and the scene type is grabbing of a single object on a desktop by the robot.
3. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud as claimed in claim 1, wherein the spatial cropping operation cuts away everything except the desktop and the object point cloud, the plane extraction operation removes the desktop points, and the outlier filtering removes the outliers left after cropping and plane extraction.
4. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud as claimed in claim 1, wherein the uniform random sampling in the step 3 is to use a congruence method to generate a random number to obtain the point cloud index.
5. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud according to claim 1, wherein, in step 3, the neighborhood used to compute the Darboux local frame centered at each sampling point when constructing the basic pose is the region inside a sphere of radius r.
6. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud according to claim 1, wherein the candidate grabbing pose obtaining process in the step 3 is as follows:
(1) Randomly and uniformly sampling on the target point cloud to obtain a sampling point set;
(2) Calculating a basic frame in the neighborhood of each sampling point to obtain a basic pose;
(3) Carrying out pose expansion by rotating and translating a basic pose;
(4) Performing collision detection on each expanded pose, and calculating the deepest grabbing pose;
(5) Eliminating invalid grabbing candidates by judging whether the gripper closing region corresponding to the deepest grabbing pose contains points or not.
7. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud as claimed in claim 1, wherein the simulation data set in step 5 is the 3D-Net data set, and the real-object data set is obtained by three-dimensional reconstruction with the KinectFusion algorithm.
8. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud according to claim 1, wherein the domain-migration grabbing pose evaluation model in step 5 comprises a weight-adaptive basic convolutional network, lightweight fully-connected layers and a generative adversarial network, an alternating optimization mode is adopted during model training, and the training step comprises:
(1) The input data comprises a simulation data set and a real object data set, and forward calculation is carried out on all the networks;
(2) Updating a discriminator D, calculating the cross entropy loss of the simulation data set, the domain discrimination loss of the simulation data, the domain discrimination loss of the simulation generated data and the domain discrimination loss of the real object generated data, and adding a gradient penalty item to inhibit mode collapse;
(3) Updating a generator G, and using domain discrimination loss and label classification cross entropy loss after the simulation data set passes through the generator and the discriminator;
(4) Updating the classifier C, and only using the classified cross entropy loss of the simulation data set after passing through the F network and the C network;
(5) Updating the feature extractor F, using the simulation classification loss, the label classification cross-entropy loss of the simulation-generated data and the domain discrimination loss of the real-object-generated data.
9. The method for detecting the grabbing pose of the robot based on the domain migration under the single-view-point cloud according to claim 8, wherein the loss function used to optimize the discriminator D is:

    L_D = (1/N) Σ_n [ L_ce(D_c(s_n), y_n) + L_dom(D_rf(s_n)) + L_dom(D_rf(G(F_csn))) + L_dom(D_rf(G(F_ctn))) ] + L_gp

where N represents the number of samples in the batch, D_c and D_rf represent the discriminator label classification branch and domain discrimination branch, s_n is a simulation sample with grabbing label y_n, F_csn and F_ctn represent the merged simulation-feature input and the merged real-object-feature input of the generator, L_ce is the cross-entropy classification loss, L_dom is the domain discrimination loss and L_gp is the gradient penalty term;

the loss function used to optimize the generator G is:

    L_G = (1/N) Σ_n [ L_dom(D_rf(G(F_csn))) + L_ce(D_c(G(F_csn)), y_n) ]

the loss function used to optimize the classifier C is:

    L_C = (1/N) Σ_n L_ce(C(f_sn), y_n)

where f_sn represents the simulation-feature input of the classifier;

the loss function used to optimize the feature extractor F is:

    L_F = L_C + α · (1/N) Σ_n L_ce(D_c(G(F_csn)), y_n) + γ · (1/N) Σ_n L_dom(D_rf(G(F_ctn)))

where α and γ represent the balance weights of the two loss parts, respectively.
CN202011418811.5A 2020-12-07 2020-12-07 Robot grabbing pose detection method based on domain migration under single-view-point cloud Active CN112489117B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011418811.5A CN112489117B (en) 2020-12-07 2020-12-07 Robot grabbing pose detection method based on domain migration under single-view-point cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011418811.5A CN112489117B (en) 2020-12-07 2020-12-07 Robot grabbing pose detection method based on domain migration under single-view-point cloud

Publications (2)

Publication Number Publication Date
CN112489117A CN112489117A (en) 2021-03-12
CN112489117B true CN112489117B (en) 2022-11-18

Family

ID=74940410

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011418811.5A Active CN112489117B (en) 2020-12-07 2020-12-07 Robot grabbing pose detection method based on domain migration under single-view-point cloud

Country Status (1)

Country Link
CN (1) CN112489117B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191387B (en) * 2021-03-27 2024-03-29 西北大学 Cultural relic fragment point cloud classification method combining unsupervised learning and data self-enhancement
CN113345100B (en) * 2021-05-19 2023-04-07 上海非夕机器人科技有限公司 Prediction method, apparatus, device, and medium for target grasp posture of object
CN113297988B (en) * 2021-05-28 2024-03-22 东南大学 Object attitude estimation method based on domain migration and depth completion
CN113674348B (en) * 2021-05-28 2024-03-15 中国科学院自动化研究所 Object grabbing method, device and system
CN113763476B (en) * 2021-09-09 2023-12-01 西交利物浦大学 Object grabbing method, device and storage medium
CN114083535B (en) * 2021-11-18 2023-06-13 清华大学 Physical measurement method and device for grasping gesture quality of robot
CN116703895B (en) * 2023-08-02 2023-11-21 杭州灵西机器人智能科技有限公司 Small sample 3D visual detection method and system based on generation countermeasure network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363815A (en) * 2019-05-05 2019-10-22 东南大学 The robot that Case-based Reasoning is divided under a kind of haplopia angle point cloud grabs detection method
CN111046948B (en) * 2019-12-10 2022-04-22 浙江大学 Point cloud simulation and deep learning workpiece pose identification and robot feeding method
CN111652928B (en) * 2020-05-11 2023-12-15 上海交通大学 Object grabbing pose detection method in three-dimensional point cloud

Also Published As

Publication number Publication date
CN112489117A (en) 2021-03-12

Similar Documents

Publication Publication Date Title
CN112489117B (en) Robot grabbing pose detection method based on domain migration under single-view-point cloud
CN110598554B (en) Multi-person posture estimation method based on counterstudy
CN108491880B (en) Object classification and pose estimation method based on neural network
Gao et al. Dynamic hand gesture recognition based on 3D hand pose estimation for human–robot interaction
Romero et al. Hands in action: real-time 3D reconstruction of hands in interaction with objects
Wang et al. Graspness discovery in clutters for fast and accurate grasp detection
Hagelskjær et al. Pointvotenet: Accurate object detection and 6 dof pose estimation in point clouds
CN107067410B (en) Manifold regularization related filtering target tracking method based on augmented samples
CN111199207B (en) Two-dimensional multi-human body posture estimation method based on depth residual error neural network
Rashid et al. Language embedded radiance fields for zero-shot task-oriented grasping
CN115830652B (en) Deep palm print recognition device and method
Huu et al. Proposing recognition algorithms for hand gestures based on machine learning model
Laili et al. Custom grasping: A region-based robotic grasping detection method in industrial cyber-physical systems
Peng et al. A self-supervised learning-based 6-DOF grasp planning method for manipulator
Kwan et al. Gesture recognition for initiating human-to-robot handovers
Wu et al. A cascaded CNN-based method for monocular vision robotic grasping
Ikram et al. Real time hand gesture recognition using leap motion controller based on CNN-SVM architechture
Lin et al. Target recognition and optimal grasping based on deep learning
Zhou et al. 6-D object pose estimation using multiscale point cloud transformer
Bergström et al. Integration of visual cues for robotic grasping
Yu et al. Pointnet++ gpd: 6-dof grasping pose detection method based on object point cloud
Li et al. Grasping Detection Based on YOLOv3 Algorithm
Pradeep et al. Recognition of Indian Classical Dance Hand Gestures
Wang et al. A saliency detection model combined local and global features
Zhang et al. Robotic grasp detection using effective graspable feature selection and precise classification

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant