CN116245940A - Category-level six-degree-of-freedom object pose estimation method based on structure difference perception
- Publication number: CN116245940A (application CN202310052012.8A)
- Authority: CN (China)
- Prior art keywords: category, instance, geometric, features, geometric features
- Legal status: Granted
Classifications

- G06T7/70 - Image analysis: determining position or orientation of objects or cameras
- G06T7/11 - Image analysis: region-based segmentation
- G06V10/75 - Image or video recognition using machine learning: organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; context analysis; selection of dictionaries
- G06V10/764 - Image or video recognition using machine learning: classification, e.g. of video objects
- G06V10/806 - Image or video recognition using machine learning: fusion of extracted features at sensor, preprocessing, feature-extraction or classification level
- G06V10/82 - Image or video recognition using machine learning: using neural networks
Abstract
The invention relates to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception, comprising the following steps: inputting the depth map into a target detection and segmentation network, obtaining the observation point cloud of the object instance from the detection result, and selecting the category prior corresponding to the target object based on the observation point cloud; extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features; inputting the instance geometric features and category geometric features into an information interaction enhancement module to obtain enhanced instance geometric features and enhanced category geometric features; fusing semantic and geometric information through a semantic dynamic fusion module to obtain instance fusion features and category fusion features; obtaining an instance NOCS model from the category fusion features; and matching the instance NOCS model with the observation point cloud through a matching network, computing the 6D pose and size of the target object from the similarity. The method can improve the accuracy of 6D pose estimation.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception.
Background
Estimating the six-degree-of-freedom (6D) pose of a real object from an image, i.e., the position and orientation of the object in the camera coordinate system, is a critical task; the pose consists of a three-dimensional rotation matrix and a three-dimensional translation vector. 6D object pose estimation is widely used in many real-world scenarios, such as 3D scene understanding, robotic grasping, virtual reality and augmented reality. According to the level at which objects are estimated, the task falls into two categories: 1. instance-level 6D pose estimation for a specific object; 2. category-level 6D pose estimation for objects of the same category. Instance-level 6D pose estimation requires knowing the object's position in a world coordinate system in advance, with the origin of that coordinate system usually placed at the object's center, i.e., it requires the object's CAD model. For a new object in a real scene without a defined CAD model, instance-level algorithms have no way to estimate the pose, which severely limits their application in real scenes. Therefore, to break this limitation, the category-level 6D pose estimation task was proposed, which can estimate the 6D pose of different object instances of the same category even when some instances have no CAD model.
Wang et al. first proposed the category-level object 6D pose estimation task. To address the lack of CAD models when estimating an object's 6D pose, they introduced the Normalized Object Coordinate Space (NOCS), a canonical representation shared by all possible object instances of a category: the object instance is first reconstructed in NOCS, and the pose transformation of the instance from NOCS to the camera coordinate system, i.e., the 6D pose of the object, is then computed. Because different object instances of the same category can differ greatly in structure, reconstructing their NOCS models is very difficult, and this is the core challenge of category-level 6D pose estimation. To address this, SPD proposed learning a category prior for each category and deforming the prior according to the particular object instance to reconstruct the instance's NOCS model, which improves pose accuracy; however, the ambiguity of the category prior still leads to inaccurate reconstructed NOCS models.
Disclosure of Invention
The invention aims to provide a category-level six-degree-of-freedom object pose estimation method based on structural difference perception, which can improve the accuracy of 6D pose estimation.
The technical solution adopted by the invention to solve the above technical problem is as follows: a category-level six-degree-of-freedom object pose estimation method based on structural difference perception is provided, comprising the following steps:
inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object;
obtaining an observation point cloud of an object instance according to the segmentation mask of the target object and the depth map, and selecting a category prior corresponding to the target object based on the observation point cloud of the object instance;
extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features;
inputting the instance geometric features and the category geometric features into an information interaction enhancement module, implicitly modeling the geometric differences between the instance geometric features and the category geometric features through the information interaction enhancement module, and supplementing the instance geometric features and the category geometric features with these differences to obtain enhanced instance geometric features and enhanced category geometric features;
inputting the geometric differences between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into a semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain instance fusion features and category fusion features;
the category fusion features are sent to a deformation network to obtain a deformation field, and the category prior is deformed by using the deformation field to obtain an instance NOCS model;
and matching the instance NOCS model with the observation point cloud through a matching network, and computing the 6D pose and size of the target object from the similarity.
The target detection segmentation network adopts a Mask-RCNN network.
Features of the observation point cloud and the category prior are extracted using a convolutional neural network and a PointNet++ network.
The information interaction enhancement module comprises: a fully connected layer for mapping the instance geometric features and the category geometric features into the same feature subspace; a matrix multiplication unit for multiplying the mapped instance geometric features and category geometric features to obtain a structural relation matrix between them; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for weighting the geometric projection features with these coefficients and summing them to obtain structural difference features; and a multi-layer perceptron for fusing the structural difference features with the instance geometric features and the category geometric features respectively, obtaining the enhanced instance geometric features and enhanced category geometric features.
The semantic dynamic fusion module adopts a pixel-level fusion strategy, implemented as a corresponding-point fusion module, for the enhanced instance geometric features, exploring the intrinsic mapping between data sources to obtain the instance fusion features; for the enhanced category geometric features and the instance semantic features of different individuals, it uses the geometric differences between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features, and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features.
Advantageous effects
Owing to the adoption of the above technical solution, the invention has the following advantages and positive effects compared with the prior art: the invention exploits the structural differences between the object instance and the category prior to enhance the learning of intra-category shape information; the semantic dynamic fusion module further dynamically adjusts the semantic information according to the geometric relationship between the object instance and the category prior, and fuses it with the enhanced category prior to dynamically compensate for missing geometric information, thereby improving robustness to noise.
Drawings
FIG. 1 is a flow chart of a category-level six-degree-of-freedom object pose estimation method based on structural difference perception in an embodiment of the invention;
FIG. 2 is a schematic diagram of an information interaction enhancement module in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a semantic dynamic fusion module in an embodiment of the present invention;
FIG. 4 is a schematic view of observation point clouds of different object instances;
FIG. 5 is a comparison of the results of an embodiment of the present invention and the SPD method.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception. As shown in fig. 1, the method comprises the following steps:
and step 1, inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object. In this step, an existing object detection segmentation network may be used to obtain the image block of the object and its segmentation Mask, for example, a Mask-RCNN network may be used.
Step 2, obtaining the observation point cloud of the object instance from the segmentation mask of the target object and the depth map, and selecting the category prior corresponding to the target object based on the observation point cloud of the object instance.
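For this step, the observation point cloud can be recovered by back-projecting the masked depth pixels through the pinhole camera model. A minimal sketch follows; the intrinsics (fx, fy, cx, cy) and the millimetre depth encoding are assumptions about the sensor.

```python
import numpy as np

def depth_to_point_cloud(depth, mask, fx, fy, cx, cy, depth_scale=1000.0):
    """Back-project masked depth pixels into camera-frame 3D points [N, 3]."""
    v, u = np.nonzero(mask)             # pixel coordinates inside the mask
    z = depth[v, u] / depth_scale       # metres, assuming depth stored in mm
    u, v, z = u[z > 0], v[z > 0], z[z > 0]
    x = (u - cx) * z / fx               # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # observation point cloud of the instance
```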
Step 3, extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features. In this step, a convolutional neural network and a PointNet++ network can be used to extract picture semantic features and point cloud geometric features, respectively, yielding the instance geometric features and category geometric features.
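The embodiment names PointNet++ as the geometric extractor; as a stand-in, the sketch below uses a simplified PointNet-style encoder (shared per-point MLP plus a pooled global vector), purely to show the shape of the per-point features the later modules consume. All dimensions are assumptions.

```python
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """Simplified PointNet-style stand-in for the PointNet++ extractor."""
    def __init__(self, out_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, out_dim, 1), nn.ReLU())

    def forward(self, pts):                    # pts: [B, N, 3]
        f = self.mlp(pts.transpose(1, 2))      # [B, C, N] per-point features
        g = f.max(dim=2, keepdim=True).values  # [B, C, 1] global shape vector
        return torch.cat([f, g.expand_as(f)], dim=1).transpose(1, 2)  # [B, N, 2C]

enc = PointEncoder()
inst_geo = enc(torch.rand(1, 1024, 3))   # instance geometric features
cat_geo = enc(torch.rand(1, 1024, 3))    # category geometric features
```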
Step 4, inputting the instance geometric features and the category geometric features into an information interaction enhancement module, implicitly modeling the geometric differences between them through the module, and supplementing the instance geometric features and category geometric features with these differences to obtain enhanced instance geometric features and enhanced category geometric features.
The information interaction enhancement module in this step aims to learn the structural relationship between the instance point cloud and the category prior, helping to build their structural difference information at the feature level. It uses the structural difference features to supplement the original geometric features, so that the enhanced geometric features contain both the unique individuality of the instance structure and the general commonality of the category prior. On the one hand, complemented by the instance's structural characteristics, the enhanced category geometric features can reconstruct a more accurate instance NOCS model. On the other hand, the instance geometric features absorb the commonality of the category shape, so that the reconstructed correspondence matrix better associates the observed point cloud with the NOCS model. In addition, since the geometric differences between the category prior and different instances of the same category vary, the information interaction enhancement module can adapt to previously unseen instances of various shapes, which greatly improves the generalization of this embodiment.
The structure of the information interaction enhancement module is shown in fig. 2 and includes: a fully connected layer for mapping the instance geometric features and the category geometric features into the same feature subspace; a matrix multiplication unit for multiplying the mapped instance geometric features and category geometric features to obtain a structural relation matrix between them; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for weighting the geometric projection features with these coefficients and summing them to obtain structural difference features; and a multi-layer perceptron for fusing the structural difference features with the instance geometric features and the category geometric features respectively, obtaining the enhanced instance geometric features and enhanced category geometric features.
Specifically, the instance geometric features and the category geometric features are mapped into the same feature subspace using fully connected layers, and their structural relation matrix is obtained by matrix multiplication. The structural relation matrix is normalized into weight coefficients, and the geometric projection features are weighted and summed to obtain the structural difference features. Finally, a multi-layer perceptron fuses the original geometric features with the structural difference features to obtain the enhanced geometric features.
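A minimal PyTorch sketch of the module as just described: fully connected projections into a shared subspace, a structural relation matrix from matrix multiplication, normalization into weight coefficients, weighted summation into structural difference features, and MLP fusion. The feature dimensions and the choice of softmax as the normalization are assumptions.

```python
import torch
import torch.nn as nn

class InteractionEnhance(nn.Module):
    def __init__(self, dim=128, sub=64):
        super().__init__()
        self.proj_i = nn.Linear(dim, sub)   # instance features -> shared subspace
        self.proj_c = nn.Linear(dim, sub)   # category features -> shared subspace
        self.mlp_i = nn.Sequential(nn.Linear(dim + sub, dim), nn.ReLU())
        self.mlp_c = nn.Sequential(nn.Linear(dim + sub, dim), nn.ReLU())

    def forward(self, f_inst, f_cat):       # [B, No, dim], [B, Nc, dim]
        pi, pc = self.proj_i(f_inst), self.proj_c(f_cat)
        rel = torch.bmm(pi, pc.transpose(1, 2))  # structural relation matrix [B, No, Nc]
        diff_i = torch.bmm(rel.softmax(dim=2), pc)                  # difference feat. per instance point
        diff_c = torch.bmm(rel.softmax(dim=1).transpose(1, 2), pi)  # difference feat. per prior point
        f_inst_en = self.mlp_i(torch.cat([f_inst, diff_i], dim=2))  # enhanced instance features
        f_cat_en = self.mlp_c(torch.cat([f_cat, diff_c], dim=2))    # enhanced category features
        return f_inst_en, f_cat_en, rel
```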
Step 5, inputting the geometric differences between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into a semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain instance fusion features and category fusion features.
As shown in fig. 4, the object instance point cloud obtained from the object detection and segmentation model may contain noise points. When the influence of these noise points propagates to the category prior, it degrades the reconstruction accuracy of the NOCS model and causes deviations in the correspondence between the object instance point cloud and its NOCS model. To solve this problem, this embodiment designs a semantic dynamic fusion module that improves the network's robustness to noise points by fully fusing geometric and semantic information.
FIG. 3 illustrates the semantic dynamic fusion module. For the enhanced instance geometric features, it adopts a pixel-level fusion strategy, implemented as a corresponding-point fusion module, to explore the intrinsic mapping between data sources and obtain the instance fusion features. For the enhanced category geometric features and the instance semantic features, which come from different individuals, it uses the geometric differences between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features, and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features. Concretely, this embodiment implements the corresponding-point fusion module with a pixel-level fusion strategy following the method in DenseFusion, exploring the intrinsic mapping between data sources. For category geometric features and instance semantic features, which come from different individuals, the pixel-level fusion strategy cannot be used directly because they lack pixel-level correspondence, so this embodiment considers two fusion strategies. The first is the general idea of feature fusion, called direct fusion: the two are concatenated and then fused by an MLP. Although the direct fusion strategy can improve performance by absorbing semantic information, it does not adequately handle the cross-individual problem. This embodiment therefore designs a semantic fusion strategy that dynamically adjusts the instance semantic features according to the structural relation matrix between the instance and the category, and then fuses them with the category geometric features.
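A sketch of the two branches under the above description, reusing the relation matrix from the interaction module: per-point concatenation for the pixel-aligned instance branch, and relation-weighted redistribution of instance semantics onto the prior points for the category branch. Feature widths and MLP shapes are assumptions.

```python
import torch
import torch.nn as nn

class SemanticDynamicFusion(nn.Module):
    def __init__(self, geo_dim=128, sem_dim=64):
        super().__init__()
        self.fuse_inst = nn.Sequential(nn.Linear(geo_dim + sem_dim, geo_dim), nn.ReLU())
        self.fuse_cat = nn.Sequential(nn.Linear(geo_dim + sem_dim, geo_dim), nn.ReLU())

    def forward(self, f_inst_en, f_cat_en, f_sem, rel):
        # Instance branch: semantic and geometric features are pixel-aligned,
        # so corresponding-point (per-point) concatenation suffices.
        inst_fused = self.fuse_inst(torch.cat([f_inst_en, f_sem], dim=2))
        # Category branch: no pixel-level correspondence with instance semantics,
        # so semantics are redistributed onto the prior points via the
        # instance-category structural relation matrix before fusing.
        sem_adj = torch.bmm(rel.softmax(dim=1).transpose(1, 2), f_sem)
        cat_fused = self.fuse_cat(torch.cat([f_cat_en, sem_adj], dim=2))
        return inst_fused, cat_fused
```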
Step 6, sending the category fusion features to a deformation network to obtain a deformation field, and deforming the category prior with the deformation field to obtain the instance NOCS model.
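This step can be sketched as a small per-point decoder that regresses the deformation field from the category fusion features and adds it to the prior point cloud; the decoder width is an assumption.

```python
import torch.nn as nn

deform_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 3))

def reconstruct_nocs(prior_pts, cat_fused):
    """prior_pts: [B, Nc, 3] category prior; cat_fused: [B, Nc, 128] fusion features."""
    deform_field = deform_head(cat_fused)   # per-point 3D offsets
    return prior_pts + deform_field         # instance NOCS model
```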
Step 7, matching the instance NOCS model with the observation point cloud through a matching network, and computing the 6D pose and size of the target object from the similarity.
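Once the matching network has assigned each observed point a NOCS coordinate (e.g., as correspondence-matrix-weighted combinations of the instance NOCS model), recovering the 6D pose and size reduces to solving a similarity transform between the two point sets. A sketch using the standard Umeyama alignment; that per-point correspondences are already available is the assumption here.

```python
import numpy as np

def umeyama_pose(nocs, obs):
    """Solve obs ≈ s * R @ nocs + t for scale s, rotation R, translation t.
    nocs, obs: [N, 3] corresponding points (NOCS model vs. observed cloud)."""
    mu_n, mu_o = nocs.mean(0), obs.mean(0)
    n_c, o_c = nocs - mu_n, obs - mu_o
    cov = o_c.T @ n_c / len(nocs)             # cross-covariance of the two sets
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # avoid reflections
    R = U @ S @ Vt                            # 3D rotation
    s = np.trace(np.diag(D) @ S) * len(nocs) / (n_c ** 2).sum()  # isotropic scale
    t = mu_o - s * R @ mu_n                   # 3D translation
    return s, R, t
```

The object size then follows from the scale s applied to the extent of the instance NOCS model.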
As shown in FIG. 5, of the two sets of bounding boxes around each target object, one is the ground truth and the other is the prediction. Compared with the SPD method, the pose estimated by the method of this embodiment is more accurate; in particular for the camera category (the object indicated by the arrow in the figure), whose intra-category shape variation is relatively large, the estimation result of this embodiment is much better than that of SPD, demonstrating that the method handles intra-category shape variation well.
In summary, the invention exploits the structural differences between the object instance and the category prior to enhance the learning of intra-category shape information; the semantic dynamic fusion module further dynamically adjusts the semantic information according to the geometric relationship between the object instance and the category prior, and then fuses it with the enhanced category prior to dynamically compensate for missing geometric information, improving robustness to noise.
Claims (5)
1. A category-level six-degree-of-freedom object pose estimation method based on structural difference perception, characterized by comprising the following steps:
inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object; obtaining an observation point cloud of an object instance according to the segmentation mask of the target object and the depth map, and selecting a category prior corresponding to the target object based on the observation point cloud of the object instance;
extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features;
inputting the instance geometric features and the category geometric features into an information interaction enhancement module, implicitly modeling the geometric differences between the instance geometric features and the category geometric features through the information interaction enhancement module, and supplementing the instance geometric features and the category geometric features with these differences to obtain enhanced instance geometric features and enhanced category geometric features;
inputting the geometric differences between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into a semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain instance fusion features and category fusion features;
the category fusion features are sent to a deformation network to obtain a deformation field, and the category prior is deformed by using the deformation field to obtain an instance NOCS model;
and matching the instance NOCS model with the observation point cloud through a matching network, and computing the 6D pose and size of the target object from the similarity.
2. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the target detection segmentation network adopts a Mask-RCNN network.
3. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the features of the observation point cloud and the category prior are extracted using a convolutional neural network and a PointNet++ network.
4. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the information interaction enhancement module comprises: a fully connected layer for mapping the instance geometric features and the category geometric features into the same feature subspace; a matrix multiplication unit for multiplying the mapped instance geometric features and category geometric features to obtain a structural relation matrix between them; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for weighting the geometric projection features with these coefficients and summing them to obtain structural difference features; and a multi-layer perceptron for fusing the structural difference features with the instance geometric features and the category geometric features respectively, obtaining the enhanced instance geometric features and enhanced category geometric features.
5. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the semantic dynamic fusion module adopts a pixel-level fusion strategy, implemented as a corresponding-point fusion module, for the enhanced instance geometric features, exploring the intrinsic mapping between data sources to obtain the instance fusion features; and, for the enhanced category geometric features and the instance semantic features of different individuals, uses the geometric differences between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features, and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310052012.8A (granted as CN116245940B) | 2023-02-02 | 2023-02-02 | Category-level six-degree-of-freedom object pose estimation method based on structure difference perception |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN116245940A (application) | 2023-06-09 |
| CN116245940B (grant) | 2024-04-05 |

Family ID: 86634232

Family Applications (1)

| Application Number | Priority Date | Filing Date | Country | Status |
|---|---|---|---|---|
| CN202310052012.8A | 2023-02-02 | 2023-02-02 | CN | Active (granted as CN116245940B) |
Citations (11)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5850469A | 1996-07-09 | 1998-12-15 | General Electric Company | Real time tracking of camera pose |
| CN110119148A | 2019-05-14 | 2019-08-13 | Shenzhen University | Six-degree-of-freedom pose estimation method, device and computer-readable storage medium |
| CN112767478A | 2021-01-08 | 2021-05-07 | Beihang University | Appearance-guidance-based six-degree-of-freedom pose estimation method |
| CN113393503A | 2021-05-24 | 2021-09-14 | Hunan University | Classification-driven shape-prior-deformation category-level object 6D pose estimation method |
| CN114299150A | 2021-12-31 | 2022-04-08 | Hebei University of Technology | Deep 6D pose estimation network model and workpiece pose estimation method |
| KR20220065234A | 2020-11-13 | 2022-05-20 | Plaif Co., Ltd. | Apparatus and method for estimating 6D pose |
| KR20220088289A | 2020-12-18 | 2022-06-27 | Samsung Electronics Co., Ltd. | Apparatus and method for estimating object pose |
| CN114863573A | 2022-07-08 | 2022-08-05 | Southeast University | Category-level 6D pose estimation method based on monocular RGB-D images |
| US20220292698A1 | 2021-03-11 | 2022-09-15 | Fudan University | Network and System for Pose and Size Estimation |
| CN115187748A | 2022-07-14 | 2022-10-14 | Xiangtan University | Category-level object centroid and pose estimation based on point clouds |
| US20220362945A1 | 2021-05-14 | 2022-11-17 | Industrial Technology Research Institute | Object pose estimation system, execution method thereof and graphic user interface |
Non-Patent Citations (3)

- LU ZOU et al., "6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning", IEEE
- MENG TIAN et al., "Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation", arXiv:2007.08454v1
- SANG Hanbo et al., "Category-level 6D pose estimation based on deep 3D model representation", Journal of Communication University of China (Natural Science Edition)
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |