CN116245940A - Category-level six-degree-of-freedom object pose estimation method based on structure difference perception

Category-level six-degree-of-freedom object pose estimation method based on structure difference perception

Info

Publication number
CN116245940A
Authority
CN
China
Prior art keywords
category
instance
geometric
features
geometric features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310052012.8A
Other languages
Chinese (zh)
Other versions
CN116245940B (en)
Inventor
李嘉茂
李国威
朱冬晨
张广慧
石文君
张晓林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Microsystem and Information Technology of CAS
Original Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Microsystem and Information Technology of CAS
Priority to CN202310052012.8A
Publication of CN116245940A
Application granted
Publication of CN116245940B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception, which comprises the following steps: inputting the depth map into a target detection and segmentation network, obtaining the observation point cloud of an object instance according to the detection result, and selecting the category prior corresponding to the target object based on the observation point cloud of the object instance; extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features; inputting the instance geometric features and the category geometric features into an information interaction enhancement module to obtain enhanced instance geometric features and enhanced category geometric features; fusing semantic and geometric information through a semantic dynamic fusion module to obtain instance fusion features and category fusion features; obtaining an instance NOCS model based on the category fusion features; and matching the instance NOCS model with the observation point cloud through a matching network, and calculating the 6D pose and size of the target object according to the similarity. The method can improve the accuracy of 6D pose estimation.

Description

Category-level six-degree-of-freedom object pose estimation method based on structure difference perception
Technical Field
The invention relates to the technical field of computer vision, in particular to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception.
Background
Estimating the six-degree-of-freedom (6D) pose of a real object from an image, i.e., estimating the position and orientation of the object in the camera coordinate system, is a critical task; the pose consists of a three-dimensional rotation matrix and a three-dimensional translation vector. Object 6D pose estimation is widely used in many real scenes, such as 3D scene understanding, robotic grasping, virtual reality and augmented reality. According to the level of the estimated object, the 6D pose estimation task can be divided into two categories: 1. instance-level 6D pose estimation for a particular object; 2. category-level 6D pose estimation for objects of the same category. Instance-level 6D pose estimation needs to know the object's position in the world coordinate system in advance when calculating the pose, and the origin of the world coordinate system generally falls at the center of the object, i.e., at the center of its CAD model. For a new object in a real scene without a defined CAD model, instance-level 6D pose estimation algorithms have no way to estimate the pose, which severely limits their application in real scenes. Therefore, to break this limitation, the category-level 6D pose estimation task was proposed; it can estimate the 6D pose of different object instances of the same category, even if some instances have no CAD model.
Wang et al. first proposed the concept of the category-level object 6D pose estimation task. To address the lack of CAD models when estimating the 6D pose of an object, they introduced the Normalized Object Coordinate Space (NOCS), a canonical representation shared by all possible object instances of a category: the object instance is first reconstructed in NOCS, and the pose transformation of the object instance from NOCS space to the camera coordinate system, i.e., the 6D pose of the object, is then calculated. Because different object instances of the same category may have large structural differences, reconstructing their NOCS models is very difficult, and this is the main challenge of category-level object 6D pose estimation. To address this problem, SPD proposes to learn a category prior for each category, deform the category prior according to the specific object instance, and thereby reconstruct the NOCS model of the object instance, which further increases the accuracy of pose estimation; however, the ambiguity of the category prior information makes the reconstructed NOCS model inaccurate.
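For reference, the quantity being estimated can be written compactly: NOCS-based methods recover the similarity transform that maps a point x_nocs of the reconstructed instance model in normalized object coordinate space to its observed counterpart x_cam in the camera coordinate system (the notation here is ours, not the patent's):

    x_cam = s * R * x_nocs + t,   with R in SO(3), t in R^3, s > 0,

where the rotation R and translation t form the 6D pose and the scale s gives the object size.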
Disclosure of Invention
The invention aims to provide a category-level six-degree-of-freedom object pose estimation method based on structural difference perception, which can improve the accuracy of 6D pose estimation.
The technical solution adopted by the invention to solve the above technical problem is as follows: a category-level six-degree-of-freedom object pose estimation method based on structural difference perception is provided, which comprises the following steps:
inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object;
obtaining an observation point cloud of an object instance according to the segmentation mask of the target object and the depth map, and selecting a category prior corresponding to the target object based on the observation point cloud of the object instance;
extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features;
inputting the instance geometric features and the category geometric features into an information interaction enhancement module, implicitly modeling geometric differences between the instance geometric features and the category geometric features through the information interaction enhancement module, and supplementing the instance geometric features and the category geometric features to obtain enhanced instance geometric features and category geometric features;
inputting the geometric difference between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into a semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain instance fusion features and category fusion features;
the category fusion features are sent to a deformation network to obtain a deformation field, and the category prior is deformed by using the deformation field to obtain an instance NOCS model;
and matching the example NOCS model with the observation point cloud through a matching network, and calculating according to the similarity to obtain the 6D pose and the size of the target object.
The target detection segmentation network adopts a Mask-RCNN network.
The features of the observation point cloud and the category prior are extracted by using a convolutional neural network and a PointNet++ network.
The information interaction enhancement module comprises: a fully connected layer for mapping the instance geometric features and the category geometric features to the same feature subspace, respectively; a matrix multiplication unit for performing a matrix multiplication operation on the instance geometric features and the category geometric features mapped to the same feature subspace to obtain a structural relation matrix between the instance geometric features and the category geometric features; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for performing a weighted summation of the projected geometric features with the weight coefficients to obtain the geometric difference between the instance geometric features and the category geometric features; and a multi-layer perceptron for fusing the geometric difference with the instance geometric features and with the category geometric features, respectively, to obtain the enhanced instance geometric features and the enhanced category geometric features.
For the enhanced instance geometric features, the semantic dynamic fusion module applies a pixel-level fusion strategy implemented by a corresponding-point fusion module to explore the intrinsic mapping between data sources and obtain the instance fusion features; for the enhanced category geometric features and the instance semantic features, which come from different individuals, it uses the geometric difference between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features, and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features.
Advantageous effects
Due to the adoption of the above technical solution, compared with the prior art, the invention has the following advantages and positive effects: the invention uses the structural difference between the object instance and the category prior to enhance the learning of intra-category shape information, dynamically adjusts the semantic information according to the geometric relationship between the object instance and the category prior through the semantic dynamic fusion module, and then fuses it with the enhanced category prior to dynamically compensate for missing geometric information, thereby improving robustness to noise.
Drawings
FIG. 1 is a flow chart of a category-level six-degree-of-freedom object pose estimation method based on structural difference perception in an embodiment of the invention;
FIG. 2 is a schematic diagram of an information interaction enhancement module in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a semantic dynamic fusion module in an embodiment of the present invention;
FIG. 4 is a schematic view of the observation point clouds of different object instances;
FIG. 5 is a comparison of the results of an embodiment of the present invention and the SPD method.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a category-level six-degree-of-freedom object pose estimation method based on structural difference perception. As shown in fig. 1, the method comprises the following steps:
and step 1, inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object. In this step, an existing object detection segmentation network may be used to obtain the image block of the object and its segmentation Mask, for example, a Mask-RCNN network may be used.
Step 2: obtaining the observation point cloud of the object instance according to the segmentation mask of the target object and the depth map, and selecting the category prior corresponding to the target object based on the observation point cloud of the object instance.
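A sketch of how the observation point cloud can be obtained by back-projecting the masked depth pixels through the camera intrinsics; the intrinsic parameters fx, fy, cx, cy, the depth scale and the category_priors lookup are illustrative assumptions, not values given in the patent:

```python
import numpy as np

def backproject(depth, mask, fx, fy, cx, cy, depth_scale=0.001):
    """depth: (H, W) raw depth map; mask: (H, W) boolean segmentation mask.
    Returns the observation point cloud of the object instance, shape (N, 3)."""
    v, u = np.nonzero(mask & (depth > 0))              # pixels inside the mask with valid depth
    z = depth[v, u].astype(np.float32) * depth_scale   # depth in metres (scale assumed)
    x = (u - cx) * z / fx                              # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# The category prior can then be selected by the detected class label, e.g. a
# pre-learned mean-shape point cloud per category (assumed data layout):
# prior = category_priors[int(label)]                  # (M, 3) canonical prior point cloud
```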
Step 3: extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features. In this step, a convolutional neural network and a PointNet++ network can be used to extract image semantic features and point cloud geometric features, respectively, so as to obtain the instance geometric features and the category geometric features.
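The backbones themselves are standard; as a rough illustration of the tensor shapes used by the later modules, the sketch below uses a shared per-point MLP as a simplified stand-in for the PointNet++ network named in the patent (dimensions are assumptions):

```python
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """Simplified per-point encoder (shared MLP over points); a stand-in for
    the PointNet++ backbone, shown only to fix the feature shapes."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, out_dim, 1),
        )

    def forward(self, xyz):                    # xyz: (B, N, 3) point cloud
        feat = self.mlp(xyz.transpose(1, 2))   # (B, C, N)
        return feat.transpose(1, 2)            # (B, N, C) per-point geometric features

# instance geometric features: PointEncoder()(observed_points)   -> (B, N, C)
# category geometric features: PointEncoder()(category_prior)    -> (B, M, C)
```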
Step 4: inputting the instance geometric features and the category geometric features into the information interaction enhancement module, implicitly modeling the geometric difference between the instance geometric features and the category geometric features through the information interaction enhancement module, and supplementing the instance geometric features and the category geometric features to obtain the enhanced instance geometric features and the enhanced category geometric features.
The information interaction enhancement module in this step aims to learn the structural relationship between the instance point cloud and the category prior so as to build their structural difference information at the feature level. It uses the structural difference features to supplement the original geometric features, so that the enhanced geometric features contain both the unique individuality of the instance structure and the general commonality of the category prior. On the one hand, thanks to the complement of instance structural characteristics, the enhanced category geometric features can reconstruct a more accurate instance NOCS model. On the other hand, the instance geometric features gain the commonality of the category shape, so the reconstructed correspondence matrix can better associate the observation point cloud with the NOCS model. In addition, since the geometric difference between the category prior and different instances of the same category varies, the information interaction enhancement module can adapt to instances of previously unseen shapes, which greatly increases the generalization ability of this embodiment.
The structure of the information interaction enhancement module is shown in FIG. 2, and it comprises: a fully connected layer for mapping the instance geometric features and the category geometric features to the same feature subspace, respectively; a matrix multiplication unit for performing a matrix multiplication operation on the instance geometric features and the category geometric features mapped to the same feature subspace to obtain a structural relation matrix between the instance geometric features and the category geometric features; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for performing a weighted summation of the projected geometric features with the weight coefficients to obtain the geometric difference between the instance geometric features and the category geometric features; and a multi-layer perceptron for fusing the geometric difference with the instance geometric features and with the category geometric features, respectively, to obtain the enhanced instance geometric features and the enhanced category geometric features.
Thus, the instance geometric features and the category geometric features can be mapped to the same feature subspace using the fully connected layer, and their structural relation matrix is then obtained by a matrix multiplication operation. The structural relation matrix is normalized into weight coefficients, and a weighted summation of the projected geometric features yields the structural difference features. Finally, a multi-layer perceptron fuses the original geometric features with the structural difference features to obtain the enhanced geometric features.
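A minimal sketch of this module following the description above; the layer widths, the use of softmax as the normalization, and the exact fusion MLPs are assumptions, since the patent does not fix them:

```python
import torch
import torch.nn as nn

class InfoInteractionEnhance(nn.Module):
    """Project both feature sets into one subspace, build the structural
    relation matrix by matrix multiplication, normalise it into weights,
    take a weighted sum to get structural-difference features, and fuse
    them back with the original features through an MLP."""
    def __init__(self, dim=128):
        super().__init__()
        self.proj_inst = nn.Linear(dim, dim)   # fully connected mapping (instance branch)
        self.proj_cat = nn.Linear(dim, dim)    # fully connected mapping (category branch)
        self.fuse_inst = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.fuse_cat = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, f_inst, f_cat):          # (B, N, C), (B, M, C)
        p_inst, p_cat = self.proj_inst(f_inst), self.proj_cat(f_cat)
        rel = torch.bmm(p_inst, p_cat.transpose(1, 2))        # (B, N, M) structural relation matrix
        w_inst = torch.softmax(rel, dim=2)                    # weights over category points
        w_cat = torch.softmax(rel.transpose(1, 2), dim=2)     # weights over instance points
        diff_inst = torch.bmm(w_inst, p_cat)                  # (B, N, C) structural difference features
        diff_cat = torch.bmm(w_cat, p_inst)                   # (B, M, C)
        enh_inst = self.fuse_inst(torch.cat([f_inst, diff_inst], dim=2))
        enh_cat = self.fuse_cat(torch.cat([f_cat, diff_cat], dim=2))
        return enh_inst, enh_cat, rel
```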
Step 5: inputting the geometric difference between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into the semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain the instance fusion features and the category fusion features.
As shown in FIG. 4, the object instance point cloud obtained through the target detection and segmentation model may contain a certain number of noise points. When the influence of these noise points is transferred to the category prior, it has, in theory, a negative effect on the reconstruction accuracy of the NOCS model, causing deviations in the correspondence between the object instance point cloud and its NOCS model. To solve this problem, this embodiment designs a semantic dynamic fusion module, which improves the robustness of the network to noise points by fully fusing geometric and semantic information.
FIG. 3 illustrates the semantic dynamic fusion module. For the enhanced instance geometric features, it applies a pixel-level fusion strategy implemented by a corresponding-point fusion module to explore the intrinsic mapping between data sources and obtain the instance fusion features; for the enhanced category geometric features and the instance semantic features, which come from different individuals, it uses the geometric difference between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features. That is, this embodiment follows the approach of DenseFusion and uses a pixel-level fusion strategy, implemented as a corresponding-point fusion module, to explore the intrinsic mapping between data sources. For the category geometric features and the instance semantic features, which come from different individuals, the pixel-level fusion strategy cannot be used directly because they lack pixel-level correspondence, so this embodiment considers two different fusion strategies. The first is the common idea of feature fusion, referred to as direct fusion, which concatenates the two and fuses them with an MLP. Although the direct fusion strategy can improve performance by absorbing semantic information, it still does not adequately handle the cross-individual problem. Therefore, this embodiment also designs a semantic fusion strategy that dynamically adjusts the instance semantic features according to the structural relation matrix between the instance and the category, and then fuses them with the category geometric features.
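A sketch of the two fusion paths described above; the feature dimensions, the MLP widths, and the use of the normalised structural relation matrix as the adjustment weights are assumptions consistent with the description, not exact details from the patent:

```python
import torch
import torch.nn as nn

class SemanticDynamicFusion(nn.Module):
    """Instance branch: per-point (pixel-level) fusion of enhanced instance
    geometric features with instance semantic features. Category branch: the
    instance semantic features are re-weighted by the instance/category
    structural relation matrix, then fused with the enhanced category features."""
    def __init__(self, geo_dim=128, sem_dim=128, out_dim=256):
        super().__init__()
        self.inst_fuse = nn.Sequential(nn.Linear(geo_dim + sem_dim, out_dim), nn.ReLU(), nn.Linear(out_dim, out_dim))
        self.cat_fuse = nn.Sequential(nn.Linear(geo_dim + sem_dim, out_dim), nn.ReLU(), nn.Linear(out_dim, out_dim))

    def forward(self, enh_inst, enh_cat, sem_inst, rel):
        # enh_inst: (B, N, Cg)  enh_cat: (B, M, Cg)  sem_inst: (B, N, Cs)  rel: (B, N, M)
        inst_fused = self.inst_fuse(torch.cat([enh_inst, sem_inst], dim=2))  # corresponding-point fusion
        attn = torch.softmax(rel.transpose(1, 2), dim=2)                     # (B, M, N) adjustment weights
        sem_adjusted = torch.bmm(attn, sem_inst)                             # (B, M, Cs) dynamically adjusted semantics
        cat_fused = self.cat_fuse(torch.cat([enh_cat, sem_adjusted], dim=2))
        return inst_fused, cat_fused
```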
Step 6: sending the category fusion features to a deformation network to obtain a deformation field, and deforming the category prior using the deformation field to obtain the instance NOCS model.
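A short sketch of this step, assuming the deformation network is a small per-point MLP decoder that predicts an offset for every prior point (an assumption; the patent only states that a deformation field is predicted and applied to the prior):

```python
import torch
import torch.nn as nn

class DeformationHead(nn.Module):
    """Decode the category fusion features into a per-point deformation field
    of shape (M, 3) and add it to the category prior to obtain the instance
    NOCS model."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 3))

    def forward(self, cat_fused, prior):       # cat_fused: (B, M, C), prior: (B, M, 3)
        deform_field = self.mlp(cat_fused)     # (B, M, 3) per-point offsets
        return prior + deform_field            # instance NOCS model
```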
Step 7: matching the instance NOCS model with the observation point cloud through a matching network, and calculating the 6D pose and size of the target object according to the similarity.
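One common way to turn the output of such a matching step into a pose is to pair every observed point with its (soft) NOCS coordinate and solve for the similarity transform with the Umeyama algorithm; the sketch below shows that final solve (the correspondence-matrix layout in the trailing comment is an assumption):

```python
import numpy as np

def umeyama_similarity(nocs_pts, cam_pts):
    """Recover scale s, rotation R and translation t with s * R @ nocs + t ~= cam
    (standard Umeyama alignment between corresponding point sets)."""
    mu_n, mu_c = nocs_pts.mean(0), cam_pts.mean(0)
    xn, xc = nocs_pts - mu_n, cam_pts - mu_c
    cov = xc.T @ xn / len(nocs_pts)                  # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:     # guard against reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_n = (xn ** 2).sum() / len(nocs_pts)
    s = np.trace(np.diag(D) @ S) / var_n
    t = mu_c - s * R @ mu_n
    return s, R, t

# The correspondence matrix predicted by the matching network assigns each
# observed point a (soft) NOCS coordinate; those pairs are what is solved above:
# nocs_pts = correspondence_matrix @ instance_nocs_model   # (N, 3), assumed layout
# s, R, t = umeyama_similarity(nocs_pts, observed_points)
```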
As shown in FIG. 5, of the two sets of bounding boxes drawn around each target object, one is the ground truth and the other is the prediction. Compared with the SPD method, the pose estimated by the method of this embodiment is more accurate; in particular for the camera category (the object pointed to by the arrow in the figure), whose shape varies considerably within the category, the result of this embodiment is much better than that of the SPD method, which shows that the method of this embodiment handles intra-category shape variation well.
It can be seen that the invention uses the structural difference between the object instance and the category prior to enhance the learning of intra-category shape information, dynamically adjusts the semantic information according to the geometric relationship between the object instance and the category prior through the semantic dynamic fusion module, and then fuses it with the enhanced category prior to dynamically compensate for missing geometric information, thereby improving robustness to noise.

Claims (5)

1. A category-level six-degree-of-freedom object pose estimation method based on structural difference perception, characterized by comprising the following steps:
inputting the depth map into a target detection segmentation network to obtain an image block of a target object and a segmentation mask of the target object; obtaining an observation point cloud of an object instance according to the segmentation mask of the target object and the depth map, and selecting a category prior corresponding to the target object based on the observation point cloud of the object instance;
extracting features from the observation point cloud and the category prior to obtain instance geometric features and category geometric features;
inputting the instance geometric features and the category geometric features into an information interaction enhancement module, implicitly modeling geometric differences between the instance geometric features and the category geometric features through the information interaction enhancement module, and supplementing the instance geometric features and the category geometric features to obtain enhanced instance geometric features and category geometric features;
inputting the geometric difference between the instance geometric features and the category geometric features, together with the enhanced instance geometric features and the enhanced category geometric features, into a semantic dynamic fusion module, and fusing semantic and geometric information through the semantic dynamic fusion module to obtain instance fusion features and category fusion features;
the category fusion features are sent to a deformation network to obtain a deformation field, and the category prior is deformed by using the deformation field to obtain an instance NOCS model;
and matching the example NOCS model with the observation point cloud through a matching network, and calculating according to the similarity to obtain the 6D pose and the size of the target object.
2. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the target detection segmentation network adopts a Mask-RCNN network.
3. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the features of the observation point cloud and the category prior are extracted by using a convolutional neural network and a PointNet++ network.
4. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein the information interaction enhancement module comprises: a fully connected layer for mapping the instance geometric features and the category geometric features to the same feature subspace, respectively; a matrix multiplication unit for performing a matrix multiplication operation on the instance geometric features and the category geometric features mapped to the same feature subspace to obtain a structural relation matrix between the instance geometric features and the category geometric features; a normalization unit for normalizing the structural relation matrix into weight coefficients; a weighted summation unit for performing a weighted summation of the projected geometric features with the weight coefficients to obtain the geometric difference between the instance geometric features and the category geometric features; and a multi-layer perceptron for fusing the geometric difference with the instance geometric features and with the category geometric features, respectively, to obtain the enhanced instance geometric features and the enhanced category geometric features.
5. The category-level six-degree-of-freedom object pose estimation method based on structural difference perception according to claim 1, wherein, for the enhanced instance geometric features, the semantic dynamic fusion module applies a pixel-level fusion strategy implemented by a corresponding-point fusion module to explore the intrinsic mapping between data sources and obtain the instance fusion features; and, for the enhanced category geometric features and the instance semantic features, which come from different individuals, the semantic dynamic fusion module uses the geometric difference between the instance geometric features and the category geometric features to dynamically adjust the instance semantic features, and fuses the adjusted features with the enhanced category geometric features to obtain the category fusion features.
CN202310052012.8A 2023-02-02 2023-02-02 Category-level six-degree-of-freedom object pose estimation method based on structure difference perception Active CN116245940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310052012.8A CN116245940B (en) 2023-02-02 2023-02-02 Category-level six-degree-of-freedom object pose estimation method based on structure difference perception


Publications (2)

Publication Number Publication Date
CN116245940A true CN116245940A (en) 2023-06-09
CN116245940B CN116245940B (en) 2024-04-05

Family

ID=86634232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310052012.8A Active CN116245940B (en) 2023-02-02 2023-02-02 Category-level six-degree-of-freedom object pose estimation method based on structure difference perception

Country Status (1)

Country Link
CN (1) CN116245940B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5850469A (en) * 1996-07-09 1998-12-15 General Electric Company Real time tracking of camera pose
CN110119148A (en) * 2019-05-14 2019-08-13 深圳大学 A kind of six-degree-of-freedom posture estimation method, device and computer readable storage medium
CN112767478A (en) * 2021-01-08 2021-05-07 北京航空航天大学 Appearance guidance-based six-degree-of-freedom pose estimation method
CN113393503A (en) * 2021-05-24 2021-09-14 湖南大学 Classification-driven shape prior deformation category-level object 6D pose estimation method
CN114299150A (en) * 2021-12-31 2022-04-08 河北工业大学 Depth 6D pose estimation network model and workpiece pose estimation method
KR20220065234A (en) * 2020-11-13 2022-05-20 주식회사 플라잎 Apparatus and method for estimating of 6d pose
KR20220088289A (en) * 2020-12-18 2022-06-27 삼성전자주식회사 Apparatus and method for estimating object pose
CN114863573A (en) * 2022-07-08 2022-08-05 东南大学 Category-level 6D attitude estimation method based on monocular RGB-D image
US20220292698A1 (en) * 2021-03-11 2022-09-15 Fudan University Network and System for Pose and Size Estimation
CN115187748A (en) * 2022-07-14 2022-10-14 湘潭大学 Centroid and pose estimation of object based on category level of point cloud
US20220362945A1 (en) * 2021-05-14 2022-11-17 Industrial Technology Research Institute Object pose estimation system, execution method thereof and graphic user interface


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LU ZOU et al.: "6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning", IEEE
MENG TIAN et al.: "Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation", arXiv:2007.08454v1
SANG Hanbo et al.: "Category-level 6D pose estimation based on deep 3D model representation", Journal of Communication University of China (Natural Science Edition)

Also Published As

Publication number Publication date
CN116245940B (en) 2024-04-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant