CN114972525B - Robot grabbing and augmented reality-oriented space target attitude estimation method - Google Patents

Robot grabbing and augmented reality-oriented space target attitude estimation method

Info

Publication number
CN114972525B
CN114972525B (Application CN202210422447.2A)
Authority
CN
China
Prior art keywords
points
feature
camera
scale
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210422447.2A
Other languages
Chinese (zh)
Other versions
CN114972525A (en)
Inventor
吴鹏
王俊骁
王晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Sci Tech University ZSTU
Original Assignee
Zhejiang Sci Tech University ZSTU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Sci Tech University ZSTU filed Critical Zhejiang Sci Tech University ZSTU
Priority to CN202210422447.2A priority Critical patent/CN114972525B/en
Publication of CN114972525A publication Critical patent/CN114972525A/en
Application granted granted Critical
Publication of CN114972525B publication Critical patent/CN114972525B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 7/00: Image analysis
                    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
                • G06T 2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T 2207/10: Image acquisition modality
                        • G06T 2207/10024: Color image
                    • G06T 2207/20: Special algorithmic details
                        • G06T 2207/20081: Training; Learning
                        • G06T 2207/20084: Artificial neural networks [ANN]
            • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00: Computing arrangements based on biological models
                    • G06N 3/02: Neural networks
                        • G06N 3/04: Architecture, e.g. interconnection topology
                            • G06N 3/045: Combinations of networks
                            • G06N 3/048: Activation functions
                        • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a six-degree-of-freedom pose estimation method for spatial targets, oriented to robot grasping and augmented reality. Motivated by the need of augmented reality and collaborative-robot technology for six-degree-of-freedom pose information of spatial targets, the method is proposed and realized on the basis of a deep fully convolutional network combined with the multi-scale bounding box of the target's 3D model.

Description

Robot grabbing and augmented reality-oriented space target attitude estimation method
Technical Field
The invention relates to the fields of augmented reality, robot grasping and the like, and in particular to a six-degree-of-freedom pose estimation method based on the multi-scale bounding box of a spatial target.
Background
In fields such as augmented reality and collaborative robot grasping, the spatial pose of an object is indispensable information. In the traditional approach, point clouds of the scene and the object are acquired with a laser camera, and the relative pose of the object is obtained by point cloud registration. However, point cloud data are bulky and redundant and place demands on the acquisition equipment, so this approach struggles to meet lightweight, fast task requirements.
Disclosure of Invention
Therefore, the invention provides an RGB-image-based six-degree-of-freedom pose estimation method that requires no additional acquisition equipment and delivers stable and fast six-degree-of-freedom pose estimation.
A spatial target pose estimation method oriented to robot grasping and augmented reality comprises the following steps:
Step 1, calibrating the camera to obtain its intrinsic parameters, computing the 3D model of the object to obtain its multi-scale bounding boxes, and mapping the multi-scale bounding boxes onto the 2D image through the camera intrinsic matrix;
Step 2, taking the corner points that make up the multi-scale bounding boxes as feature points, and training a fully convolutional neural network to detect and locate these feature points; the network takes an RGB image as input and outputs Gaussian heat maps of the feature points;
Step 3, performing non-maximum suppression on the Gaussian heat maps output by the neural network to obtain concrete two-dimensional feature point coordinates;
Step 4, recovering the six-degree-of-freedom pose of the spatial target from the 2D-3D feature point correspondences through an improved EPnP algorithm, thereby providing a basis for subsequent grasping work;
The specific implementation of step 1 comprises the following sub-steps:
Step 1.1, obtaining the RGB camera intrinsic parameters through chessboard calibration;
Step 1.2, computing the maximum and minimum values of the object's 3D model along the x, y and z axes in the object coordinate system, thereby obtaining the bounding box at the object's original scale; computing the mid-point between the maximum and minimum on each axis, and multiplying the length on each axis (maximum minus minimum) by a coefficient to obtain bounding boxes of the 3D model at different scales;
Step 1.3, projecting the 3D multi-scale bounding boxes of the object onto pictures of different scenes through the camera intrinsic matrix and the pose information of the object, the calculation being:
(u′, v′, z′)ᵀ = R·(u, v, z)ᵀ + T
x=u′×fx÷z′+Cx
y=v′×fy÷z′+Cy
wherein u, v, z denote the x-, y- and z-axis coordinates in the 3D object frame, R denotes the rotation matrix, T denotes the translation vector, fx and fy denote the camera focal lengths along the x and y axes, Cx and Cy denote the camera principal point, all camera parameters being in pixel units, and x, y denote the coordinates in the 2D image;
The specific implementation of step 2 comprises the following sub-steps:
Step 2.1, extracting image features through modular convolutions to obtain feature maps, laying the foundation for subsequent feature point detection;
Step 2.2, performing further feature extraction with an attention mechanism module while keeping the feature map size and channel count unchanged, then applying three consecutive modular convolutions, and extracting features once more with a second attention mechanism module;
Step 2.3, reducing the channel dimension of the resulting feature map through semantic embedding modules and mapping the feature point probability distribution information onto n heat maps, where n is the number of scales multiplied by 8;
The specific implementation of step 3 comprises the following sub-steps:
Step 3.1, applying a 3x3 convolution to the Gaussian heat maps so that each pixel carries information from its neighbouring points;
Step 3.2, performing non-maximum suppression on the resulting Gaussian heat maps, converting the feature point probability distributions into concrete coordinate points;
The specific implementation of step 4 comprises the following sub-steps:
Step 4.1, adjusting the positions of the feature points using the parallel relations among the multi-scale bounding box edges, reducing the error of the neural network;
Step 4.2, exploiting the fact that line segments in the 3D model and the corresponding line segments in the 2D image share the same proportional relationship, and expanding the feature points through this equal-ratio relation between 2D and 3D segments, thereby reducing the influence of precision loss;
Step 4.3, randomly sampling the expanded feature point set, solving the pose with a PnP algorithm each time n feature points are sampled, repeating this process m times, computing the Euclidean distances between each of the m pose results and the others, and selecting the result with the smallest distance as the final result.
Aiming at the requirements of augmented reality and collaborative-robot technology for six-degree-of-freedom pose information of spatial targets, the invention proposes and realizes a six-degree-of-freedom pose estimation method for spatial targets based on a deep fully convolutional network combined with the multi-scale bounding box of the target's 3D model. The method is robust, accurate and fast enough for practical processing, and can be applied in scenarios such as collaborative robot grasping and augmented reality.
Drawings
FIG. 1 is a flow chart of the attitude estimation of the present invention.
Fig. 2 is a schematic view of camera calibration and bounding box calculation and projection.
Fig. 3 is a diagram of a neural network.
Fig. 4 is a block diagram of a convolution.
Fig. 5 is a block diagram of an attention mechanism module.
Fig. 6 is a structural diagram of a semantic embedding module.
Fig. 7 is a schematic diagram of a modified EPnP algorithm.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention; all other embodiments obtained by those skilled in the art from the described embodiments without inventive effort fall within the scope of the invention.
The pose estimation steps of this embodiment are shown in FIG. 1. The embodiment can be applied to six-degree-of-freedom pose estimation of spatial targets in augmented reality and in robot-arm grasping, and proceeds as follows:
Step 1, calibrating the camera to obtain its intrinsic parameters, computing the 3D model of the object to obtain its multi-scale bounding boxes, and mapping the multi-scale bounding boxes onto the 2D image through the camera intrinsic matrix 101;
In this example, accurate camera intrinsic parameters are obtained by chessboard calibration, the multi-scale bounding boxes of the 3D model are computed, and the bounding boxes are mapped onto the 2D images using the intrinsic parameters and the pose information, thereby completing the annotation of the dataset. The steps are as follows:
Step 1.1, a standard 9x7 calibration chessboard 202 is printed and used to calibrate the camera 201, yielding the intrinsic matrix 203;
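As an illustration of step 1.1, the following sketch calibrates a camera from chessboard images with OpenCV. The image folder, the square size and the interpretation of 9x7 as inner corners are assumptions for illustration, not details taken from the patent.

```python
# A minimal calibration sketch, assuming a 9x7 inner-corner chessboard and a
# set of captured images; square_size and the image folder are illustrative.
import glob
import cv2
import numpy as np

pattern = (9, 7)                       # inner corners per row/column (assumed)
square_size = 0.025                    # chessboard square edge in metres (assumed)

# 3D corner template on the Z = 0 plane of the board
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):  # hypothetical image folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 intrinsic matrix containing fx, fy, Cx, Cy (in pixels)
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```

K then plays the role of the intrinsic matrix 203 used in the projections below.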
Step 1.2, the multi-scale bounding boxes 205 of the 3D model 204 are computed. The point cloud of the 3D model is first read and its maximum and minimum values along the x, y and z axes are computed; the corners of the bounding box at the original scale are:
(xmax,ymax,zmax), (xmax,ymax,zmin), (xmax,ymin,zmax), (xmax,ymin,zmin), (xmin,ymax,zmax), (xmin,ymax,zmin), (xmin,ymin,zmax), (xmin,ymin,zmin). The max-min differences and mid-points of the three axes are then computed, and the corners of a bounding box at a given scale are calculated as follows (a code sketch is given after the formulas):
x=(xmid±lengthx×scale_factor)
y=(ymid±lengthy×scale_factor)
z=(zmid±lengthz×scale_factor)
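The scale formulas above can be written as a short sketch. The concrete scale factors are illustrative, and the factor of 1/2 is folded into the code so that a scale factor of 1.0 reproduces the original-scale box; both choices are assumptions rather than values fixed by the patent.

```python
# A sketch of the multi-scale bounding box of step 1.2, assuming the model is
# available as an N x 3 point array and an illustrative list of scale factors.
import numpy as np
from itertools import product

def multi_scale_bbox(points, scale_factors=(1.0, 1.25, 1.5)):
    """Return a dict scale_factor -> (8, 3) array of box corner points."""
    p_min, p_max = points.min(axis=0), points.max(axis=0)
    mid = (p_min + p_max) / 2.0
    length = p_max - p_min                      # per-axis extent (max - min)
    boxes = {}
    for s in scale_factors:
        half = length * s / 2.0                 # scaled half-extent per axis
        corners = np.array([mid + half * np.array(sign)
                            for sign in product((-1, 1), repeat=3)])
        boxes[s] = corners                      # 8 corners per scale
    return boxes
```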
Step 1.3, the multi-scale bounding boxes 205 of the 3D model are mapped into the 2D image 207 at the corresponding pose 206 using the camera imaging model, completing the annotation of the dataset.
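The projection of step 1.3 corresponds to the following minimal sketch, matching the projection formulas given in step 1.3 above and assuming the pose is given as a rotation matrix R and translation vector T:

```python
# A sketch of the projection in step 1.3: the 3D corners are transformed by the
# object pose (R, T) and projected with the intrinsics fx, fy, Cx, Cy.
import numpy as np

def project_points(pts_3d, R, T, fx, fy, cx, cy):
    """pts_3d: (N, 3) model-frame points; R: (3, 3); T: (3,). Returns (N, 2) pixels."""
    cam = pts_3d @ R.T + T              # (u', v', z') in the camera frame
    u, v, z = cam[:, 0], cam[:, 1], cam[:, 2]
    x = u * fx / z + cx                 # x = u' * fx / z' + Cx
    y = v * fy / z + cy                 # y = v' * fy / z' + Cy
    return np.stack([x, y], axis=1)
```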
Step 2, taking the corner points that make up the multi-scale bounding boxes as feature points, a fully convolutional neural network is trained to detect and locate these feature points; the network takes an RGB image as input and outputs Gaussian heat maps 102 of the feature points;
Step 2 is implemented as follows:
Step 2.1, an RGB image 301 of size 640x480 is fed into the first modular convolution 302, which outputs an 8-channel feature map of size 320x240 and a 16-channel feature map of size 160x120. As shown in FIG. 4, inside the modular convolution the input feature map 401 passes through an ordinary convolution 402 and a 1x1 convolution 405; the output of 402 is passed through a dilated convolution 403 and added to the output of 405; a pooling operation 404 and a down-sampling convolution then yield one feature map whose size is halved and channel count doubled, and one feature map whose size is reduced by a factor of 4.
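One possible PyTorch reading of the modular convolution of FIG. 4 is sketched below; the kernel sizes, the dilation rate and the exact placement of the pooling and down-sampling branch are assumptions, since the text leaves them open.

```python
# A sketch of the modular convolution block; layer choices marked as assumed.
import torch
import torch.nn as nn

class ModularConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)                   # 402: ordinary conv (assumed 3x3)
        self.dilated = nn.Conv2d(out_ch, out_ch, 3, padding=2, dilation=2)   # 403: dilated conv (assumed rate 2)
        self.conv1x1 = nn.Conv2d(in_ch, out_ch, 1)                           # 405: 1x1 conv
        self.pool = nn.MaxPool2d(2)                                          # 404: halves the spatial size
        self.down = nn.Conv2d(out_ch, out_ch * 2, 3, stride=2, padding=1)    # down-sampling conv (assumed)

    def forward(self, x):
        fused = self.dilated(self.conv(x)) + self.conv1x1(x)  # add the two branches
        half = self.pool(fused)            # feature map at 1/2 the input size
        quarter = self.down(half)          # feature map at 1/4 the input size
        return half, quarter
```

With in_ch=3 and out_ch=8 this reproduces the stated 8-channel 320x240 and 16-channel 160x120 outputs for a 640x480 input.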
Step 2.2, the first attention mechanism module 303 performs feature extraction on the 8-channel feature map with the size and channel count unchanged, and three modular convolution operations 304, 305 and 306 are applied to its output; the final feature map is 40x30x78, on which the second attention mechanism module 307 again performs feature extraction with size and channels unchanged. The attention mechanism module is shown in FIG. 5: the input feature 501 is passed through average pooling 502 and max pooling 503 and then through the shared fully connected layers 504, 505 and 506 to obtain two sets of features 507 and 508; their sum is passed through a Sigmoid 509 to obtain the channel attention weights 510; further average pooling 511 and max pooling 512 followed by concatenation and a Sigmoid 513 yield the spatial attention weights 514; multiplying the input feature 501 by the channel attention weights 510 and the spatial attention weights 514 gives the final result 515.
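The description matches a CBAM-style block, which can be sketched as follows. The channel-reduction ratio, the 7x7 spatial convolution and applying the spatial pooling after the channel weighting are assumptions not fixed by the text.

```python
# A CBAM-style sketch of FIG. 5: channel attention from shared FC layers over
# average- and max-pooled features, then spatial attention from concatenated
# channel-wise mean/max maps.
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.mlp = nn.Sequential(                        # shared FC layers 504-506
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # assumed kernel size

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))               # 502: average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))                # 503: max pooling branch
        ch_w = torch.sigmoid(avg + mx).view(b, c, 1, 1)  # 509/510: channel attention weights
        x_ch = x * ch_w
        sp = torch.cat([x_ch.mean(dim=1, keepdim=True),
                        x_ch.amax(dim=1, keepdim=True)], dim=1)  # 511/512 + concatenation
        sp_w = torch.sigmoid(self.spatial(sp))           # 513/514: spatial attention weights
        return x_ch * sp_w                               # 515: input * channel * spatial weights
```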
Step 2.3, the feature map output by 307 is restored to the 640x480 size by three consecutive semantic embedding modules 308, 309 and 310 and output as the Gaussian heat maps 311. The semantic embedding module is shown in FIG. 6: 601 and 602 are a high-resolution feature and a low-resolution feature respectively; the low-resolution feature 602 is brought to the same resolution as 601 by a 3x3 convolution 604 followed by bilinear interpolation 605, and the output of the 1x1 convolution 603 applied to the high-resolution feature 601 is multiplied with the output of 605 to obtain the final output feature 606.
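A minimal sketch of the semantic embedding module of FIG. 6, with channel counts left as parameters since the actual values are not specified here:

```python
# A sketch of the semantic embedding module: the low-resolution feature is
# upsampled to the high-resolution grid and used to gate the high-resolution
# feature by element-wise multiplication.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticEmbedding(nn.Module):
    def __init__(self, high_ch, low_ch, out_ch):
        super().__init__()
        self.conv1x1 = nn.Conv2d(high_ch, out_ch, 1)             # 603: 1x1 conv on high-res feature
        self.conv3x3 = nn.Conv2d(low_ch, out_ch, 3, padding=1)   # 604: 3x3 conv on low-res feature

    def forward(self, high, low):
        low = F.interpolate(self.conv3x3(low), size=high.shape[2:],
                            mode="bilinear", align_corners=False)  # 605: bilinear upsampling
        return self.conv1x1(high) * low                           # 606: element-wise product
```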
Step 3, non-maximum suppression is performed on the Gaussian heat maps output by the neural network to obtain concrete two-dimensional feature point coordinates 103;
Step 3 is implemented as follows:
Step 3.1, a 3x3 Gaussian-kernel convolution is applied to the Gaussian heat maps so that each pixel carries information from its neighbouring points (see the sketch after step 3.2).
Step 3.2, non-maximum suppression is performed on the resulting Gaussian heat maps: after applying 3x3 max pooling to each heat map, the coordinate of the maximum point in the heat map is selected as the feature point coordinate.
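Steps 3.1 and 3.2 together amount to the following heat map decoding sketch; the Gaussian sigma is an assumption.

```python
# A sketch of steps 3.1-3.2: smooth each heat map with a small Gaussian kernel,
# keep only 3x3 local maxima via max pooling, and read off the peak coordinate.
import torch
import torch.nn.functional as F

def decode_heatmaps(heatmaps, sigma=1.0):
    """heatmaps: (N, H, W) tensor -> (N, 2) integer (x, y) peak coordinates."""
    n, h, w = heatmaps.shape
    # 3x3 Gaussian kernel so each pixel also carries neighbour information
    coords = torch.arange(3, dtype=torch.float32) - 1.0
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = (g[:, None] * g[None, :]) / (g.sum() ** 2)
    smoothed = F.conv2d(heatmaps[:, None], kernel[None, None], padding=1)

    # non-maximum suppression: a pixel survives only if it equals the 3x3 max
    pooled = F.max_pool2d(smoothed, 3, stride=1, padding=1)
    peaks = smoothed * (smoothed == pooled)

    idx = peaks.view(n, -1).argmax(dim=1)
    return torch.stack([idx % w, idx // w], dim=1)   # (x, y) per heat map
```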
Step 4, the 2D-3D feature point correspondences are recovered into the six-degree-of-freedom pose of the spatial target through an improved EPnP algorithm, providing a basis 104 for subsequent grasping work.
Step 4 is implemented as follows:
Step 4.1, each scale's bounding box consists of 8 corner points and 12 edges, and edges that are parallel to the image plane in 3D space theoretically remain parallel after projection onto the 2D image. Therefore the inner product of each pair of edges in the same parallel group is computed at each scale (the inner product of two parallel edges is 0), and the slope of the edge in the pair with the larger value is corrected to reduce the error of the neural network; the adjustment effect is shown at 701.
Step 4.2, line segments in the 3D model and the corresponding line segments in the 2D image share the same proportional relationship, and feature points are expanded through this equal-ratio relation between 2D and 3D segments, reducing the influence of precision loss. 702 shows the two endpoints of a segment before expansion, and 703 the set of points on the segment after expansion.
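A sketch of this expansion along one bounding box edge, taking the patent's equal-ratio assumption at face value (under full perspective projection the ratios hold only approximately); the number of extra points is illustrative.

```python
# Step 4.2 sketch: generate extra 2D-3D correspondences along an edge at
# matching fractional positions t on the 3D edge and its detected 2D segment.
import numpy as np

def expand_edge(p3d_a, p3d_b, p2d_a, p2d_b, n_extra=4):
    """Return (n_extra, 3) 3D points and (n_extra, 2) 2D points."""
    p3d_a, p3d_b = np.asarray(p3d_a, float), np.asarray(p3d_b, float)
    p2d_a, p2d_b = np.asarray(p2d_a, float), np.asarray(p2d_b, float)
    t = np.linspace(0.0, 1.0, n_extra + 2)[1:-1, None]   # interior ratios only
    pts_3d = p3d_a + t * (p3d_b - p3d_a)                  # points on the 3D edge
    pts_2d = p2d_a + t * (p2d_b - p2d_a)                  # same ratios on the 2D segment
    return pts_3d, pts_2d
```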
Step 4.3, the expanded feature point set is randomly sampled; each time n feature points are sampled, the pose is solved with a PnP algorithm, and this process is executed m times. The Euclidean distance between each of the m pose results and the others is computed, and the result with the smallest distance is selected as the final result. All randomly sampled poses are shown at 704, and minimizing the Euclidean distance yields the result 705.
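A sketch of this sampling and selection strategy using OpenCV's EPnP solver; measuring the distance between poses on the concatenated rotation and translation vectors, and the values of n and m, are assumptions.

```python
# Step 4.3 sketch: sample n correspondences m times, solve PnP for each sample,
# and keep the pose whose summed Euclidean distance to the other poses is smallest.
import numpy as np
import cv2

def select_pose(pts_3d, pts_2d, K, n=8, m=50, seed=0):
    """pts_3d: (N, 3) array, pts_2d: (N, 2) array, K: 3x3 intrinsic matrix."""
    rng = np.random.default_rng(seed)
    poses = []
    for _ in range(m):
        idx = rng.choice(len(pts_3d), size=n, replace=False)
        ok, rvec, tvec = cv2.solvePnP(pts_3d[idx].astype(np.float64),
                                      pts_2d[idx].astype(np.float64),
                                      K, None, flags=cv2.SOLVEPNP_EPNP)
        if ok:
            poses.append(np.concatenate([rvec.ravel(), tvec.ravel()]))
    poses = np.stack(poses)
    # pairwise Euclidean distances between pose vectors
    dists = np.linalg.norm(poses[:, None, :] - poses[None, :, :], axis=-1)
    best = dists.sum(axis=1).argmin()            # pose closest to all the others
    rvec, tvec = poses[best, :3], poses[best, 3:]
    return cv2.Rodrigues(rvec.reshape(3, 1))[0], tvec  # rotation matrix R and translation T
```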
This completes one full six-degree-of-freedom pose estimation pass for the spatial target.
The experiments were run on an RTX 3090 GPU under Windows 10, with a software environment of Python 3.7, CUDA 11.1, PyTorch 1.9.1 and cuDNN 8.0.0. The LINEMOD dataset was used: all image samples are stored in jpg format, depth images in png format, model information in ply format and bounding box information in npy format; the network learning rate is 0.0003, and one sample is trained per iteration. The experimental results are shown in Table 1:
Table 1 lists the detection metrics and results. 2D Projection denotes the projection error of the predicted pose on the 2D image; ADD(-S) denotes the proportion of samples whose minimum weighted average point distance is below 10% of the model radius; 5cm5° denotes the proportion of predictions whose error with respect to the ground-truth pose is below 5 cm and 5°:
TABLE 1
Judging from the detection metrics, whether measured by 2D Projection, ADD(-S) or 5cm5°, the method estimates different targets well. In particular, it improves markedly on smaller models (cat, duck, etc.), raising the accuracy on the 5cm5° metric from 40.6 to 76.0, while processing RGB images at up to 15 fps. The method therefore achieves high detection accuracy, breaks the limitation of traditional methods on pose estimation of small models, and maintains the required processing speed.
In summary, aiming at the requirements of augmented reality and collaborative-robot technology for six-degree-of-freedom pose information of spatial targets, the method is proposed and realized based on a deep fully convolutional network combined with the multi-scale bounding box of the target's 3D model. The method is robust, accurate and fast enough for practical processing, and can be applied in scenarios such as collaborative robot grasping and augmented reality.

Claims (1)

1. A spatial target pose estimation method oriented to robot grasping and augmented reality, characterized by comprising the following steps:
Step 1, calibrating the camera to obtain its intrinsic parameters, computing the 3D model of the object to obtain its multi-scale bounding boxes, and mapping the multi-scale bounding boxes onto the 2D image through the camera intrinsic matrix;
Step 2, taking the corner points that make up the multi-scale bounding boxes as feature points, and training a fully convolutional neural network to detect and locate these feature points; the network takes an RGB image as input and outputs Gaussian heat maps of the feature points;
Step 3, performing non-maximum suppression on the Gaussian heat maps output by the neural network to obtain concrete two-dimensional feature point coordinates;
Step 4, recovering the six-degree-of-freedom pose of the spatial target from the 2D-3D feature point correspondences through an improved EPnP algorithm, thereby providing a basis for subsequent grasping work;
Step 1 comprises the following sub-steps:
Step 1.1, obtaining the RGB camera intrinsic parameters through chessboard calibration;
Step 1.2, computing the maximum and minimum values of the object's 3D model along the x, y and z axes in the object coordinate system, thereby obtaining the bounding box at the object's original scale; computing the mid-point between the maximum and minimum on each axis, and multiplying the length on each axis by a coefficient to obtain bounding boxes of the 3D model at different scales;
Step 1.3, projecting the 3D multi-scale bounding boxes of the object onto pictures of different scenes through the camera intrinsic matrix and the pose information of the object, the calculation being:
(u′, v′, z′)ᵀ = R·(u, v, z)ᵀ + T
x=u′×fx÷z′+Cx
y=v′×fy÷z′+Cy
wherein u, v, z denote the x-, y- and z-axis coordinates in the 3D object frame, R denotes the rotation matrix, T denotes the translation vector, fx and fy denote the camera focal lengths along the x and y axes, Cx and Cy denote the camera principal point, all camera parameters being in pixel units, and x, y denote the coordinates in the 2D image;
Step 2 comprises the following sub-steps:
Step 2.1, extracting image features through modular convolutions to obtain feature maps, laying the foundation for subsequent feature point detection;
Step 2.2, performing further feature extraction with an attention mechanism module while keeping the feature map size and channel count unchanged, then applying three consecutive modular convolutions, and extracting features once more with a second attention mechanism module;
Step 2.3, reducing the channel dimension of the resulting feature map through semantic embedding modules and mapping the feature point probability distribution information onto n heat maps, where n is the number of scales multiplied by 8;
Step 3 comprises the following sub-steps:
Step 3.1, applying a 3x3 convolution to the Gaussian heat maps so that each pixel carries information from its neighbouring points;
Step 3.2, performing non-maximum suppression on the resulting Gaussian heat maps, converting the feature point probability distributions into concrete coordinate points;
Step 4 comprises the following sub-steps:
Step 4.1, adjusting the positions of the feature points using the parallel relations among the multi-scale bounding box edges, reducing the error of the neural network;
Step 4.2, exploiting the fact that line segments in the 3D model and the corresponding line segments in the 2D image share the same proportional relationship, and expanding the feature points through this equal-ratio relation between 2D and 3D segments, thereby reducing the influence of precision loss;
Step 4.3, randomly sampling the expanded feature point set, solving the pose with a PnP algorithm each time n feature points are sampled, repeating this process m times, computing the Euclidean distances between each of the m pose results and the others, and selecting the result with the smallest distance as the final result.
CN202210422447.2A 2022-04-21 2022-04-21 Robot grabbing and augmented reality-oriented space target attitude estimation method Active CN114972525B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210422447.2A CN114972525B (en) 2022-04-21 2022-04-21 Robot grabbing and augmented reality-oriented space target attitude estimation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210422447.2A CN114972525B (en) 2022-04-21 2022-04-21 Robot grabbing and augmented reality-oriented space target attitude estimation method

Publications (2)

Publication Number Publication Date
CN114972525A CN114972525A (en) 2022-08-30
CN114972525B true CN114972525B (en) 2024-05-14

Family

ID=82979162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210422447.2A Active CN114972525B (en) 2022-04-21 2022-04-21 Robot grabbing and augmented reality-oriented space target attitude estimation method

Country Status (1)

Country Link
CN (1) CN114972525B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment
CN113858217A (en) * 2021-12-01 2021-12-31 常州唯实智能物联创新中心有限公司 Multi-robot interaction three-dimensional visual pose perception method and system
CN113927597A (en) * 2021-10-21 2022-01-14 燕山大学 Robot connecting piece six-degree-of-freedom pose estimation system based on deep learning
WO2022061673A1 (en) * 2020-09-24 2022-03-31 西门子(中国)有限公司 Calibration method and device for robot

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017203904A1 (en) * 2016-06-15 2018-01-18 Dotty Digital Pty Ltd A system, device, or method for collaborative augmented reality

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment
WO2022061673A1 (en) * 2020-09-24 2022-03-31 西门子(中国)有限公司 Calibration method and device for robot
CN113927597A (en) * 2021-10-21 2022-01-14 燕山大学 Robot connecting piece six-degree-of-freedom pose estimation system based on deep learning
CN113858217A (en) * 2021-12-01 2021-12-31 常州唯实智能物联创新中心有限公司 Multi-robot interaction three-dimensional visual pose perception method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A deep-learning-based optimal grasping pose detection method for robots; 李秀智; 李家豪; 张祥银; 彭小彬; Chinese Journal of Scientific Instrument; 2020-12-31 (No. 05); 111-120 *
3D object recognition and pose estimation based on C-SHOT features in complex scenes; 张凯霖; 张良; Journal of Computer-Aided Design & Computer Graphics; 2017-05-15 (No. 05); 59-66 *

Also Published As

Publication number Publication date
CN114972525A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN111563923B (en) Method for obtaining dense depth map and related device
CN111639663B (en) Multi-sensor data fusion method
CN110853075B (en) Visual tracking positioning method based on dense point cloud and synthetic view
TW202117611A (en) Computer vision training system and method for training computer vision system
CN112435223B (en) Target detection method, device and storage medium
CN112598735A (en) Single-image object pose estimation method fusing three-dimensional model information
CN113379815A (en) Three-dimensional reconstruction method and device based on RGB camera and laser sensor and server
CN112767486A (en) Monocular 6D attitude estimation method and device based on deep convolutional neural network
CN116092178A (en) Gesture recognition and tracking method and system for mobile terminal
KR100362171B1 (en) Apparatus, method and computer readable medium for computing a transform matrix using image feature point matching technique, and apparatus, method and computer readable medium for generating mosaic image using the transform matrix
CN113327295A (en) Robot rapid grabbing method based on cascade full convolution neural network
JP6016242B2 (en) Viewpoint estimation apparatus and classifier learning method thereof
CN114972525B (en) Robot grabbing and augmented reality-oriented space target attitude estimation method
Morales et al. Real-time adaptive obstacle detection based on an image database
Zhang et al. Data association between event streams and intensity frames under diverse baselines
KR101673144B1 (en) Stereoscopic image registration method based on a partial linear method
CN116342698A (en) Industrial part 6D pose estimation method based on incomplete geometric completion
CN116468731A (en) Point cloud semantic segmentation method based on cross-modal Transformer
CN114119999B (en) Iterative 6D pose estimation method and device based on deep learning
CN115984592A (en) Point-line fusion feature matching method based on SuperPoint + SuperGlue
CN115496859A (en) Three-dimensional scene motion trend estimation method based on scattered point cloud cross attention learning
CN116152334A (en) Image processing method and related equipment
CN114387351A (en) Monocular vision calibration method and computer readable storage medium
CN114972451A (en) Rotation-invariant SuperGlue matching-based remote sensing image registration method
CN113012298A (en) Curved MARK three-dimensional registration augmented reality method based on region detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant