CN116342698A - Industrial part 6D pose estimation method based on incomplete geometric completion - Google Patents

Industrial part 6D pose estimation method based on incomplete geometric completion

Info

Publication number
CN116342698A
Authority
CN
China
Prior art keywords
rgb
point cloud
feature
original
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310064596.0A
Other languages
Chinese (zh)
Inventor
刘达新
王祺德
刘振宇
许嘉通
谭建荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202310064596.0A
Publication of CN116342698A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T7/10 Segmentation; Edge detection
    • G06T7/12 Edge-based segmentation
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30164 Workpiece; Machine component
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an industrial part 6D pose estimation method based on incomplete geometric completion. An original RGB-D image containing a target industrial part is acquired, comprising an RGB image and an original point cloud converted from the depth map, and the target is segmented and localized. A point cloud completion network recovers the complete geometric structure from the original target point cloud containing an incomplete geometric structure, yielding a completed target point cloud. Features are extracted from the ROI cropped image, the original target point cloud and the completed target point cloud, respectively, and fused by multi-scale concatenation to obtain enhanced RGB-D features. The 6D pose of the target industrial part is then regressed via confidence scoring. Aiming at the geometric incompleteness of depth information caused by the various complicating factors that industrial parts face in real scenes, the invention achieves accurate and robust 6D pose estimation by completing the incomplete geometry and obtaining enhanced RGB-D features, and has good engineering practical value.

Description

Industrial part 6D pose estimation method based on incomplete geometric completion
Technical Field
The invention relates to a pose estimation method in the technical fields of computer vision and robotics, in particular to a 6D pose estimation method for industrial parts based on incomplete geometric completion.
Background
Estimating the 6-degree-of-freedom (6D) pose of a target industrial part in three-dimensional space with a vision sensor, comprising a three-degree-of-freedom translation vector and a three-degree-of-freedom rotation matrix, is a fundamental computer vision technique. It is widely used in downstream robot operation tasks in actual industrial application scenes, such as robot assembly, obstacle avoidance planning and robot digital twins.
Traditional methods rely on various hand-crafted image features and solve the pose from feature correspondences or voting, with very limited accuracy, robustness and flexibility. With the rapid development of deep learning and RGB-D cameras, data-driven RGB-D based methods have gradually achieved better pose estimation performance and are attracting attention from industry. Because common industrial part targets often lack effective surface texture, such methods perform better than RGB-only methods, since the depth map in the RGB-D image provides complementary geometric information.
However, in practical applications the three-dimensional data obtained from depth sensors is often incomplete and noisy, and occlusion may further exacerbate this problem, especially for the reflective metal surfaces of industrial parts. In this case, some regions visible in the image may be missing from the corresponding depth map, and the incomplete three-dimensional data cannot provide effective geometric information to match the two-dimensional image for accurate prediction, degrading the performance of industrial part 6D pose estimation.
Disclosure of Invention
To solve the above problems, the invention provides an industrial part 6D pose estimation method based on incomplete geometric completion, which uses deep learning to estimate the 6D pose of industrial parts from RGB-D images under incomplete geometric information and effectively improves the accuracy and robustness of pose estimation.
The specific technical scheme of the invention is as follows:
s1: shooting an original RGB-D image containing a target industrial part by using an RGB-D camera, wherein the original RGB-D image comprises an RGB image and a depth map, and converting the depth map into a corresponding original point cloud;
s2: dividing and positioning a target industrial part, dividing an instance Mask and an ROI clipping image of the target industrial part from an RGB image by adopting a Mask R-CNN method, and dividing an original target point cloud corresponding to the instance Mask of the target industrial part from the original point cloud according to the alignment relation between pixels of the RGB image and pixels of a depth image;
s3: recovering the complete geometric structure of the target industrial part from the original target point cloud containing the incomplete geometric structure through a point cloud complement network to obtain a corresponding complement target point cloud;
s4: RGB features are extracted from the ROI clipping image obtained in the step S2 respectively, original geometric features are extracted from the original target point cloud obtained in the step S2, and multi-scale geometric features are extracted from the complement target point cloud obtained in the step S3;
s5: performing multi-scale splicing and fusion on all the features obtained in the step S4 to obtain enhanced RGB-D features so as to enhance the original information of the RGB-D image;
s6: and (3) carrying out regression evaluation on the accurate 6D pose of the target industrial part from the enhanced RGB-D characteristics in a confidence scoring mode through supervision training.
In step S1, the target industrial part region in the original point cloud may be incomplete and contain substantial noise due to factors such as the reflective metal surface of the target industrial part, partial occlusion, and depth sensor noise of the RGB-D camera.
In step S3, the point cloud completion network mainly comprises an encoder and a decoder and achieves point cloud completion by predicting a mapping from the incomplete geometric space to the complete geometric space, specifically:
The encoder comprises a two-layer stacked PointNet network, and the decoder comprises an MLP network;
First, the encoder takes the original target point cloud containing the incomplete geometric structure obtained in step S2 as input and extracts an intermediate feature vector from it using the two-layer stacked PointNet network;
Then, the decoder takes the intermediate feature vector as input and, combined with the MLP network, generates a dense completed point cloud in stages, from coarse to fine;
Finally, the dense completed point cloud is adjusted by a shape protection network layer and the final completed point cloud is output, so that local shape distortion is avoided.
The adjustment process of the shape protection network layer is defined by three equations (reproduced only as figures in the original publication), whose symbols are defined as follows: p^(i) is a point in the dense completed point cloud; two auxiliary points are the points of the original target point cloud nearest to p^(i); a virtual target point is used to guide the adjustment direction of p^(i); d is the distance between the virtual target point and p^(i); the output is the finally adjusted completed target point cloud point; σ is a learnable adjustment parameter, e denotes the natural base, ‖·‖₂ denotes the 2-norm, and i indexes an arbitrary point.
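Since the three equations themselves appear only as figures in the published text, the following LaTeX reconstruction is purely illustrative: it assumes the virtual target point is the midpoint of the two nearest original points and that the adjustment magnitude decays with a Gaussian weight, and the symbol names q_1^{(i)}, q_2^{(i)}, \bar{q}^{(i)}, \hat{p}^{(i)} are placeholders rather than the patent's notation:

    \bar{q}^{(i)} = \tfrac{1}{2}\left(q_1^{(i)} + q_2^{(i)}\right), \qquad
    d = \left\| \bar{q}^{(i)} - p^{(i)} \right\|_2, \qquad
    \hat{p}^{(i)} = p^{(i)} + e^{-d^{2}/\sigma^{2}} \left(\bar{q}^{(i)} - p^{(i)}\right)

Under this reading, points that already lie close to the original partial scan are barely moved, while points far from it are pulled toward the local surface, which is consistent with the stated goal of avoiding local shape distortion.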
Step S4 specifically comprises the following steps:
S41: Extract RGB features F_rgb ∈ ℝ^(H×W×d_rgb) from the ROI cropped image using a CNN network, where H and W are the height and width of the ROI cropped image, d_rgb is the dimension of the RGB features, and ℝ denotes the set of real numbers;
S42: Extract original geometric features F_p ∈ ℝ^(N×d_p) from the original target point cloud using an MLP network, where N is the number of points in the original target point cloud and d_p is the dimension of the original geometric features;
S43: The multi-scale geometric features extracted from the completed target point cloud comprise region-level features F_r and a global feature F_g:
The region-level features F_r ∈ ℝ^(N×d_r) are obtained by aggregating, through an MLP network and average pooling, the completed target point cloud points lying within a neighborhood of each original target point cloud point, where d_r is the dimension of the region-level features;
The global feature F_g ∈ ℝ^(1×d_g) is extracted from the completed target point cloud by a PointNet++ network, where d_g is the dimension of the global feature.
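As a concrete illustration of S43, the following minimal PyTorch sketch aggregates region-level features by average-pooling, for each original point, the per-point MLP features of its k nearest completed points. The use of k-nearest-neighbor grouping and the value of k are assumptions; the patent only specifies an MLP plus average pooling over completed points near each original point.

    import torch

    def region_level_features(orig_pts, comp_pts, comp_feats, k=16):
        # orig_pts: (N, 3) original target point cloud
        # comp_pts: (M, 3) completed target point cloud
        # comp_feats: (M, d_r) per-point features from a shared MLP
        dists = torch.cdist(orig_pts, comp_pts)            # (N, M) pairwise distances
        idx = dists.topk(k, largest=False).indices         # (N, k) indices of nearest completed points
        return comp_feats[idx].mean(dim=1)                 # (N, d_r) region-level features F_r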
Step S5 specifically comprises the following steps:
S51: According to the alignment between the pixels of the RGB image and the pixels of the depth map, downsample the RGB features F_rgb and concatenate them with the original geometric features F_p at the corresponding positions;
S52: Concatenate the original geometric features F_p with the corresponding region-level features F_r; since the original geometric features F_p and the region-level features F_r correspond one-to-one, the two are concatenated position-wise.
S53: Copy the global feature F_g N-1 times and fuse the copies with the global feature F_g itself, expanding it into an expanded global feature, then concatenate the expanded global feature with the region-level features F_r;
S54: Through the concatenation operations of steps S51-S53, the multi-scale concatenation and fusion of the RGB features F_rgb, the original geometric features F_p, the region-level features F_r and the global feature F_g is achieved, yielding fused initial RGB-D features F_rgb-d ∈ ℝ^(N×d_f) with d_f = d_rgb + d_p + d_r + d_g, where d_f is the dimension of the initial RGB-D features, d_rgb is the dimension of the RGB features, d_p is the dimension of the original geometric features, d_r is the dimension of the region-level features, and d_g is the dimension of the global feature;
Finally, the initial RGB-D features F_rgb-d are enhanced by an MLP network to obtain the final enhanced RGB-D features F'_rgb-d ∈ ℝ^(N×d'_f), where d'_f is the dimension of the enhanced RGB-D features.
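A minimal sketch of this multi-scale concatenation, assuming F_rgb has already been sampled at the N pixels aligned with the original target point cloud and that enhance_mlp is a stand-in for the MLP enhancement network:

    import torch
    import torch.nn as nn

    def fuse_rgbd_features(f_rgb, f_p, f_r, f_g, enhance_mlp):
        # f_rgb: (N, d_rgb), f_p: (N, d_p), f_r: (N, d_r), f_g: (1, d_g)
        f_g_exp = f_g.expand(f_p.size(0), -1)                   # replicate global feature to all N points
        f_init = torch.cat([f_rgb, f_p, f_r, f_g_exp], dim=1)   # (N, d_rgb + d_p + d_r + d_g)
        return enhance_mlp(f_init)                              # (N, d'_f) enhanced RGB-D features

    # Example with the dimensions used in the embodiment (d_rgb=128, d_p=128, d_r=256, d_g=512, d'_f=128):
    # enhance_mlp = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 128))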
In step S6, N 6D poses are predicted from the enhanced RGB-D features by regression, the confidence of each predicted pose is computed, and the predicted pose with the highest confidence is taken as the final 6D pose estimation result for the target industrial part.
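As an illustration of this confidence-based selection (the head architecture itself is not detailed in the patent, which only specifies per-point pose regression with confidence scores), a minimal sketch:

    import torch

    def select_best_pose(rotations, translations, confidences):
        # rotations: (N, 3, 3), translations: (N, 3), confidences: (N,)
        # One pose hypothesis is regressed from each of the N enhanced per-point features;
        # the hypothesis with the highest confidence score is the final estimate.
        best = torch.argmax(confidences)
        return rotations[best], translations[best]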
The invention has the beneficial effects that:
(1) The invention achieves accurate 6D pose estimation of industrial parts, benefiting from the supplementary geometric information provided by the RGB-D image, and overcomes the degraded pose estimation performance of traditional methods on weakly textured industrial parts;
(2) By completing the incomplete geometric information in the original target point cloud, the invention solves the geometric incompleteness of depth information caused by complex factors such as sensor noise, metal surface reflection and occlusion, further improving the robustness of industrial part pose estimation in actual industrial application scenes;
(3) The proposed pose estimation method can be directly deployed in various downstream robot application tasks, effectively improving the flexibility and intelligence of robot systems.
In summary, the proposed industrial part pose estimation method addresses the low-texture characteristics of industrial parts and the geometric incompleteness of depth sensor information caused by the various complex factors of actual industrial scenes; it achieves accurate and robust 6D pose estimation by completing the incomplete geometry and obtaining enhanced RGB-D features, and has good engineering practical value.
Drawings
FIG. 1 is a flow chart of 6D pose estimation of an industrial part based on incomplete geometric completion in the present invention.
FIG. 2 is a schematic diagram of a target industrial part for method testing in an embodiment of the invention.
Fig. 3 is a schematic diagram of an adjustment process of the complement point cloud in the present invention.
FIG. 4 is a schematic diagram of a fusion process for enhancing RGB-D characteristics in the present invention.
Fig. 5 is a visual result diagram of pose estimation in an embodiment of the present invention.
FIG. 6 is a visual effect of incomplete geometric completion of a target industrial part in an embodiment of the invention.
Detailed Description
The invention is further illustrated by a self-built industrial part pose estimation dataset:
In this embodiment, the method of the invention was trained and tested on a self-built industrial part dataset. The self-built industrial part dataset includes about 30,000 data samples in total, covering 8 different industrial parts (see FIG. 2) and a variety of challenging conditions such as occlusion, illumination variation and depth loss. Each data sample consists of an original RGB image, a depth map, and the mask, category and pose label of each target industrial part in the image. The 6D pose estimation flow for industrial parts is shown in FIG. 1. For this dataset, the proposed pose estimation method comprises the following specific steps:
s1: and shooting an original RGB-D image containing the target industrial part by using an RGB-D camera, wherein the original RGB-D image comprises an RGB image and a depth map, and converting the depth map into a corresponding original point cloud.
In the self-built industrial part dataset of the present embodiment, the target industrial part area in the original point cloud may be incomplete and contain a lot of noise due to the reflective metal surface, occlusion, and depth sensor noise of the RGB-D camera, etc. that may be present in the target industrial part.
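The patent does not spell out the depth-to-point-cloud conversion; a standard pinhole back-projection sketch, assuming known camera intrinsics fx, fy, cx, cy and a millimetre depth scale, would look like this:

    import numpy as np

    def depth_to_point_cloud(depth, fx, fy, cx, cy, depth_scale=1000.0):
        # depth: (H, W) depth map in sensor units; returns an organized (H, W, 3) XYZ cloud in metres.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth.astype(np.float32) / depth_scale      # zero where depth is missing
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.stack([x, y, z], axis=-1)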
S2: and (3) dividing and positioning the target industrial part, dividing an instance Mask and an ROI clipping image of the target industrial part from the RGB image by adopting a Mask R-CNN method, and dividing an original target point cloud corresponding to the instance Mask of the target industrial part from the original point cloud according to the alignment relation between RGB pixels and depth pixels.
S3: and recovering the complete geometric structure of the target industrial part from the original target point cloud containing the incomplete geometric structure through a point cloud complement network to obtain a corresponding complement target point cloud.
The point cloud completion network consists of an encoder-decoder structure and achieves point cloud completion by predicting a mapping from the incomplete geometric space to the complete geometric space. Specifically, the encoder first takes the original target point cloud containing the incomplete geometry from S2 as input and extracts an intermediate feature vector from it using a two-layer stacked PointNet network. The decoder then takes this intermediate feature vector as input and, combined with the MLP network, generates a dense completed point cloud in stages, from coarse to fine (a minimal sketch of such an encoder-decoder is given after the adjustment notation below). Finally, to avoid local shape distortion, the dense completed point cloud is adjusted by a shape protection network layer (see FIG. 3) to obtain the final completed point cloud; the adjustment process is as follows:
The adjustment is defined by three equations (reproduced only as figures in the original publication), with the same notation as above: p^(i) is a point in the dense completed point cloud; two auxiliary points are the points of the original target point cloud nearest to p^(i); a virtual target point guides the adjustment direction of p^(i); d is the distance between the virtual target point and p^(i); the output is the finally adjusted completed target point cloud point; σ is a learnable adjustment parameter, e denotes the natural base, ‖·‖₂ denotes the 2-norm, and i indexes an arbitrary point.
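A minimal PyTorch sketch of such a coarse-to-fine completion network, assuming a single simplified PointNet-style encoder block (the patent uses two stacked PointNet stages) and an MLP decoder that predicts a coarse point set and then refines it; all layer sizes are illustrative assumptions:

    import torch
    import torch.nn as nn

    class CompletionNet(nn.Module):
        def __init__(self, feat_dim=1024, n_coarse=256, up_ratio=8):
            super().__init__()
            # PointNet-style shared MLP encoder followed by a global max pool
            self.encoder = nn.Sequential(nn.Conv1d(3, 128, 1), nn.ReLU(),
                                         nn.Conv1d(128, feat_dim, 1))
            # MLP decoder: coarse point set, then per-coarse-point refinement offsets
            self.coarse_head = nn.Sequential(nn.Linear(feat_dim, 1024), nn.ReLU(),
                                             nn.Linear(1024, n_coarse * 3))
            self.refine_head = nn.Sequential(nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                                             nn.Linear(256, up_ratio * 3))
            self.n_coarse, self.up_ratio = n_coarse, up_ratio

        def forward(self, pts):                                         # pts: (B, N, 3) incomplete cloud
            feat = self.encoder(pts.transpose(1, 2)).max(dim=2).values  # (B, feat_dim) intermediate feature vector
            coarse = self.coarse_head(feat).view(-1, self.n_coarse, 3)  # (B, n_coarse, 3) coarse completion
            ctx = feat.unsqueeze(1).expand(-1, self.n_coarse, -1)
            offs = self.refine_head(torch.cat([ctx, coarse], dim=2))    # (B, n_coarse, up_ratio * 3)
            dense = coarse.unsqueeze(2) + offs.view(-1, self.n_coarse, self.up_ratio, 3)
            return coarse, dense.reshape(-1, self.n_coarse * self.up_ratio, 3)  # dense completed cloud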
In this embodiment, depth point clouds of incomplete target industrial parts are pre-generated from different viewpoints, and random occlusion and noise are added to them, providing synthetic training data for the point cloud completion network.
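One way such occluded, noisy partial clouds could be synthesized from a complete model cloud; the drop-a-local-region strategy and the parameter values below are assumptions, not taken from the patent:

    import numpy as np

    def synthesize_incomplete_cloud(full_pts, drop_ratio=0.3, noise_std=0.002):
        # full_pts: (n, 3) complete model point cloud.
        # Remove a random local region (simulating occlusion / depth loss) and add Gaussian sensor noise.
        seed = full_pts[np.random.randint(full_pts.shape[0])]
        dists = np.linalg.norm(full_pts - seed, axis=1)
        keep = dists > np.quantile(dists, drop_ratio)        # drop the fraction of points closest to the seed
        return full_pts[keep] + np.random.normal(0.0, noise_std, (keep.sum(), 3))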
S4: Extract RGB features from the ROI cropped image obtained in S2, original geometric features from the original target point cloud obtained in S2, and multi-scale geometric features from the completed target point cloud obtained in S3.
Specifically, RGB features F_rgb ∈ ℝ^(H×W×d_rgb) are extracted from the ROI cropped image using a CNN network, where H and W are the height and width of the ROI cropped image, d_rgb is the dimension of the RGB features, and ℝ denotes the set of real numbers. Original geometric features F_p ∈ ℝ^(N×d_p) are extracted from the original target point cloud using an MLP network, where N is the number of points in the original target point cloud and d_p is the dimension of the original geometric features. The multi-scale geometric features extracted from the completed target point cloud comprise region-level features and a global feature: the region-level features F_r ∈ ℝ^(N×d_r) are obtained by aggregating, through an MLP network and average pooling, the completed target point cloud points within a neighborhood of each original target point cloud point, where d_r is the dimension of the region-level features; the global feature F_g ∈ ℝ^(1×d_g) is extracted from the completed target point cloud by a PointNet++ network, where d_g is the dimension of the global feature.
In this embodiment, d_rgb = 128, d_p = 128, d_r = 256 and d_g = 512.
S5: and (3) performing multi-scale splicing and fusion on all the features obtained in the step (S4) to enhance the original information of the RGB-D image and obtain enhanced RGB-D features.
As shown in fig. 4, first, according to the alignment relationship between RGB pixels and depth pixels, the pixel pairsRGB feature F rgb Downsampling is performed and then the original geometric feature F is obtained p And splicing at the corresponding positions. Due to the original geometrical features F p And region level feature F r Is in one-to-one correspondence, so that the two are spliced correspondingly. Then, the global feature F g Copying for N-1 times and then with F g Fusing the two to make the global feature F g Expanded to
Figure SMS_29
And associate it with regional level feature F r And (5) splicing. Through the splicing operation, RGB feature F is realized rgb Original geometric feature F p Regional level feature F r Global feature F g Is fused by multi-scale splicing to obtain the initial RGB-D characteristic after fusion>
Figure SMS_30
Wherein d is f =d rgb +d p +d r +d g Is the dimension of the initial RGB-D feature. Finally, the initial RGB-D characteristic is enhanced through an MLP network to obtain the final enhanced RGB-D characteristic>
Figure SMS_31
Wherein d' f To enhance the dimension of the RGB-D feature.
In this example, d 'is taken' f =128。
S6: and (3) regressing the accurate 6D pose of the target industrial part from the enhanced RGB-D characteristics in a confidence scoring mode through supervision training. Firstly, predicting N6D poses from the enhanced RGB-D features in a regression way, solving the confidence coefficient of each predicted pose, and taking the predicted pose with the highest confidence coefficient score as the final 6D pose estimation result of the target industrial part.
In this embodiment, 70% of the data from the self-built industrial part dataset was used for model training, carried out on an NVIDIA A100 graphics processor with the PyTorch framework.
After model training was completed, the remaining 30% of the data from the self-built industrial part dataset was used for testing.
The quantitative test results are shown in Table 1. Accuracy is measured by the proportion of estimates whose average model point distance is no greater than 0.02 m (ADD ≤ 0.02 m). The average pose estimation success rate of the method exceeds 90%.
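For reference, a standard computation of the ADD metric used here (the 0.02 m threshold follows the text; the model_pts argument is the set of 3D model points of the part):

    import numpy as np

    def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
        # Average distance between model points transformed by the ground-truth and the predicted pose.
        gt = model_pts @ R_gt.T + t_gt
        pred = model_pts @ R_pred.T + t_pred
        return np.linalg.norm(gt - pred, axis=1).mean()

    # A prediction counts as successful when add_metric(...) <= 0.02 (metres).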
TABLE 1 Success rate of pose estimation of the inventive method (%)

    Target part     Part 1  Part 2  Part 3  Part 4  Part 5  Part 6  Part 7  Part 8  Average
    Success rate     98.1    93.0    84.8    97.6    97.4    78.9    98.7    99.9    93.6
The qualitative test results are shown in FIG. 5. The target industrial part is projected onto the image plane using the estimated pose, giving an approximate visual measure of the accuracy of the estimation result. The method successfully estimates the 6D pose of the target industrial parts, further verifying its effectiveness. In addition, some incomplete geometric completion results are shown in FIG. 6.
The above embodiments do not limit the present invention; they are merely intended to help understand the method and core idea of the invention. It should be noted that those skilled in the art may make various improvements and modifications to the present application without departing from the principles of the invention, and such improvements and modifications fall within the scope of the claims of the present application. Matters not described in detail in this specification belong to the prior art known to those skilled in the art.

Claims (7)

1. An industrial part 6D pose estimation method based on incomplete geometric completion, characterized by comprising the following steps:
S1: capturing an original RGB-D image containing a target industrial part with an RGB-D camera, the original RGB-D image comprising an RGB image and a depth map, and converting the depth map into a corresponding original point cloud;
S2: segmenting and localizing the target industrial part: segmenting an instance mask and an ROI cropped image of the target industrial part from the RGB image using a Mask R-CNN method, and segmenting, from the original point cloud, the original target point cloud corresponding to the instance mask of the target industrial part according to the alignment between the pixels of the RGB image and the pixels of the depth map;
S3: recovering the complete geometric structure of the target industrial part from the original target point cloud containing an incomplete geometric structure through a point cloud completion network to obtain a corresponding completed target point cloud;
S4: extracting RGB features from the ROI cropped image obtained in step S2, original geometric features from the original target point cloud obtained in step S2, and multi-scale geometric features from the completed target point cloud obtained in step S3;
S5: fusing all the features obtained in step S4 by multi-scale concatenation to obtain enhanced RGB-D features that enhance the original information of the RGB-D image;
S6: through supervised training, regressing the 6D pose of the target industrial part from the enhanced RGB-D features via confidence scoring.
2. The method for estimating 6D pose of industrial part based on incomplete geometric completion according to claim 1, wherein in step S1, the target industrial part area in the original point cloud may be incomplete and contain noise.
3. The industrial part 6D pose estimation method based on incomplete geometric completion according to claim 1, characterized in that in step S3, the point cloud completion network mainly comprises an encoder and a decoder and achieves point cloud completion by predicting a mapping from the incomplete geometric space to the complete geometric space, specifically:
the encoder comprises a two-layer stacked PointNet network, and the decoder comprises an MLP network;
first, the encoder takes the original target point cloud containing the incomplete geometric structure obtained in step S2 as input and extracts an intermediate feature vector from it using the two-layer stacked PointNet network;
then, the decoder takes the intermediate feature vector as input and, combined with the MLP network, generates a dense completed point cloud in stages, from coarse to fine;
finally, the dense completed point cloud is adjusted by a shape protection network layer and the final completed point cloud is output.
4. The industrial part 6D pose estimation method based on incomplete geometric completion according to claim 3, characterized in that:
the adjustment process of the shape protection network layer is defined by three equations (reproduced only as figures in the original publication), whose symbols are defined as follows: p^(i) is a point in the dense completed point cloud; two auxiliary points are the points of the original target point cloud nearest to p^(i); a virtual target point is used to guide the adjustment direction of p^(i); d is the distance between the virtual target point and p^(i); the output is the finally adjusted completed target point cloud point; σ is a learnable adjustment parameter, e denotes the natural base, ‖·‖₂ denotes the 2-norm, and i indexes an arbitrary point.
5. The method for estimating the 6D pose of industrial parts based on incomplete geometric completion according to claim 1, characterized in that step S4 specifically comprises the following steps:
S41: extracting RGB features F_rgb ∈ ℝ^(H×W×d_rgb) from the ROI cropped image using a CNN network, where H and W are the height and width of the ROI cropped image, d_rgb is the dimension of the RGB features, and ℝ denotes the set of real numbers;
S42: extracting original geometric features F_p ∈ ℝ^(N×d_p) from the original target point cloud using an MLP network, where N is the number of points in the original target point cloud and d_p is the dimension of the original geometric features;
S43: the multi-scale geometric features extracted from the completed target point cloud comprise region-level features F_r and a global feature F_g:
the region-level features F_r ∈ ℝ^(N×d_r) are obtained by aggregating, through an MLP network and average pooling, the completed target point cloud points within a neighborhood of each original target point cloud point, where d_r is the dimension of the region-level features;
the global feature F_g ∈ ℝ^(1×d_g) is extracted from the completed target point cloud by a PointNet++ network, where d_g is the dimension of the global feature.
6. The method for estimating the 6D pose of industrial parts based on incomplete geometric completion according to claim 1, characterized in that step S5 specifically comprises the following steps:
S51: according to the alignment between the pixels of the RGB image and the pixels of the depth map, downsampling the RGB features F_rgb and concatenating them with the original geometric features F_p at the corresponding positions;
S52: concatenating the original geometric features F_p with the corresponding region-level features F_r;
S53: copying the global feature F_g N-1 times, fusing the copies with the global feature F_g itself to expand it into an expanded global feature, and concatenating the expanded global feature with the region-level features F_r;
S54: through the concatenation operations of steps S51-S53, achieving the multi-scale concatenation and fusion of the RGB features F_rgb, the original geometric features F_p, the region-level features F_r and the global feature F_g to obtain fused initial RGB-D features F_rgb-d ∈ ℝ^(N×d_f) with d_f = d_rgb + d_p + d_r + d_g, where d_f is the dimension of the initial RGB-D features, d_rgb is the dimension of the RGB features, d_p is the dimension of the original geometric features, d_r is the dimension of the region-level features, and d_g is the dimension of the global feature;
finally, the initial RGB-D features F_rgb-d are enhanced by an MLP network to obtain the final enhanced RGB-D features F'_rgb-d ∈ ℝ^(N×d'_f), where d'_f is the dimension of the enhanced RGB-D features.
7. The method for estimating the 6D pose of industrial parts based on incomplete geometric completion according to claim 1, characterized in that in step S6, N 6D poses are predicted by regression from the enhanced RGB-D features, the confidence of each predicted pose is computed, and the predicted pose with the highest confidence is taken as the final 6D pose estimation result of the target industrial part.
CN202310064596.0A 2023-02-06 2023-02-06 Industrial part 6D pose estimation method based on incomplete geometric completion Pending CN116342698A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310064596.0A CN116342698A (en) 2023-02-06 2023-02-06 Industrial part 6D pose estimation method based on incomplete geometric completion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310064596.0A CN116342698A (en) 2023-02-06 2023-02-06 Industrial part 6D pose estimation method based on incomplete geometric completion

Publications (1)

Publication Number Publication Date
CN116342698A true CN116342698A (en) 2023-06-27

Family

ID=86890521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310064596.0A Pending CN116342698A (en) 2023-02-06 2023-02-06 Industrial part 6D pose estimation method based on incomplete geometric completion

Country Status (1)

Country Link
CN (1) CN116342698A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455837A (en) * 2023-09-22 2024-01-26 苏州诺克汽车工程装备有限公司 High-reflection automobile part identification feeding method and system based on deep learning


Similar Documents

Publication Publication Date Title
CN112270249B (en) Target pose estimation method integrating RGB-D visual characteristics
US11763485B1 (en) Deep learning based robot target recognition and motion detection method, storage medium and apparatus
CN109784333B (en) Three-dimensional target detection method and system based on point cloud weighted channel characteristics
US11763433B2 (en) Depth image generation method and device
CN110599537A (en) Mask R-CNN-based unmanned aerial vehicle image building area calculation method and system
CN108648194B (en) Three-dimensional target identification segmentation and pose measurement method and device based on CAD model
CN112750133A (en) Computer vision training system and method for training a computer vision system
JP7439153B2 (en) Lifted semantic graph embedding for omnidirectional location recognition
CN111768415A (en) Image instance segmentation method without quantization pooling
CN111429533A (en) Camera lens distortion parameter estimation device and method
CN112258658A (en) Augmented reality visualization method based on depth camera and application
CN113963117B (en) Multi-view three-dimensional reconstruction method and device based on variable convolution depth network
CN116342698A (en) Industrial part 6D pose estimation method based on incomplete geometric completion
CN117218343A (en) Semantic component attitude estimation method based on deep learning
CN114782417A (en) Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation
CN116310128A (en) Dynamic environment monocular multi-object SLAM method based on instance segmentation and three-dimensional reconstruction
CN115546273A (en) Scene structure depth estimation method for indoor fisheye image
Yuan et al. Presim: A 3d photo-realistic environment simulator for visual ai
CN111179271B (en) Object angle information labeling method based on retrieval matching and electronic equipment
CN116958434A (en) Multi-view three-dimensional reconstruction method, measurement method and system
CN111709269B (en) Human hand segmentation method and device based on two-dimensional joint information in depth image
CN111160372B (en) Large target identification method based on high-speed convolutional neural network
CN116363168A (en) Remote sensing video target tracking method and system based on super-resolution network
CN107392936B (en) Target tracking method based on meanshift
CN115410014A (en) Self-supervision characteristic point matching method of fisheye image and storage medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination