CN112785526B - Three-dimensional point cloud restoration method for graphic processing - Google Patents
- Publication number: CN112785526B
- Application number: CN202110116229.1A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06T5/77: Retouching; Inpainting; Scratch removal
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06T19/20: Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06T2207/10028: Range image; Depth image; 3D point clouds
- G06T2207/20081: Training; Learning
- G06T2207/20084: Artificial neural networks [ANN]
Abstract
The invention provides a three-dimensional point cloud restoration method for graphic processing, which comprises the following steps: step 1, collecting data from an input point cloud model data set; step 2, combining a Self-Attention mechanism-based method with a multi-layer perceptron MLP to obtain a long-distance dependency relationship extraction network, using this network to map the input point cloud into a global feature vector, and then using a decoder of a topological root tree structure to generate the missing part of the incomplete point cloud; and step 3, combining the incomplete point cloud and the generated missing part point cloud to obtain the final repaired complete point cloud model.
Description
Technical Field
The invention belongs to the field of computer three-dimensional model processing and computer graphics, and particularly relates to a three-dimensional point cloud repairing method for graphics processing.
Background
In recent years, large amounts of three-dimensional data have been acquired directly from the real world, using LiDAR scanners or depth sensors such as the Kinect, stereo cameras, and the like.
However, the 3D data obtained with these instruments is often incomplete, mainly for the following reasons: the scanning view angle of the scanner is limited, non-target objects cause occlusion, and light refraction and reflection interfere. As a result, geometric and semantic information of the target object is often lost, so studying how to repair incomplete 3D models is very necessary for subsequent applications. In addition, 3D models appear in many representations, such as point clouds, voxels, patches and distance fields. Using point clouds to represent and process 3D data has received increasing attention, because point clouds have lower storage cost than other representations (e.g., 3D voxel grids) while representing 3D models more finely. The advent of Document 1 (C. R. Qi, H. Su, K. Mo, and L. J. Guibas. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. 2018) made it possible to process unordered point sets directly, which has greatly promoted the development of deep learning architectures for point clouds, as well as other related research such as 3D scene reconstruction, 3D model segmentation, and 3D model repair.
Learning-based 3D point cloud model repair works, e.g. Document 2 (W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert. PCN: Point Completion Network. International Conference on 3D Vision 2018), Document 3 (Z. Huang, Y. Yu, J. Xu, F. Ni, and X. Le. PF-Net: Point Fractal Network for 3D Point Cloud Completion. Conference on Computer Vision and Pattern Recognition 2020), and Document 4 (L. P. Tchapmi, V. Kosaraju, H. Rezatofighi, I. Reid, and S. Savarese. TopNet: Structural Point Cloud Decoder. Conference on Computer Vision and Pattern Recognition 2019), typically use a multi-layer perceptron (MLP) as the feature extractor: it takes an incomplete point cloud as input, maps each of its points into feature vectors of different dimensions, and max-pools over the final per-point feature vectors to obtain a global feature (a minimal sketch of this shared-MLP pipeline is given after the list of reasons below). Meanwhile, since there is currently no widely accepted way to define a local neighborhood of a point cloud, it is difficult to extract features through convolution as on a 2D image. These methods therefore rely heavily on multiple fully connected layers of similar architecture to capture the features of the input model and the dependencies between different points of the input point cloud. In addition, PF-Net shows that the lower and middle layers of an MLP mostly extract local information, and this local information cannot be exploited to form global features simply by passing it to the higher layers through a shared fully connected layer. This means such methods cannot efficiently extract enough long-range dependency information and embed it into the final global feature vector. Another problem is that, even if limited long-range dependency information can be captured, several fully connected layers are usually needed to learn it. This may be detrimental to the efficient capture of long-range dependencies, mainly for several reasons:
(1) A more targeted model may be needed to represent these long-range dependencies;
(2) It may be difficult for optimization algorithms to find parameter values that coordinate multiple layers with one another so as to capture these long-range dependencies;
(3) When these parameter settings are applied to a new model that the network has not seen, they may prove statistically fragile.
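As an illustration of the shared-MLP pipeline described above, a minimal NumPy sketch follows. It is not taken from any of the cited works; the layer sizes (128, 256) and the random weights are stand-in assumptions.

```python
# Minimal sketch of shared-MLP feature extraction with max pooling (PointNet-style).
import numpy as np

def shared_mlp_global_feature(points, dims=(128, 256)):
    """points: (N, 3) array; per-point shared MLP followed by max pooling."""
    rng = np.random.default_rng(0)
    x = points
    for d in dims:
        w = rng.standard_normal((x.shape[1], d)) * 0.1  # shared weights, applied to every point
        x = np.maximum(x @ w, 0.0)                      # per-point fully connected layer + ReLU
    return x.max(axis=0)                                # max pool over points -> (dims[-1],)

cloud = np.random.rand(2048, 3)
print(shared_mlp_global_feature(cloud).shape)  # (256,)
```

Because the weights are shared across points and the only cross-point operation is the final max pooling, no layer in this pipeline can model how two distant points relate to each other, which is exactly the limitation discussed above.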
In recent years, the Attention mechanism has commonly been combined with various methods (such as RNN methods and GAN methods) to capture long-range dependency information. It first appeared in the computer vision field and has developed greatly in the Natural Language Processing (NLP) field. Document 5 (V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent Models of Visual Attention. Conference on Neural Information Processing Systems 2014) combined this mechanism with an RNN for image classification and obtained excellent performance. Document 6 (D. Bahdanau, K. Cho, and Y. Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. International Conference on Learning Representations 2015) applied the Attention mechanism to NLP, using it to translate and align simultaneously for the machine translation task. Self-Attention allows the input elements of a set to interact with each other to compute weights or responses and to find out which elements should receive more attention. Document 7 (A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention Is All You Need. Conference on Neural Information Processing Systems 2017) showed that applying the Self-Attention mechanism to machine translation achieved the best performance at the time. Document 8 (H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena. Self-Attention Generative Adversarial Networks. International Conference on Machine Learning 2019) integrated the Self-Attention mechanism into the GAN framework, achieving the best performance at the time for class-conditional image generation on ImageNet.
Disclosure of Invention
The invention aims to solve the technical problem of providing, against the deficiencies of the prior art, a three-dimensional point cloud restoration method for graphic processing. In particular, it discloses a three-dimensional point cloud restoration method that extracts long-range dependencies based on a self-attention mechanism and is used to repair incomplete 3D models, comprising the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, a long-distance dependency relationship extraction network is obtained by combining a Self-Attention mechanism-based method with a multi-layer perceptron MLP; the long-distance dependency relationship extraction network is used to map the input point cloud into a global feature vector, and a decoder of a topological root tree structure is then used to generate the missing part of the incomplete point cloud;
and step 3, combining the incomplete point cloud and the generated missing part point cloud together to obtain a final repaired complete point cloud model.
Step 1 comprises the following steps:
step 1-1, setting a single three-dimensional point cloud model s as input and presetting 5 viewpoints, namely (1,0,0), (0,0,1), (1,0,1), (-1,0,0) and (-1,1,0); a plurality of different viewpoints is preset to ensure that the missing parts of the incomplete models are random when training and test data are collected;
step 1-2, randomly selecting a viewpoint as the central point p and presetting a radius r (the radius is defined by the number of points to remove rather than as a specific length in the mathematical sense; for example, if 25% of the original cloud is to be removed, then the 25% of points closest to p are removed, with p as the central point);
step 1-3, for a three-dimensional point cloud model s, taking a randomly selected viewpoint p as a center, and removing points within a preset radius r to obtain an incomplete point cloud model; the removed set of points is the missing part point cloud corresponding to the incomplete point cloud model.
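A minimal NumPy sketch of steps 1-1 to 1-3 follows (illustrative only, not the implementation of the invention); the viewpoint coordinates are the reconstructed values assumed above, and the 25% removal ratio is the example given in step 1-2.

```python
# Minimal sketch: remove the fraction of points nearest a randomly chosen viewpoint.
import numpy as np

VIEWPOINTS = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 1],
                       [-1, 0, 0], [-1, 1, 0]], dtype=float)  # assumed coordinates

def make_incomplete(cloud, missing_ratio=0.25, rng=np.random.default_rng()):
    """cloud: (N, 3). Returns (incomplete, missing) point sets."""
    p = VIEWPOINTS[rng.integers(len(VIEWPOINTS))]   # random central point p (step 1-2)
    d = np.linalg.norm(cloud - p, axis=1)           # distance of every point to p
    k = int(missing_ratio * len(cloud))             # "radius" expressed as a point count
    idx = np.argsort(d)                             # nearest points first
    return cloud[idx[k:]], cloud[idx[:k]]           # incomplete model, missing part (step 1-3)

cloud = np.random.rand(2048, 3)
incomplete, missing = make_incomplete(cloud)
print(incomplete.shape, missing.shape)  # (1536, 3) (512, 3)
```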
Step 2 comprises the following steps:
step 2-1, dividing the input three-dimensional point cloud model data set $S = \{S_{Train}, S_{Test}\}$ into a training set $S_{Train} = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ and a test set $S_{Test} = \{s_{n+1}, s_{n+2}, \ldots, s_{n+j}, \ldots, s_{n+m}\}$, where $s_i$ denotes the i-th three-dimensional point cloud model in the training set and $s_{n+j}$ the j-th three-dimensional point cloud model in the test set; i ranges over 1 to n and j over 1 to m;

step 2-2, for the training set $S_{Train}$, collecting the incomplete point cloud models $P_{Train} = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ of each three-dimensional point cloud model under random viewpoints and the corresponding missing part point cloud models $G_{Train} = \{g_1, g_2, \ldots, g_i, \ldots, g_n\}$, and using them as input to train the whole network, obtaining a trained long-distance dependency relationship extraction network and decoder of topological root tree structure, where $p_i$ is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$, and $g_i$ is the missing part point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$;

step 2-3, for the test set $S_{Test}$, collecting the incomplete point cloud models $P_{Test} = \{p_{n+1}, p_{n+2}, \ldots, p_{n+j}, \ldots, p_{n+m}\}$ of each three-dimensional point cloud model under random viewpoints and inputting them into the trained network to obtain the missing part point cloud corresponding to each incomplete input point cloud, where $p_{n+j}$ is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model $s_{n+j}$ in the test set $S_{Test}$.
Step 2-2 includes the steps of:
step 2-2-1, taking the incomplete point clouds $P_{Train}$ of the training set $S_{Train}$ as input and using the corresponding missing part point clouds $G_{Train}$ for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point of the incomplete point cloud is mapped into a 256-dimensional feature vector. The first-stage shared multi-layer perceptron consists of a two-layer shared fully connected network: the first layer maps each point into a 128-dimensional feature vector and the second layer into a 256-dimensional feature vector, so the whole input point cloud is mapped into a matrix of dimension 2048×256;
step 2-2-2, setting the 2048×256 matrix obtained in step 2-2-1 as $x = (x_1, x_2, x_3, \ldots, x_N)$, which will be used as the input of the self-attention module, where $x_i$ is the feature vector corresponding to one point of the input point cloud; mapping x into two feature spaces Q and K through two 1×1 convolution networks to calculate the attention scores of the input point cloud, obtained by the functions h(x) and v(x) respectively, where $Q = (h(x_1), h(x_2), h(x_3), \ldots, h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, \ldots, w_h x_N)$ and $K = (v(x_1), v(x_2), v(x_3), \ldots, v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, \ldots, w_v x_N)$; $w_h$ and $w_v$ are the weight matrices to be learned, corresponding to h(x) and v(x) respectively, implemented by 1×1 convolution, and the dimension of $w_h$ and $w_v$ is 32×256; Q is the query matrix, of dimension 2048×32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query of each point has length 32; K is the key matrix corresponding to the input point cloud, of dimension 2048×32, i.e., the key of each point is 32-dimensional; Q and K will be used to calculate the attention score values of the input point cloud;

step 2-2-3, defining a function $s(x_i, x_j) = h(x_i)^{\mathrm{T}} v(x_j)$ to calculate a scalar $q_{i,j}$ representing the dependency of each point of the input point cloud on the other points, where i is the index of a point in the matrix Q obtained in step 2-2-2 and j is the index of a point in the matrix K obtained in step 2-2-2; each point of Q (represented by a 32-dimensional vector) is multiplied one by one with the keys of all points including itself, i.e., with the 32-dimensional vector corresponding to each point of the matrix K; since the input point cloud has 2048 points, 2048 scalars are calculated for each point, and combining the scalars of all points finally gives a matrix of dimension 2048×2048, the attention score map corresponding to the input point cloud;

step 2-2-4, mapping each point in the input point cloud to a value matrix V to calculate the input signal at point j;
Step 2-2-5, performing Softmax operation on the attention score map obtained in the step 2-2-3;
step 2-2-6, setting the output of the self-attention module as y, and mapping the input x to the output y by using a function phi;
step 2-2-7, carrying out maximum pooling;
step 2-2-8, finally generating a missing point cloud corresponding to the incomplete point cloud model;
and 2-2-9, obtaining a trained long-distance dependency relationship extraction network and a topology root tree generator.
The step 2-2-4 comprises the following steps: defining a function $f(x_j) = w_f x_j$ that maps each point of the input point cloud into a value matrix V to calculate the input signal at point j, where $w_f$ is a weight matrix to be learned, implemented by 1×1 convolution; the points of the value matrix V correspond one-to-one with the keys of the key matrix K, i.e., the input signal at point j corresponds to the key at point j; here $V = (f(x_1), f(x_2), f(x_3), \ldots, f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, \ldots, w_f x_N)$, of dimension 2048×128.

The step 2-2-5 comprises the following steps: defining the formula $\alpha_{i,j} = \exp(q_{i,j}) \big/ \sum_{j=1}^{N} \exp(q_{i,j})$ and performing the Softmax operation on the attention score map obtained in step 2-2-3, where $q_{i,j}$ represents the attention score value of point i of the query matrix Q with respect to point j of the key matrix K.

The steps 2-2-6 include: defining $y = \Phi(x) = (y_1, y_2, \ldots, y_i, \ldots, y_N) = (\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_i), \ldots, \varphi(x_N))$, where N is the number of points of the input point cloud, $x_N$ is the feature vector corresponding to a point of the input point cloud, and $y_i$, the i-th point of the output point cloud, is calculated with the formula $y_i = \varphi(x_i) = g\big(\sum_{j=1}^{N} \alpha_{i,j} f(x_j)\big)$, where $g(x_i) = w_g x_i$ and $w_g$ is a weight matrix to be learned, implemented by 1×1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of the output matrix y is 2048×256.

The steps 2-2-7 comprise: max-pooling the 2048×256 matrix obtained in step 2-2-6, i.e., selecting the maximum value of every dimension over all points to combine into a 256-dimensional feature vector, obtaining a 2048×256 feature matrix of the same shape by stacking, and concatenating this feature matrix with the 2048×256 matrix obtained in step 2-2-6 to form a 2048×512 feature matrix fused with long-distance dependency information.

The steps 2-2-8 include: propagating the 2048×512 feature matrix fused with long-distance dependency information forward through the second-stage shared multi-layer perceptron, mapping each point of the input incomplete point cloud model into a 1024-dimensional vector, so the whole incomplete point cloud is mapped into a 2048×1024 matrix; then max-pooling the 2048×1024 matrix, i.e., selecting the maximum value of every dimension over all points, to obtain a 1024-dimensional global feature vector; inputting the global feature vector into the decoder of topological root tree structure, finally generating the missing point cloud corresponding to the incomplete point cloud model.

The steps 2-2-9 comprise: comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and back-propagating, finally obtaining a trained long-distance dependency relationship extraction network and topology root tree generator.

Step 3 comprises the following step: synthesizing the missing part point cloud obtained in step 2 and the incomplete point cloud together to obtain the final repaired complete point cloud model. A minimal numerical sketch of the self-attention module of steps 2-2-2 to 2-2-7 follows.
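Illustrative only (not the implementation of the invention): in this sketch, random matrices stand in for the learned 1×1 convolutions $w_h$, $w_v$, $w_f$, $w_g$, and all sizes follow the text above.

```python
# Minimal NumPy sketch of the self-attention module of steps 2-2-2 to 2-2-7.
import numpy as np

def self_attention_module(x, d_qk=32, d_v=128, rng=np.random.default_rng(0)):
    """x: (N, 256) per-point features -> (N, 512) features fused with
    long-distance dependency information."""
    n, c = x.shape
    w_h = rng.standard_normal((c, d_qk)) * 0.1   # query projection w_h (step 2-2-2)
    w_v = rng.standard_normal((c, d_qk)) * 0.1   # key projection w_v (step 2-2-2)
    w_f = rng.standard_normal((c, d_v)) * 0.1    # value projection w_f (step 2-2-4)
    w_g = rng.standard_normal((d_v, c)) * 0.1    # output projection w_g (step 2-2-6)

    q, k, v = x @ w_h, x @ w_v, x @ w_f          # Q: (N,32), K: (N,32), V: (N,128)
    scores = q @ k.T                             # (N,N) attention score map (step 2-2-3)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=1, keepdims=True)    # row-wise Softmax (step 2-2-5)
    y = (alpha @ v) @ w_g                        # y_i = g(sum_j alpha_ij f(x_j)) (step 2-2-6)

    g_feat = y.max(axis=0)                       # max pool over points (step 2-2-7)
    tiled = np.tile(g_feat, (n, 1))              # stack back to (N, 256)
    return np.concatenate([y, tiled], axis=1)    # (N, 512) fused feature matrix

feats = self_attention_module(np.random.rand(2048, 256))
print(feats.shape)  # (2048, 512)
```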
the method of the invention aims at solving the problem of repairing the three-dimensional model. The sensor may be used to acquire a large amount of three-dimensional data quickly, but it is often difficult to acquire complete three-dimensional data. The repair and estimation of the complete model based on the partial incomplete model is also widely applied to the fields of computer vision, robots, virtual reality and the like, such as mixed model analysis, target detection and tracking, 3D reconstruction, style migration, robot roaming and grabbing and the like, which also makes the work very significant.
The beneficial effects are as follows: the method introduces a self-attention mechanism into the problem of three-dimensional point cloud restoration instead of performing feature extraction only with shared fully connected layers, which helps model the long-distance dependency relationships between the points of the input point cloud. As can be seen from the visualization of the self-attention maps in FIG. 4 and the comparison results in FIG. 5, with the self-attention mechanism the points of the missing part point cloud generated by the network model of the invention can be finely coordinated with other remote points, and the feature extractor can use information from remote rather than merely local positions to generate the global features, so the prediction results contain less noise and deformation and the 3D model repair effect is improved. The whole method is efficient and practical. Meanwhile, as can be seen from Tables 1 and 2, compared with other methods that can repair three-dimensional point cloud models, the CD (Chamfer Distance) value of the method of the invention drops markedly and the repair performance is significantly improved.
Drawings
The foregoing and/or other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings and detailed description.
FIG. 1a is an incomplete point cloud model before repair.
FIG. 1b is a point cloud model after repair.
FIG. 2 is a schematic architecture diagram of a self-attention module of the method of the present invention.
FIG. 3 is a system frame diagram of a feature extraction module of the method of the present invention.
Fig. 4 is a visual diagram of attention scores corresponding to an input point cloud model in the method of the present invention.
FIG. 5 is a graph showing the repair effect of the method of the present invention compared to other methods.
Fig. 6 is a flow chart of the present invention.
Detailed Description
As shown in fig. 6, the invention discloses a three-dimensional point cloud restoration method for extracting long-distance dependency relationship based on a self-attention mechanism, wherein one viewpoint is randomly selected from a plurality of preset viewpoints as a central point, and all the points are removed within a preset radius range to acquire an incomplete model under the viewpoint; inputting an incomplete model of a model training set and a corresponding missing part into a network of the method for training, and inputting an incomplete model of a model testing set into a trained network to obtain a missing part of the corresponding incomplete model; and synthesizing the incomplete model and the missing part together to obtain the final repaired model.
For a given set of 3D models of some class, $S = \{S_{Train}, S_{Test}\}$, divided into a training set $S_{Train} = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ and a test set $S_{Test} = \{s_{n+1}, s_{n+2}, \ldots, s_{n+j}, \ldots, s_{n+m}\}$, where $s_i$ denotes the i-th model in the training set and $s_{n+j}$ the j-th model in the test set, the invention completes the repair of the models in the test set $S_{Test}$ through the following steps; the target task is shown in FIG. 1a, and the flow is shown in FIGS. 2, 3 and 6:
the method specifically comprises the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, a long-distance dependency relationship extraction network is obtained by combining a Self-Attention mechanism-based method and a multi-layer perceptron MLP, an input point cloud is mapped into a global feature vector by the long-distance dependency relationship extraction network, and then a decoder of a topological root tree structure is adopted to generate a missing part of an incomplete point cloud;
and step 3, combining the incomplete point cloud and the generated missing part point cloud together to obtain a final repaired complete point cloud model.
Step 1 comprises the following steps:
step 1-1, setting a single three-dimensional point cloud model s as input and presetting 5 viewpoints, namely (1,0,0), (0,0,1), (1,0,1), (-1,0,0) and (-1,1,0); a plurality of different viewpoints is preset to ensure that the missing parts of the incomplete models are random when training and test data are collected;
step 1-2, randomly selecting a viewpoint as the central point p and presetting a radius r (the radius is defined by the number of points to remove rather than as a specific length in the mathematical sense; for example, if 25% of the original cloud is to be removed, then the 25% of points closest to p are removed, with p as the central point);
step 1-3, for a three-dimensional point cloud model s, taking a randomly selected viewpoint p as a center, and removing points within a preset radius r to obtain an incomplete point cloud model; the removed set of points is the missing part point cloud corresponding to the incomplete point cloud model.
Step 2 comprises the following steps:
step 2-1, dividing the input three-dimensional point cloud model data set $S = \{S_{Train}, S_{Test}\}$ into a training set $S_{Train} = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ and a test set $S_{Test} = \{s_{n+1}, s_{n+2}, \ldots, s_{n+j}, \ldots, s_{n+m}\}$, where $s_i$ denotes the i-th three-dimensional point cloud model in the training set and $s_{n+j}$ the j-th three-dimensional point cloud model in the test set; i ranges over 1 to n and j over 1 to m;

step 2-2, for the training set $S_{Train}$, collecting the incomplete point cloud models $P_{Train} = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ of each three-dimensional point cloud model under random viewpoints and the corresponding missing part point cloud models $G_{Train} = \{g_1, g_2, \ldots, g_i, \ldots, g_n\}$, and using them as input to train the whole network, obtaining a trained long-distance dependency relationship extraction network and decoder of topological root tree structure, where $p_i$ is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$, and $g_i$ is the missing part point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$;

step 2-3, for the test set $S_{Test}$, collecting the incomplete point cloud models $P_{Test} = \{p_{n+1}, p_{n+2}, \ldots, p_{n+j}, \ldots, p_{n+m}\}$ of each three-dimensional point cloud model under random viewpoints and inputting them into the trained network to obtain the missing part point cloud corresponding to each incomplete input point cloud, where $p_{n+j}$ is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model $s_{n+j}$ in the test set $S_{Test}$.
Step 2-2 includes the steps of:
step 2-2-1, taking the incomplete point clouds $P_{Train}$ of the training set $S_{Train}$ as input and using the corresponding missing part point clouds $G_{Train}$ for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point of the incomplete point cloud is mapped into a 256-dimensional feature vector. The first-stage shared multi-layer perceptron consists of a two-layer shared fully connected network: the first layer maps each point into a 128-dimensional feature vector and the second layer into a 256-dimensional feature vector, so the whole input point cloud is mapped into a matrix of dimension 2048×256;

step 2-2-2, setting the 2048×256 matrix obtained in step 2-2-1 as $x = (x_1, x_2, x_3, \ldots, x_N)$, which will be used as the input of the self-attention module, where $x_i$ is the feature vector corresponding to one point of the input point cloud; mapping x into two feature spaces Q and K through two 1×1 convolution networks to calculate the attention scores of the input point cloud, obtained by the functions h(x) and v(x) respectively, where $Q = (h(x_1), h(x_2), h(x_3), \ldots, h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, \ldots, w_h x_N)$ and $K = (v(x_1), v(x_2), v(x_3), \ldots, v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, \ldots, w_v x_N)$; $w_h$ and $w_v$ are the weight matrices to be learned, corresponding to h(x) and v(x) respectively, implemented by 1×1 convolution, and the dimension of $w_h$ and $w_v$ is 32×256; Q is the query matrix, of dimension 2048×32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query of each point has length 32; K is the key matrix corresponding to the input point cloud, of dimension 2048×32, i.e., the key of each point is 32-dimensional; Q and K will be used to calculate the attention score values of the input point cloud;

step 2-2-3, defining a function $s(x_i, x_j) = h(x_i)^{\mathrm{T}} v(x_j)$ to calculate a scalar $q_{i,j}$ representing the dependency of each point of the input point cloud on the other points, where i is the index of a point in the matrix Q obtained in step 2-2-2 and j is the index of a point in the matrix K obtained in step 2-2-2; each point of Q (represented by a 32-dimensional vector) is multiplied one by one with the keys of all points including itself, i.e., with the 32-dimensional vector corresponding to each point of the matrix K; since the input point cloud has 2048 points, 2048 scalars are calculated for each point, and combining the scalars of all points finally gives a matrix of dimension 2048×2048, the attention score map corresponding to the input point cloud;

step 2-2-4, defining a function $f(x_j) = w_f x_j$ that maps each point of the input point cloud into a value matrix V to calculate the input signal at point j (the value matrix will be multiplied with the attention score map obtained in step 2-2-3 to obtain weighted vectors), where $w_f$ is a weight matrix to be learned, implemented by 1×1 convolution; the points of the value matrix V correspond one-to-one with the keys of the key matrix K, i.e., the input signal at point j corresponds to the key at point j; here $V = (f(x_1), f(x_2), f(x_3), \ldots, f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, \ldots, w_f x_N)$, of dimension 2048×128;

step 2-2-5, defining the formula $\alpha_{i,j} = \exp(q_{i,j}) \big/ \sum_{j=1}^{N} \exp(q_{i,j})$ and performing the Softmax operation on the attention score map obtained in step 2-2-3 (i.e., a Softmax is applied to all attention score values of each point so that the attention scores of each point with respect to all other points sum to 1), where $q_{i,j}$ represents the attention score value of point i of the query matrix Q with respect to point j of the key matrix K;

step 2-2-6, setting the output of the self-attention module as y and mapping the input x to the output y with a function $\Phi$; defining $y = \Phi(x) = (y_1, y_2, \ldots, y_i, \ldots, y_N) = (\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_i), \ldots, \varphi(x_N))$, where N is the number of points of the input point cloud, $x_N$ is the feature vector corresponding to a point of the input point cloud, and $y_i$, the i-th point of the output point cloud, is calculated with the formula $y_i = \varphi(x_i) = g\big(\sum_{j=1}^{N} \alpha_{i,j} f(x_j)\big)$, where $g(x_i) = w_g x_i$ and $w_g$ is a weight matrix to be learned, implemented by 1×1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of the output matrix y is 2048×256;

step 2-2-7, max-pooling the 2048×256 matrix obtained in step 2-2-6, i.e., selecting the maximum value of every dimension over all points to combine into a 256-dimensional feature vector, obtaining a 2048×256 feature matrix of the same shape by stacking, and concatenating this feature matrix with the 2048×256 matrix obtained in step 2-2-6 to form a 2048×512 feature matrix fused with long-distance dependency information;

step 2-2-8, propagating the 2048×512 feature matrix fused with long-distance dependency information forward through the second-stage shared multi-layer perceptron (shared MLP), mapping each point of the input incomplete point cloud model into a 1024-dimensional vector, so the whole incomplete point cloud is mapped into a 2048×1024 matrix; then max-pooling the 2048×1024 matrix, i.e., selecting the maximum value of every dimension over all points, to obtain a 1024-dimensional global feature vector; inputting the global feature vector into the decoder of topological root tree structure, finally generating the missing point cloud corresponding to the incomplete point cloud model;

step 2-2-9, comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and back-propagating, finally obtaining a trained long-distance dependency relationship extraction network and topology root tree generator. A speculative sketch of such a topological root tree decoder is given below.
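The decoder of topological root tree structure is described here only at the architecture level (Document 4 gives the original design). The following is a speculative minimal sketch of such a root-tree expansion; the tree arity, depth, layer sizes and random weights are all assumptions, not parameters of the invention.

```python
# Speculative sketch of a TopNet-style root-tree decoder: the global feature is
# expanded level by level, each node feature spawning `arity` child features,
# and leaf features are mapped to 3D points.
import numpy as np

def tree_decoder(global_feat, levels=3, arity=8, node_dim=64,
                 rng=np.random.default_rng(0)):
    """global_feat: (1024,). Returns (arity**levels, 3) generated points."""
    w_root = rng.standard_normal((global_feat.size, node_dim)) * 0.1
    feats = np.tanh(global_feat @ w_root)[None, :]           # root node: (1, node_dim)
    for _ in range(levels):                                  # each level multiplies nodes by arity
        w = rng.standard_normal((node_dim, arity * node_dim)) * 0.1
        feats = np.tanh(feats @ w).reshape(-1, node_dim)     # (nodes*arity, node_dim)
    w_out = rng.standard_normal((node_dim, 3)) * 0.1
    return feats @ w_out                                     # leaf features -> xyz coordinates

points = tree_decoder(np.random.rand(1024))
print(points.shape)  # (512, 3), e.g. the 25% missing part of a 2048-point cloud
```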
Step 3 comprises the following steps:
and (3) synthesizing the point cloud of the missing part obtained in the step (2) and the incomplete point cloud together to obtain the final repaired complete point cloud model.
Examples
The objective task of the present invention is shown in FIGS. 1a and 1b: FIG. 1a is an original model to be repaired and FIG. 1b is the repaired model. The self-attention module architecture of the method of the present invention is shown in FIG. 2, and the architecture of the whole global feature extractor is shown in FIG. 3. The steps of the present invention are described below according to an example.
Step (1), collecting data of an input point cloud model data set;
step (1.1), setting a single three-dimensional point cloud model s as input and presetting 5 viewpoints, namely (1,0,0), (0,0,1), (1,0,1), (-1,0,0) and (-1,1,0); a plurality of different viewpoints is preset to ensure that the missing parts of the incomplete models are random when training and test data are collected;
step (1.2), randomly selecting a viewpoint as the central point p and presetting a radius r (the radius is defined by the number of points to remove rather than as a specific length in the mathematical sense; for example, if 25% of the original cloud is to be removed, then the 25% of points closest to p are removed, with p as the central point);
step (1.3), for a three-dimensional point cloud model s, taking a randomly selected viewpoint p as a center, removing points within a preset radius r range, and obtaining an incomplete point cloud model; the removed set of points is the missing part point cloud corresponding to the incomplete point cloud model.
Step (2), a long-distance dependency relationship extraction network is obtained by combining a Self-Attention mechanism-based method and a multi-layer perceptron MLP, an input point cloud is mapped into a global feature vector by the long-distance dependency relationship extraction network, and then a decoder of a topological root tree structure is adopted to generate a missing part of an incomplete point cloud;
step (2.1), dividing the input three-dimensional point cloud model data set $S = \{S_{Train}, S_{Test}\}$ into a training set $S_{Train} = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ and a test set $S_{Test} = \{s_{n+1}, s_{n+2}, \ldots, s_{n+j}, \ldots, s_{n+m}\}$, where $s_i$ denotes the i-th three-dimensional point cloud model in the training set and $s_{n+j}$ the j-th three-dimensional point cloud model in the test set; i ranges over 1 to n and j over 1 to m;

step (2.2), for the training set $S_{Train}$, collecting the incomplete point cloud models $P_{Train} = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ of each three-dimensional point cloud model under random viewpoints and the corresponding missing part point cloud models $G_{Train} = \{g_1, g_2, \ldots, g_i, \ldots, g_n\}$, and using them as input to train the whole network, obtaining a trained long-distance dependency relationship extraction network and decoder of topological root tree structure, where $p_i$ is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$, and $g_i$ is the missing part point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$;
step (2.2.1), taking the incomplete point clouds $P_{Train}$ of the training set $S_{Train}$ as input and using the corresponding missing part point clouds $G_{Train}$ for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point of the incomplete point cloud is mapped into a 256-dimensional feature vector. The first-stage shared multi-layer perceptron consists of a two-layer shared fully connected network: the first layer maps each point into a 128-dimensional feature vector and the second layer into a 256-dimensional feature vector, so the whole input point cloud is mapped into a matrix of dimension 2048×256;

step (2.2.2), setting the 2048×256 matrix obtained in step (2.2.1) as $x = (x_1, x_2, x_3, \ldots, x_N)$, which will be used as the input of the self-attention module, where $x_i$ is the feature vector corresponding to one point of the input point cloud; mapping x into two feature spaces Q and K through two 1×1 convolution networks to calculate the attention scores of the input point cloud, obtained by the functions h(x) and v(x) respectively, where $Q = (h(x_1), h(x_2), h(x_3), \ldots, h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, \ldots, w_h x_N)$ and $K = (v(x_1), v(x_2), v(x_3), \ldots, v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, \ldots, w_v x_N)$; $w_h$ and $w_v$ are the weight matrices to be learned, corresponding to h(x) and v(x) respectively, implemented by 1×1 convolution, and the dimension of $w_h$ and $w_v$ is 32×256; Q is the query matrix, of dimension 2048×32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query of each point has length 32; K is the key matrix corresponding to the input point cloud, of dimension 2048×32, i.e., the key of each point is 32-dimensional; Q and K will be used to calculate the attention score values of the input point cloud;

step (2.2.3), defining a function $s(x_i, x_j) = h(x_i)^{\mathrm{T}} v(x_j)$ to calculate a scalar $q_{i,j}$ representing the dependency of each point of the input point cloud on the other points, where i is the index of a point in the matrix Q obtained in step (2.2.2) and j is the index of a point in the matrix K obtained in step (2.2.2); each point of Q (represented by a 32-dimensional vector) is multiplied one by one with the keys of all points including itself, i.e., with the 32-dimensional vector corresponding to each point of the matrix K; since the input point cloud has 2048 points, 2048 scalars are calculated for each point, and combining the scalars of all points finally gives a matrix of dimension 2048×2048, the attention score map corresponding to the input point cloud;

step (2.2.4), defining a function $f(x_j) = w_f x_j$ that maps each point of the input point cloud into a value matrix V to calculate the input signal at point j (the value matrix will be multiplied with the attention score map obtained in step (2.2.3) to obtain weighted vectors), where $w_f$ is a weight matrix to be learned, implemented by 1×1 convolution; the points of the value matrix V correspond one-to-one with the keys of the key matrix K, i.e., the input signal at point j corresponds to the key at point j; here $V = (f(x_1), f(x_2), f(x_3), \ldots, f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, \ldots, w_f x_N)$, of dimension 2048×128;

step (2.2.5), defining the formula $\alpha_{i,j} = \exp(q_{i,j}) \big/ \sum_{j=1}^{N} \exp(q_{i,j})$ and performing the Softmax operation on the attention score map obtained in step (2.2.3) (i.e., a Softmax is applied to all attention score values of each point so that the attention scores of each point with respect to all other points sum to 1), where $q_{i,j}$ represents the attention score value of point i of the query matrix Q with respect to point j of the key matrix K;

step (2.2.6), setting the output of the self-attention module as y and mapping the input x to the output y with a function $\Phi$; defining $y = \Phi(x) = (y_1, y_2, \ldots, y_i, \ldots, y_N) = (\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_i), \ldots, \varphi(x_N))$, where N is the number of points of the input point cloud, $x_N$ is the feature vector corresponding to a point of the input point cloud, and $y_i$, the i-th point of the output point cloud, is calculated with the formula $y_i = \varphi(x_i) = g\big(\sum_{j=1}^{N} \alpha_{i,j} f(x_j)\big)$, where $g(x_i) = w_g x_i$ and $w_g$ is a weight matrix to be learned, implemented by 1×1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of the output matrix y is 2048×256;

step (2.2.7), max-pooling the 2048×256 matrix obtained in step (2.2.6), i.e., selecting the maximum value of every dimension over all points to combine into a 256-dimensional feature vector, obtaining a 2048×256 feature matrix of the same shape by stacking, and concatenating this feature matrix with the 2048×256 matrix obtained in step (2.2.6) to form a 2048×512 feature matrix fused with long-distance dependency information;

step (2.2.8), propagating the 2048×512 feature matrix fused with long-distance dependency information forward through the second-stage shared multi-layer perceptron (shared MLP), mapping each point of the input incomplete point cloud model into a 1024-dimensional vector, so the whole incomplete point cloud is mapped into a 2048×1024 matrix; then max-pooling the 2048×1024 matrix, i.e., selecting the maximum value of every dimension over all points, to obtain a 1024-dimensional global feature vector; inputting the global feature vector into the decoder of topological root tree structure, finally generating the missing point cloud corresponding to the incomplete point cloud model;

step (2.2.9), comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and back-propagating, finally obtaining a trained long-distance dependency relationship extraction network and topology root tree generator.
Step (2.3), for test set S Test Collecting incomplete point cloud models P of each three-dimensional point cloud model under random view points Test ={p n+1 ,p n+2 ,...,p n+j ,...,p n+m Inputting into a trained network to obtain a missing part point cloud corresponding to the incomplete point cloud input model, wherein p is as follows n+j Refers to test set S Test The j-th three-dimensional point cloud model s in (3) n+j And a corresponding incomplete point cloud model. The testing process mainly comprises the following steps:
step (2.3.1), the incomplete point cloud model P of the test model set under the random view point Test Inputting the data to a generator network trained cooperatively with a long-distance dependency relationship extraction network;
and (2.3.2) outputting the missing part point cloud corresponding to the incomplete model of the test model set under the random viewpoint.
And (3) synthesizing the incomplete point cloud and the generated missing part point cloud together to obtain the final repaired complete point cloud model.
Analysis of results
The experimental environment parameters of the method of the invention are as follows:
(1) The experimental platform for data collection from the models has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz, and 32GB of memory; the Python programming language is used, and the programming development environment is PyCharm 2019;
(2) The experimental platform for training and testing the Self-Attention-based extraction network has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz, 32GB of memory, and a TITAN RTX GPU with 24GB of video memory; the Python programming language is used, and the implementation relies on the TensorFlow third-party open source library.
The comparative experimental results (shown in Tables 1 and 2) of the method of the present invention against TopNet, Folding, PCN, AtlasNet, and PointNetFCAE were analyzed as follows:

Experiments were performed on a subset of the recognized benchmark dataset ShapeNet; the subset contains 8 different model classes, whose names are shown in the first column of Table 1: airplane, lamp, cabinet, car, chair, couch, table, and watercraft. The partitioning of the training set and the test set is shown in the second column of Table 1.

The final measure is the average Chamfer Distance (CD) of the complete model after repair, as shown by the CD comparisons of Tables 1 and 2 (Table 1 shows the per-class CD comparison of the method of the invention against the other methods on the 8 classes of the ShapeNet dataset, and Table 2 shows the average CD comparison over all classes). All CD values shown in the tables have been multiplied by $10^5$ after calculation. It can be seen from Tables 1 and 2 that all per-class CD values and the class-average CD value of the method of the present invention are lower than those of the other methods. FIG. 5 shows the repair results of the method of the present invention compared with the other methods; the method of the present invention repairs the missing portion of the incomplete point cloud with clearly reduced noise and deformation, and the repair effect is significantly improved.
TABLE 1
TABLE 2

|               | AtlasNet | Folding | PCN  | TopNet | PointNetFCAE | The method of the invention |
|---------------|----------|---------|------|--------|--------------|-----------------------------|
| Category Avg. | 94.4     | 74.6    | 67.1 | 63.9   | 97.6         | 55.8                        |
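For reference, the following is a minimal NumPy sketch of the average Chamfer Distance used as the metric above; the symmetric two-term form is a standard definition, and the exact normalization used in the experiments is an assumption.

```python
# Minimal sketch of the average Chamfer Distance (CD) between two point sets.
import numpy as np

def chamfer_distance(a, b):
    """a: (N, 3), b: (M, 3). Mean nearest-neighbor squared distance, both ways."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)   # (N, M) pairwise squared distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()  # a -> b plus b -> a

pred = np.random.rand(2048, 3)
gt = np.random.rand(2048, 3)
print(chamfer_distance(pred, gt) * 1e5)  # scaled by 10^5 as in Tables 1 and 2
```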
In the self-comparison experiment, the Self-Attention module used to extract long-distance dependency information was removed; the CD comparison of the final experimental results is shown in Table 3, from which it can be seen that the optimization of extracting long-distance dependency information clearly reduces the final CD value of the repaired complete model.

In addition, the method of the invention visualizes the learned long-distance dependency relationships, as shown in FIG. 4. For each point of the incomplete 3D point cloud there is a corresponding long-distance dependency relationship. In each row of FIG. 4, the first picture marks three points at representative locations, and the other three pictures show the attention score maps corresponding to these points. Because the method extracts long-distance dependency information with a dedicated mechanism, rather than only with shared fully connected multi-layer perceptron layers, it can learn sufficient long-distance dependency information and thereby significantly reduce the final CD value of the repaired complete model. Table 3 compares the final results of the method of the invention with the results obtained after removing the self-attention module used for long-distance dependency information extraction.
TABLE 3
The present invention provides a three-dimensional point cloud repairing method for graphic processing; there are many methods and ways to realize this technical scheme, and the above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention. Components not explicitly described in this embodiment can be implemented using the prior art.
Claims (1)
1. A three-dimensional point cloud restoration method for graphic processing is characterized by comprising the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, a long-distance dependency relationship extraction network is obtained by combining a Self-Attention mechanism-based method with a multi-layer perceptron MLP; the long-distance dependency relationship extraction network is used to map the input point cloud into a global feature vector, and a decoder of a topological root tree structure is then used to generate the missing part of the incomplete point cloud;
step 3, combining the incomplete point cloud and the generated missing part point cloud together to obtain a final repaired complete point cloud model;
step 1 comprises the following steps:
step 1-1, setting a single three-dimensional point cloud model s as input and presetting 5 viewpoints, namely (1,0,0), (0,0,1), (1,0,1), (-1,0,0) and (-1,1,0);
step 1-2, randomly selecting a viewpoint as a center point p, and presetting a radius r;
step 1-3, for a three-dimensional point cloud model s, taking a randomly selected viewpoint p as a center, and removing points within a preset radius r to obtain an incomplete point cloud model; the removed point set is the missing part point cloud corresponding to the incomplete point cloud model;
step 2 comprises the following steps:
step 2-1, dividing the input three-dimensional point cloud model data set $S = \{S_{Train}, S_{Test}\}$ into a training set $S_{Train} = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ and a test set $S_{Test} = \{s_{n+1}, s_{n+2}, \ldots, s_{n+j}, \ldots, s_{n+m}\}$, where $s_i$ denotes the i-th three-dimensional point cloud model in the training set and $s_{n+j}$ the j-th three-dimensional point cloud model in the test set; i ranges over 1 to n and j over 1 to m;

step 2-2, for the training set $S_{Train}$, collecting the incomplete point cloud models $P_{Train} = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ of each three-dimensional point cloud model under random viewpoints and the corresponding missing part point cloud models $G_{Train} = \{g_1, g_2, \ldots, g_i, \ldots, g_n\}$, and using them as input to train the whole network, obtaining a trained long-distance dependency relationship extraction network and decoder of topological root tree structure, where $p_i$ is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$, and $g_i$ is the missing part point cloud model corresponding to the i-th three-dimensional point cloud model $s_i$ in the training set $S_{Train}$;

step 2-3, for the test set $S_{Test}$, collecting the incomplete point cloud models $P_{Test} = \{p_{n+1}, p_{n+2}, \ldots, p_{n+j}, \ldots, p_{n+m}\}$ of each three-dimensional point cloud model under random viewpoints and inputting them into the trained network to obtain the missing part point cloud corresponding to each incomplete input point cloud, where $p_{n+j}$ is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model $s_{n+j}$ in the test set $S_{Test}$;
step 2-2 includes the steps of:
step 2-2-1, taking the incomplete point clouds $P_{Train}$ of the training set $S_{Train}$ as input and using the corresponding missing part point clouds $G_{Train}$ for supervised training; after forward propagation through the first-stage shared multi-layer perceptron, each point of the incomplete point cloud is mapped into a 256-dimensional feature vector, wherein the first-stage shared multi-layer perceptron consists of a two-layer shared fully connected network: the first layer maps each point into a 128-dimensional feature vector and the second layer into a 256-dimensional feature vector, so the whole input point cloud is mapped into a matrix of dimension 2048×256;

step 2-2-2, setting the 2048×256 matrix obtained in step 2-2-1 as $x = (x_1, x_2, x_3, \ldots, x_N)$, which will be used as the input of the self-attention module, where $x_i$ is the feature vector corresponding to one point of the input point cloud; mapping x into two feature spaces Q and K through two 1×1 convolution networks to calculate the attention scores of the input point cloud, obtained by the functions h(x) and v(x) respectively, where $Q = (h(x_1), h(x_2), h(x_3), \ldots, h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, \ldots, w_h x_N)$ and $K = (v(x_1), v(x_2), v(x_3), \ldots, v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, \ldots, w_v x_N)$; $w_h$ and $w_v$ are the weight matrices to be learned, corresponding to h(x) and v(x) respectively, implemented by 1×1 convolution, and the dimension of $w_h$ and $w_v$ is 32×256; Q is the query matrix, of dimension 2048×32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query of each point has length 32; K is the key matrix corresponding to the input point cloud, of dimension 2048×32, i.e., the key of each point is 32-dimensional; Q and K will be used to calculate the attention score values of the input point cloud;

step 2-2-3, defining a function $s(x_i, x_j) = h(x_i)^{\mathrm{T}} v(x_j)$ to calculate a scalar $q_{i,j}$ representing the dependency of each point of the input point cloud on the other points, where i is the index of a point in the matrix Q obtained in step 2-2-2 and j is the index of a point in the matrix K obtained in step 2-2-2; each point of Q is multiplied one by one with the keys of all points including itself, i.e., with the 32-dimensional vector corresponding to each point of the matrix K; since the input point cloud has 2048 points, 2048 scalars are calculated for each point, and combining the scalars of all points finally yields a matrix of dimension 2048×2048, the attention score map corresponding to the input point cloud;
step 2-2-4, mapping each point in the input point cloud to a value matrix V to calculate the input signal at point j;
step 2-2-5, performing a Softmax operation on the attention score map obtained in step 2-2-3;
step 2-2-6, setting the output of the self-attention module as y, and mapping the input x to the output y by using a function Φ;
step 2-2-7, carrying out maximum pooling;
step 2-2-8, finally generating a missing point cloud corresponding to the incomplete point cloud model;
step 2-2-9, obtaining a trained long-distance dependency relationship extraction network and a topology root tree generator;
the step 2-2-4 comprises the following steps: defining a function f(x_j) = w_f x_j that maps each point in the input point cloud to a value matrix V to calculate the input signal at point j, where w_f is a weight matrix to be learned, realized by 1×1 convolution; the points in the value matrix V correspond one-to-one with the key values in the key matrix K, i.e. the input signal at point j corresponds to the key value at point j; here V = (f(x_1), f(x_2), f(x_3), …, f(x_j)) = (w_f x_1, w_f x_2, w_f x_3, …, w_f x_j), of dimension 2048×128;
the step 2-2-5 comprises the following steps: defining the formula α_{i,j} = exp(q_{i,j}) / Σ_{k=1}^{N} exp(q_{i,k}) and applying this Softmax operation to the attention score map obtained in step 2-2-3, where q_{i,j} represents the attention score value of point i in the query matrix Q with respect to point j in the key matrix K;
the step 2-2-6 comprises: defining y = Φ(x) = (y_1, y_2, …, y_i, …, y_N) = (φ(x_1), φ(x_2), …, φ(x_i), …, φ(x_N)), where N is the number of points in the input point cloud, x_i denotes the feature vector of the i-th point of the input point cloud and y_i the i-th point of the output; each y_i is calculated by the formula φ(x_i) = g(Σ_{j=1}^{N} α_{i,j} f(x_j)), where α_{i,j} is the Softmax-normalized attention score from step 2-2-5, g(x_i) = w_g x_i, and w_g is a weight matrix to be learned, realized by 1×1 convolution; finally an output matrix y with the same dimension as the input feature matrix x is obtained, i.e. the output matrix y has dimension 2048×256;
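Steps 2-2-4 to 2-2-6 together could then be sketched as follows (assuming PyTorch and the x and scores tensors of the earlier sketches; conv_f and conv_g are illustrative names for w_f and w_g, not the patent's identifiers):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv_f = nn.Conv1d(256, 128, kernel_size=1)     # w_f: value mapping f (step 2-2-4)
conv_g = nn.Conv1d(128, 256, kernel_size=1)     # w_g: output mapping g

V = conv_f(x).transpose(1, 2)                   # (1, 2048, 128) value matrix
attn = F.softmax(scores, dim=-1)                # step 2-2-5: each row sums to 1
y = conv_g(torch.bmm(attn, V).transpose(1, 2))  # (1, 256, 2048), same shape as x
```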
the step 2-2-7 comprises: performing maximum pooling on the 2048×256-dimensional matrix obtained in step 2-2-6, i.e. selecting, for each dimension, the maximum value over all points to form a 256-dimensional feature vector; stacking this vector 2048 times yields a 2048×256-dimensional feature matrix of the same shape, which is spliced with the 2048×256-dimensional matrix obtained in step 2-2-6 to form a 2048×512-dimensional feature matrix fused with the long-distance dependency information;
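A sketch of this pool-stack-splice operation on the tensor y of the previous sketch (again an assumed PyTorch illustration):

```python
import torch

global_feat = torch.max(y, dim=2, keepdim=True).values  # (1, 256, 1) max pooling
tiled = global_feat.expand(-1, -1, 2048)                # stacked back to 2048 points
fused = torch.cat([y, tiled], dim=1)                    # (1, 512, 2048) fused features
```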
the step 2-2-8 comprises: propagating the 2048×512-dimensional feature matrix fused with the long-distance dependency information forward through a second-stage shared multi-layer perceptron, mapping each point of the input incomplete point cloud model into a 1024-dimensional vector, so that the whole incomplete point cloud is mapped into a 2048×1024-dimensional matrix; then performing maximum pooling on the 2048×1024-dimensional matrix, i.e. selecting the maximum value of each dimension across all points, to obtain a 1024-dimensional global feature vector; inputting this global feature vector into the decoder of topological root tree structure, finally generating the missing point cloud corresponding to the incomplete point cloud model;
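A sketch of this step on the fused tensor of the previous sketch; the decoder of topological root tree structure is only stubbed as a comment, since the text here does not spell out its layers:

```python
import torch
import torch.nn as nn

mlp2 = nn.Conv1d(512, 1024, kernel_size=1)       # second-stage shared perceptron
per_point = torch.relu(mlp2(fused))              # (1, 1024, 2048)
global_vec = torch.max(per_point, dim=2).values  # (1, 1024) global feature vector
# missing_cloud = tree_decoder(global_vec)       # hypothetical decoder call
```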
the step 2-2-9 comprises: comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the loss function, and carrying out back propagation, finally obtaining the trained long-distance dependency extraction network and topology root tree generator.
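The text does not name the loss function; a Chamfer distance, a common choice in point cloud completion, is sketched below purely as an assumed example:

```python
import torch

def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """a: (B, N, 3) generated points, b: (B, M, 3) ground-truth missing points."""
    d = torch.cdist(a, b)                  # (B, N, M) pairwise Euclidean distances
    # average nearest-neighbour distance in both directions
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()
```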
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116229.1A CN112785526B (en) | 2021-01-28 | 2021-01-28 | Three-dimensional point cloud restoration method for graphic processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112785526A CN112785526A (en) | 2021-05-11 |
CN112785526B true CN112785526B (en) | 2023-12-05 |
Family
ID=75759307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110116229.1A Active CN112785526B (en) | 2021-01-28 | 2021-01-28 | Three-dimensional point cloud restoration method for graphic processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112785526B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298952B (en) * | 2021-06-11 | 2022-07-15 | 哈尔滨工程大学 | Incomplete point cloud classification method based on data expansion and similarity measurement |
CN113379646B (en) * | 2021-07-07 | 2022-06-21 | 厦门大学 | Algorithm for performing dense point cloud completion by using generated countermeasure network |
CN113486988B (en) * | 2021-08-04 | 2022-02-15 | 广东工业大学 | Point cloud completion device and method based on adaptive self-attention transformation network |
CN114663619B (en) * | 2022-02-24 | 2024-06-28 | 清华大学 | Three-dimensional point cloud object prediction method and device based on self-attention mechanism |
CN116051633B (en) * | 2022-12-15 | 2024-02-13 | 清华大学 | 3D point cloud target detection method and device based on weighted relation perception |
CN117671131B (en) * | 2023-10-20 | 2024-07-23 | 南京邮电大学 | Industrial part three-dimensional point cloud repairing method and device based on deep learning |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109389671A (en) * | 2018-09-25 | 2019-02-26 | 南京大学 | A kind of single image three-dimensional rebuilding method based on multistage neural network |
CN112070054A (en) * | 2020-09-17 | 2020-12-11 | 福州大学 | Vehicle-mounted laser point cloud marking classification method based on graph structure and attention mechanism |
CN112241997A (en) * | 2020-09-14 | 2021-01-19 | 西北大学 | Three-dimensional model establishing and repairing method and system based on multi-scale point cloud up-sampling |
EP3767521A1 (en) * | 2019-07-15 | 2021-01-20 | Promaton Holding B.V. | Object detection and instance segmentation of 3d point clouds based on deep learning |
CN112257637A (en) * | 2020-10-30 | 2021-01-22 | 福州大学 | Vehicle-mounted laser point cloud multi-target identification method integrating point cloud and multiple views |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10824862B2 (en) * | 2017-11-14 | 2020-11-03 | Nuro, Inc. | Three-dimensional object detection for autonomous robotic systems using image proposals |
Non-Patent Citations (4)
Title |
---|
PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation; Fenggen Yu et al.; IEEE; full text *
A point cloud repair model based on deep learning; Bei Zile et al.; Wireless Communication Technology (Issue 02); full text *
Three-dimensional object recognition and model segmentation methods based on point cloud data; Niu Chengeng et al.; Journal of Graphics (Issue 02); full text *
Research progress on neural-network-based three-dimensional point cloud generation models; Qing Du et al.; Robot Technique and Application (Issue 06); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112785526B (en) | Three-dimensional point cloud restoration method for graphic processing | |
Qiu et al. | Geometric back-projection network for point cloud classification | |
Xie et al. | Point clouds learning with attention-based graph convolution networks | |
Lu et al. | 3DCTN: 3D convolution-transformer network for point cloud classification | |
Chen et al. | The face image super-resolution algorithm based on combined representation learning | |
Kadam et al. | Detection and localization of multiple image splicing using MobileNet V1 | |
JP2023073231A (en) | Method and device for image processing | |
CN110032925B (en) | Gesture image segmentation and recognition method based on improved capsule network and algorithm | |
CN111898173A (en) | Empirical learning in virtual worlds | |
CN111553869B (en) | Method for complementing generated confrontation network image under space-based view angle | |
CN111898172A (en) | Empirical learning in virtual worlds | |
CN115690522B (en) | Target detection method based on multi-pooling fusion channel attention and application thereof | |
CN109740539B (en) | 3D object identification method based on ultralimit learning machine and fusion convolution network | |
CN109886297A (en) | A method of for identifying threedimensional model object from two dimensional image | |
CN110458178A (en) | The multi-modal RGB-D conspicuousness object detection method spliced more | |
CN114708380A (en) | Three-dimensional reconstruction method based on fusion of multi-view features and deep learning | |
KR20230071052A (en) | Apparatus and method for image processing | |
CN115830375A (en) | Point cloud classification method and device | |
CN112967296B (en) | Point cloud dynamic region graph convolution method, classification method and segmentation method | |
Alhamazani et al. | 3DCascade-GAN: Shape completion from single-view depth images | |
CN117635488A (en) | Light-weight point cloud completion method combining channel pruning and channel attention | |
CN112837420B (en) | Shape complement method and system for terracotta soldiers and horses point cloud based on multi-scale and folding structure | |
CN114219989A (en) | Foggy scene ship instance segmentation method based on interference suppression and dynamic contour | |
Wenju et al. | A graph attention feature pyramid network for 3D object detection in point clouds | |
Han et al. | Feature based sampling: a fast and robust sampling method for tasks using 3D point cloud |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |