CN112785526A - Three-dimensional point cloud repairing method for graphic processing - Google Patents
Three-dimensional point cloud repairing method for graphic processing
- Publication number: CN112785526A (application CN202110116229.1A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/77 — Retouching; Inpainting; Scratch removal (under G06T5/00, Image enhancement or restoration)
- G06N3/084 — Backpropagation, e.g. using gradient descent (under G06N3/02, Neural networks; G06N3/08, Learning methods)
- G06T19/20 — Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts (under G06T19/00, Manipulating 3D models or images for computer graphics)
- G06T2207/10028 — Range image; Depth image; 3D point clouds (image acquisition modality)
- G06T2207/20081 — Training; Learning (special algorithmic details)
- G06T2207/20084 — Artificial neural networks [ANN] (special algorithmic details)
Abstract
The invention provides a three-dimensional point cloud repairing method for graphic processing, which comprises the following steps: step 1, collecting data from an input point cloud model dataset; step 2, combining a Self-Attention mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, mapping the input point cloud into a global feature vector with this network, and generating the missing part of the incomplete point cloud with a decoder of topological root-tree structure; and step 3, synthesizing the incomplete point cloud and the generated missing-part point cloud together to obtain the finally repaired complete point cloud model.
Description
Technical Field
The invention belongs to the field of computer three-dimensional model processing and computer graphics, and particularly relates to a three-dimensional point cloud repairing method for graphic processing.
Background
In recent years, large amounts of three-dimensional data have been acquired directly from the real world using LiDAR scanners, depth sensors such as the Kinect, stereo cameras, and the like.
However, the 3D data obtained with these instruments is often incomplete, mainly because the scanning view angle of the scanner is limited and because of occlusion by non-target objects and of light refraction and reflection. As a result, geometric and semantic information of the target object is frequently lost. Therefore, studying how to repair an incomplete 3D model for further applications is a necessary research topic. In addition, 3D models come in a number of representations, such as point clouds, voxels, patches, distance fields, and the like. Using point clouds to represent and process 3D data has received increasing attention because of their lower storage cost compared to other representations (e.g., 3D voxel grids) while still allowing a more refined representation of 3D models. Document 1 (C.R. Qi, H. Su, K. Mo, and L.J. Guibas. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. CVPR 2017) made it possible to process unordered point sets directly, which greatly facilitated the development of deep learning architectures for point clouds and of related research such as 3D scene reconstruction, 3D model segmentation, and 3D model repair.
Document 2 (W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert. PCN: Point Completion Network. International Conference on 3D Vision 2018), Document 3 (Z. Huang, Y. Yu, J. Xu, F. Ni, and X. Le. PF-Net: Point Fractal Network for 3D Point Cloud Completion. Conference on Computer Vision and Pattern Recognition 2020), and Document 4 (L.P. Tchapmi, V. Kosaraju, S.H. Rezatofighi, I. Reid, and S. Savarese. TopNet: Structural Point Cloud Decoder. Conference on Computer Vision and Pattern Recognition 2019) use multi-layer perceptrons of different dimensions to extract a final global feature vector from the input incomplete point cloud. Meanwhile, since there is currently no generally accepted way to define the local neighborhood of a point cloud, it is difficult to extract features through convolution operations as with 2D images. Thus, these methods rely heavily on multiple fully connected layers with similar architectures to capture the features of the input model and the dependencies between different points of the input point cloud. Furthermore, PF-Net points out that the low and middle layers of an MLP mostly extract local information, and that this local information cannot be combined into global features simply by passing it to higher layers through shared fully connected layers. This means that such methods cannot efficiently extract enough long-distance dependency information and embed it into the final global feature vector. Another problem is that even if limited long-distance dependency information can be captured, it usually takes several fully connected layers to learn it. This is detrimental to the efficient capture of long-range dependencies, for several reasons:
(1) more targeted models may be needed to represent these long-range dependencies;
(2) it may be difficult for an optimization algorithm to find parameter values that coordinate multiple layers to capture these long-range dependencies;
(3) these parameter settings may be statistically fragile when applied to new models not seen by the network.
In recent years, the attention mechanism has been combined with various methods (such as RNN and GAN methods) to capture long-distance dependency information. It first appeared in the field of computer vision and has since developed considerably in natural language processing (NLP). Document 5 (V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent Models of Visual Attention. Conference on Neural Information Processing Systems 2014) combined this mechanism with an RNN for image classification and obtained excellent performance. Document 6 (D. Bahdanau, K. Cho, and Y. Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. International Conference on Learning Representations 2015) applied the attention mechanism to NLP, i.e., used it to perform translation and alignment simultaneously in a machine translation task. Self-attention allows the input elements of a set to interact with each other to compute weights or responses and to find out which elements a given element should pay more attention to. Document 7 (A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention Is All You Need. Conference on Neural Information Processing Systems 2017) showed that applying the self-attention mechanism to machine translation achieved the best performance at the time. Document 8 (H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena. Self-Attention Generative Adversarial Networks. International Conference on Machine Learning 2019) integrated the self-attention mechanism into the GAN framework and achieved the best class-conditional image generation performance on ImageNet at the time.
Disclosure of Invention
The purpose of the invention is as follows: in view of the defects of the prior art, the invention aims to solve the technical problem of providing a three-dimensional point cloud repairing method for graphic processing, and in particular discloses a three-dimensional point cloud repairing method with long-distance dependency extraction based on a self-attention mechanism, which is used for repairing an incomplete 3D model and comprises the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining a Self-Attention mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, mapping the input point cloud into a global feature vector with the long-distance dependency extraction network, and generating the missing part of the incomplete point cloud with a decoder of topological root-tree structure;
and 3, synthesizing the incomplete point clouds and the generated missing part point clouds together to obtain a finally repaired complete point cloud model.
The step 1 comprises the following steps:
step 1-1, a single three-dimensional point cloud model s is given as input, and viewpoints are preset, namely (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), (-1, 1, 0) and (1, 1, 0), which ensures that the missing part of an incomplete model is random when training and testing data are collected;
step 1-2, a viewpoint is randomly selected as the center point p and a radius r is preset (the radius is specified by the number of points to remove and is not a specific length in the mathematical sense: if the number of points to remove is set to 25% of the original point cloud, then the 25% of points nearest to p are removed, with p as the center);
step 1-3, for the three-dimensional point cloud model s, the points within the preset radius r around the randomly selected viewpoint p are removed to obtain an incomplete point cloud model; the set of removed points is the missing-part point cloud corresponding to the incomplete point cloud model.
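The data-collection procedure of steps 1-1 to 1-3 can be sketched in a few lines of numpy. This is an illustrative reconstruction (the function and variable names are not from the patent), assuming the "radius" is realized as a count of nearest points, as the parenthetical in step 1-2 describes:

```python
import numpy as np

# Preset viewpoints from step 1-1.
VIEWPOINTS = [(1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), (-1, 1, 0), (1, 1, 0)]

def make_incomplete(points, viewpoints=VIEWPOINTS, missing_ratio=0.25, seed=None):
    """Split a point cloud (N, 3) into an incomplete model and its missing part.

    A viewpoint is chosen at random as center p, and the missing_ratio fraction
    of points nearest to p is removed (the "radius" of steps 1-2/1-3)."""
    rng = np.random.default_rng(seed)
    p = np.asarray(viewpoints[rng.integers(len(viewpoints))], dtype=float)
    dist = np.linalg.norm(points - p, axis=1)   # distance of every point to p
    k = int(len(points) * missing_ratio)        # number of points to remove
    order = np.argsort(dist)
    missing = points[order[:k]]                 # k points nearest to p
    incomplete = points[order[k:]]              # the incomplete point cloud
    return incomplete, missing
```

For a 2048-point model with a 25% removal ratio this returns a 1536-point incomplete cloud and the 512 removed points, which serve as the ground-truth missing part during training.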
The step 2 comprises the following steps:
step 2-1, the input three-dimensional point cloud model data set S = {S_Train, S_Test} is divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_(n+1), s_(n+2), ..., s_(n+j), ..., s_(n+m)}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_(n+j) the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, the incomplete point cloud models P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under random viewpoints and the corresponding missing-part point cloud models G_Train = {g_1, g_2, ..., g_i, ..., g_n} are acquired and used as input to train the whole network, obtaining the trained long-distance dependency extraction network and the decoder of topological root-tree structure; here p_i is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in S_Train, and g_i is the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, the incomplete point cloud models P_Test = {p_(n+1), p_(n+2), ..., p_(n+j), ..., p_(n+m)} of each three-dimensional point cloud model under random viewpoints are collected and input into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud; here p_(n+j) is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_(n+j) in S_Test.
Step 2-2 comprises the following steps:
step 2-2-1, the incomplete point clouds P_Train in the training set S_Train are taken as input and the corresponding missing-part point clouds G_Train are used for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point in the incomplete point cloud is mapped into a 256-dimensional feature vector; the first-stage shared MLP consists of two shared fully connected layers, the first mapping each point into a 128-dimensional feature vector and the second into a 256-dimensional feature vector, so that the whole input point cloud is mapped into a matrix of dimension 2048 × 256;
step 2-2-2, the 2048 × 256 matrix obtained in step 2-2-1 is denoted x = (x_1, x_2, x_3, ..., x_N) and serves as the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; to calculate the attention scores of the input point cloud, x is mapped onto two feature spaces Q and K through two 1 × 1 convolutional networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_N) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_N); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolution, and both have dimension 32 × 256; Q is a 2048 × 32 query matrix: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key value matrix corresponding to the input point cloud, likewise of dimension 2048 × 32, i.e., the key value of each point has dimension 32; Q and K will be used to calculate the attention score values of the input point cloud;
step 2-2-3, a function s(q_i, k_j) = q_i · k_j^T is defined to calculate a scalar that represents the dependency of each point in the input point cloud on the other points, where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2; for each point in Q (represented by a 32-dimensional vector), the dot product is taken with the key values of all points including the point itself, i.e., with the 32-dimensional vector of each point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points gives a matrix of dimension 2048 × 2048, the attention score map of the input point cloud;
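In matrix form, steps 2-2-2 and 2-2-3 amount to two learned linear maps followed by a single matrix product. A minimal numpy sketch with random stand-in weights (in the patent, w_h and w_v are learned 1 × 1 convolutions, not random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, C_qk = 2048, 256, 32            # points, feature dim, query/key dim

x = rng.standard_normal((N, C))       # stand-in for the first-stage shared MLP output
w_h = rng.standard_normal((C_qk, C))  # 32 x 256 query weights (learned in practice)
w_v = rng.standard_normal((C_qk, C))  # 32 x 256 key weights (learned in practice)

Q = x @ w_h.T                         # (2048, 32) query matrix, step 2-2-2
K = x @ w_v.T                         # (2048, 32) key matrix, step 2-2-2
scores = Q @ K.T                      # (2048, 2048) score map: scores[i, j] = q_i . k_j
```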
step 2-2-4, each point in the input point cloud is mapped into a value matrix V to calculate the input signal at point j;
Step 2-2-5, performing Softmax operation on the attention score map obtained in the step 2-2-3;
step 2-2-6, setting the output of the attention module as y, and mapping the input x to the output y by using a function phi;
step 2-2-7, performing maximum pooling;
step 2-2-8, finally generating a missing point cloud corresponding to the incomplete point cloud model;
and 2-2-9, obtaining the trained long-distance dependency relationship extraction network and the topology root tree generator.
Step 2-2-4 comprises: defining a function f(x_j) = w_f x_j that maps each point in the input point cloud into a value matrix V to compute the input signal at point j, where w_f is a weight matrix to be learned, realized by 1 × 1 convolution; the points in the value matrix V correspond one to one with the key values in the key value matrix K, i.e., the input signal at point j corresponds to the key value at point j; here V = (f(x_1), f(x_2), f(x_3), ..., f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_N), with dimension 2048 × 128.
The steps 2-2-5 comprise: defining the formula q_(i,j) = exp(s(q_i, k_j)) / Σ_(j=1..N) exp(s(q_i, k_j)) and performing this Softmax operation on the attention score map obtained in step 2-2-3, where q_(i,j) represents the attention score value of point i in the query matrix Q with respect to point j in the key value matrix K; after the operation, the attention scores of each point with respect to all other points sum to 1.
The steps 2-2-6 comprise: defining y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)), where N is the number of points in the input point cloud and x_N is the feature vector corresponding to the N-th point of the input point cloud; the i-th output point y_i is computed by the formula y_i = g(Σ_(j=1..N) q_(i,j) f(x_j)), where g(x) = w_g x and w_g is a weight matrix to be learned, realized by 1 × 1 convolution (it maps the 128-dimensional weighted value vector back to 256 dimensions); the final output matrix y has the same dimensions as the input feature matrix x, i.e., 2048 × 256.
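Steps 2-2-4 to 2-2-6 then reduce to a row-wise softmax and a weighted sum of values. In the sketch below, applying g after the weighted sum (so that w_g maps the 128-dimensional values back to 256 dimensions) is an assumption chosen so that the stated matrix dimensions agree, and the weights are random stand-ins for learned 1 × 1 convolutions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, C_v = 2048, 256, 128
x = rng.standard_normal((N, C))        # attention-module input (step 2-2-2)
scores = rng.standard_normal((N, N))   # stand-in for the Q K^T score map (step 2-2-3)

# Step 2-2-5: row-wise softmax, so each point's scores over all points sum to 1.
e = np.exp(scores - scores.max(axis=1, keepdims=True))   # max-shifted for stability
attn = e / e.sum(axis=1, keepdims=True)                  # (2048, 2048)

# Step 2-2-4: value matrix V = f(x) = w_f x, one 128-d value per key.
w_f = rng.standard_normal((C_v, C))
V = x @ w_f.T                                            # (2048, 128)

# Step 2-2-6: attention-weighted sum of values, mapped back to 256 dims by g.
w_g = rng.standard_normal((C, C_v))
y = (attn @ V) @ w_g.T                                   # (2048, 256), same shape as x
```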
The steps 2-2-7 comprise: performing max pooling on the 2048 × 256 matrix obtained in step 2-2-6, i.e., selecting the maximum value over all points in each dimension to form a 256-dimensional feature vector; stacking this vector into a 2048 × 256 feature matrix of the same shape; and concatenating this matrix with the 2048 × 256 matrix obtained in step 2-2-6 to form a 2048 × 512 feature matrix fused with the long-distance dependency information.
The steps 2-2-8 comprise: the 2048 × 512 feature matrix fused with the long-distance dependency information is forward-propagated through the second-stage shared multi-layer perceptron, which maps each point of the input incomplete point cloud model into a 1024-dimensional vector and the whole incomplete point cloud into a 2048 × 1024 matrix; max pooling is then applied to this matrix, i.e., the maximum value over all points in each dimension is selected, yielding a 1024-dimensional global feature vector; this global feature vector is input into the decoder of topological root-tree structure, which finally generates the missing point cloud corresponding to the incomplete point cloud model.
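Steps 2-2-7 and 2-2-8 are plain tensor operations. A numpy sketch, with the second-stage shared MLP simplified to one random-weight linear layer plus ReLU (the patent does not specify the activation, so that choice is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2048
y = rng.standard_normal((N, 256))          # attention-module output (step 2-2-6)

# Step 2-2-7: max-pool over points, stack the result, and concatenate.
pooled = y.max(axis=0)                     # (256,) per-dimension maximum
stacked = np.tile(pooled, (N, 1))          # (2048, 256) repeated for every point
fused = np.concatenate([y, stacked], axis=1)   # (2048, 512) fused feature matrix

# Step 2-2-8: second-stage shared MLP, then max pooling to a global vector.
w2 = rng.standard_normal((1024, 512)) * 0.01
h = np.maximum(fused @ w2.T, 0.0)          # (2048, 1024) per-point features
global_feature = h.max(axis=0)             # (1024,) vector fed to the tree decoder
```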
The steps 2-2-9 comprise: comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and performing back propagation, finally obtaining the trained long-distance dependency extraction network and topological root-tree generator.
The step 3 comprises: synthesizing the incomplete point cloud and the generated missing-part point cloud together to obtain the finally repaired complete point cloud model.
the method of the present invention addresses the problem of three-dimensional model repair. Sensors can be used to acquire a large amount of three-dimensional data quickly, but it is often difficult to acquire complete three-dimensional data. Repairing and conjecturing the complete model based on the partial incomplete model are also widely applied to the fields of computer vision, robots, virtual reality and the like, such as mixed model analysis, target detection and tracking, 3D reconstruction, style migration, robot roaming and grabbing and the like, and the work is made to be very meaningful.
Beneficial effects: the method introduces a self-attention mechanism into the problem of three-dimensional point cloud repair instead of relying solely on shared fully connected layers for feature extraction, which helps model the long-distance dependencies among the points of the input point cloud. As can be seen from the visualization of the self-attention score maps in fig. 4 and the comparison results in fig. 5, with the self-attention mechanism the points of the missing-part point cloud generated by the network model of the invention can be finely coordinated with other distant points, and the feature extractor can build global features from information at distant points rather than only local positions, so that the prediction results show less noise and deformation and the 3D model repairing effect is improved. The whole system is efficient and practical. Meanwhile, as can be seen from tables 1 and 2, compared with other methods for repairing three-dimensional point cloud models, the CD (Chamfer distance) value of the proposed method is remarkably reduced and the repair performance is remarkably improved.
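For reference, the CD metric cited above can be computed as follows; this is one common symmetric formulation (variants use squared distances, or average rather than sum the two directional terms):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a: (n, 3) and b: (m, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # (n, m) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```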
Drawings
The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1a is an incomplete point cloud model before repair.
FIG. 1b is the repaired point cloud model.
FIG. 2 is a block diagram of a self-attention module of the method of the present invention.
FIG. 3 is a block diagram of a feature extraction module of the method of the present invention.
FIG. 4 is a visualization of the attention scores corresponding to the input point cloud model in the method of the present invention.
FIG. 5 is a comparison of the repair effect of the method of the present invention with other methods.
FIG. 6 is a flow chart of the present invention.
Detailed Description
As shown in fig. 6, the invention discloses a three-dimensional point cloud repairing method for extracting long-distance dependency relationship based on a self-attention mechanism, which randomly selects a viewpoint from a plurality of preset viewpoints as a central point, and removes all points in a preset radius range to acquire an incomplete model under the viewpoint; inputting incomplete models and corresponding missing parts of a model training set into the network of the method for training, and inputting incomplete models of a model test set into the trained network to obtain the missing parts corresponding to the incomplete models; and then synthesizing the incomplete model and the missing part together to obtain the finally repaired model.
For a given set of 3D models of a certain class, S = {S_Train, S_Test} is divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_(n+1), s_(n+2), ..., s_(n+j), ..., s_(n+m)}, where s_i denotes the i-th model in the training set and s_(n+j) the j-th model in the test set. The invention repairs the models of the test set S_Test through the following steps; the target task is shown in fig. 1a and fig. 1b, and the flow is shown in fig. 2, fig. 3 and fig. 6:
the method specifically comprises the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining a Self-Attention mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, mapping the input point cloud into a global feature vector with the long-distance dependency extraction network, and generating the missing part of the incomplete point cloud with a decoder of topological root-tree structure;
and 3, synthesizing the incomplete point clouds and the generated missing part point clouds together to obtain a finally repaired complete point cloud model.
The step 1 comprises the following steps:
step 1-1, a single three-dimensional point cloud model s is given as input, and viewpoints are preset, namely (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), (-1, 1, 0) and (1, 1, 0), which ensures that the missing part of an incomplete model is random when training and testing data are collected;
step 1-2, a viewpoint is randomly selected as the center point p and a radius r is preset (the radius is specified by the number of points to remove and is not a specific length in the mathematical sense: if the number of points to remove is set to 25% of the original point cloud, then the 25% of points nearest to p are removed, with p as the center);
step 1-3, for the three-dimensional point cloud model s, the points within the preset radius r around the randomly selected viewpoint p are removed to obtain an incomplete point cloud model; the set of removed points is the missing-part point cloud corresponding to the incomplete point cloud model.
The step 2 comprises the following steps:
step 2-1, the input three-dimensional point cloud model data set S = {S_Train, S_Test} is divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_(n+1), s_(n+2), ..., s_(n+j), ..., s_(n+m)}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_(n+j) the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, the incomplete point cloud models P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under random viewpoints and the corresponding missing-part point cloud models G_Train = {g_1, g_2, ..., g_i, ..., g_n} are acquired and used as input to train the whole network, obtaining the trained long-distance dependency extraction network and the decoder of topological root-tree structure; here p_i is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in S_Train, and g_i is the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, the incomplete point cloud models P_Test = {p_(n+1), p_(n+2), ..., p_(n+j), ..., p_(n+m)} of each three-dimensional point cloud model under random viewpoints are collected and input into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud; here p_(n+j) is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_(n+j) in S_Test.
Step 2-2 comprises the following steps:
step 2-2-1, the incomplete point clouds P_Train in the training set S_Train are taken as input and the corresponding missing-part point clouds G_Train are used for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point in the incomplete point cloud is mapped into a 256-dimensional feature vector; the first-stage shared MLP consists of two shared fully connected layers, the first mapping each point into a 128-dimensional feature vector and the second into a 256-dimensional feature vector, so that the whole input point cloud is mapped into a matrix of dimension 2048 × 256;
step 2-2-2, the 2048 × 256 matrix obtained in step 2-2-1 is denoted x = (x_1, x_2, x_3, ..., x_N) and serves as the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; to calculate the attention scores of the input point cloud, x is mapped onto two feature spaces Q and K through two 1 × 1 convolutional networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_N) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_N); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolution, and both have dimension 32 × 256; Q is a 2048 × 32 query matrix: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key value matrix corresponding to the input point cloud, likewise of dimension 2048 × 32, i.e., the key value of each point has dimension 32; Q and K will be used to calculate the attention score values of the input point cloud;
step 2-2-3, define a score function s(i, j) = h(x_i) · v(x_j) that calculates a scalar representing the dependency of each point in the input point cloud on the other points, where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2. For each point in Q (represented by a 32-dimensional vector), the key values of all points, including the point itself, are multiplied one by one, i.e. the point is multiplied with the 32-dimensional vector of every point in the matrix K. Since the input point cloud has 2048 points, each point yields 2048 scalars; combining the scalars of all points gives a 2048 × 2048 matrix, the attention score map of the input point cloud;
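Steps 2-2-2 and 2-2-3 reduce to two learned linear projections followed by a matrix product, since a 1 × 1 convolution over a point set is just a shared per-point linear map. A NumPy sketch (random weights and scale factors are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
N, C = 2048, 256
x = rng.normal(size=(N, C))              # per-point features from step 2-2-1

# A 1x1 convolution over a point set is a shared linear map per point,
# so h(x_i) = w_h x_i and v(x_j) = w_v x_j reduce to matrix products.
w_h = 0.05 * rng.normal(size=(32, C))    # query weights, 32 x 256
w_v = 0.05 * rng.normal(size=(32, C))    # key weights,   32 x 256

Q = x @ w_h.T                            # (2048, 32) query matrix
K = x @ w_v.T                            # (2048, 32) key matrix
scores = Q @ K.T                         # (2048, 2048) attention score map
print(scores.shape)
```

Entry (i, j) of `scores` is the dot product of the query of point i with the key of point j, i.e. the scalar s(i, j) of step 2-2-3.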
step 2-2-4, define a function f(x_j) = w_f x_j that maps each point of the input point cloud into a value matrix V representing the input signal at point j (the value matrix will be multiplied with the attention score map obtained in step 2-2-3 to obtain the weighted vectors), where w_f is a weight matrix to be learned, implemented as a 1 × 1 convolution. The points of the value matrix V correspond one-to-one to the key values of the key matrix K, i.e. the input signal at point j corresponds to the key value at point j. V = (f(x_1), f(x_2), f(x_3), ..., f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_N) and has dimension 2048 × 128;
step 2-2-5, define the formula q_{i,j} = exp(s(i, j)) / Σ_{j=1}^{N} exp(s(i, j)) and apply this Softmax operation to the attention score map obtained in step 2-2-3 (i.e. a Softmax over all attention score values of each point, so that the attention scores of each point with respect to all other points sum to 1), where q_{i,j} is the attention score value of point i of the query matrix Q with respect to point j of the key matrix K;
step 2-2-6, denote the output of the attention module by y and map the input x to the output y with a function φ; define y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)), where N is the number of points of the input point cloud and x_N is the feature vector of the N-th point. The i-th output point y_i is computed as y_i = g(Σ_{j=1}^{N} q_{i,j} f(x_j)), where g(x_i) = w_g x_i and w_g is a weight matrix to be learned, implemented as a 1 × 1 convolution. The output matrix y finally obtained has the same dimension as the input feature matrix x, i.e. 2048 × 256.
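Steps 2-2-3 to 2-2-6 can be sketched end to end in NumPy. The max-subtraction inside the Softmax and the random weights are assumptions for numerical illustration, and the sketch omits any residual connection since the text does not specify one:

```python
import numpy as np

def self_attention(x, w_h, w_v, w_f, w_g):
    """Map per-point features x (N, C) to an output y of the same shape."""
    scores = (x @ w_h.T) @ (x @ w_v.T).T         # s(i, j), shape (N, N)
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the Softmax
    q = np.exp(scores)
    q /= q.sum(axis=1, keepdims=True)            # rows of q sum to 1
    V = x @ w_f.T                                # value matrix, (N, 128)
    return (q @ V) @ w_g.T                       # y_i = g(sum_j q_ij f(x_j))

rng = np.random.default_rng(2)
N, C = 2048, 256
x = rng.normal(size=(N, C))
w_h = 0.05 * rng.normal(size=(32, C))    # query projection
w_v = 0.05 * rng.normal(size=(32, C))    # key projection
w_f = 0.05 * rng.normal(size=(128, C))   # value projection, 128 x 256
w_g = 0.05 * rng.normal(size=(C, 128))   # output projection, 256 x 128
y = self_attention(x, w_h, w_v, w_f, w_g)
print(y.shape)
```

The dimensions match the text: V is 2048 × 128, so w_g must map 128 back to 256 for the output y to have the same 2048 × 256 shape as the input x.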
step 2-2-7, apply max pooling to the 2048 × 256 matrix obtained in step 2-2-6, i.e. take the maximum value over all points in each dimension to form a 256-dimensional feature vector; stack this feature vector into a 2048 × 256 feature matrix of the same shape, and concatenate it with the 2048 × 256 matrix from step 2-2-6 to form a 2048 × 512 feature matrix fused with the long-distance dependency information;
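The max-pool, stack, and concatenate of step 2-2-7 is a three-line operation; a sketch with random stand-in features (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(size=(2048, 256))              # output of the attention module

g = y.max(axis=0)                             # max-pool: one 256-dim vector
g_tiled = np.tile(g, (y.shape[0], 1))         # stack back to (2048, 256)
fused = np.concatenate([y, g_tiled], axis=1)  # (2048, 512) fused features
print(fused.shape)
```

Every row of the result carries both that point's own feature and the shared pooled summary, which is what lets later layers mix local and global information.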
step 2-2-8, propagate the 2048 × 512 feature matrix fused with the long-distance dependency information forward through the second-stage shared multi-layer perceptron (shared MLP), mapping each point of the input incomplete point cloud model to a 1024-dimensional vector, so the whole incomplete point cloud is mapped to a 2048 × 1024 matrix; then apply max pooling to the 2048 × 1024 matrix, i.e. take the maximum of each dimension over all points, to obtain a 1024-dimensional global feature vector. Input the global feature vector into the decoder of the topological root tree structure, which finally generates the missing point cloud corresponding to the incomplete point cloud model;
step 2-2-9, compare the generated missing-part point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculate the Loss function, and perform back propagation, finally obtaining the trained long-distance dependency extraction network and topological root tree generator.
The step 3 comprises the following steps:
step 3, synthesize the missing-part point cloud obtained in step 2 and the incomplete point cloud together to obtain the final repaired complete point cloud model.
Examples
The target task of the present invention is shown in Fig. 1a and Fig. 1b: Fig. 1a is an original model to be repaired and Fig. 1b the repaired model. The structure of the self-attention module of the method of the present invention is shown in Fig. 2, and the structure of the whole global feature extractor in Fig. 3. The steps of the present invention are described below by way of example.
Step (1), collecting data of an input point cloud model data set;
step (1.1), a single three-dimensional point cloud model s is input, and 5 viewpoints are preset: (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), (-1, 1, 0); using multiple different viewpoints ensures that the missing parts of the incomplete models have randomness when the training and test data are collected;
step (1.2), randomly select a viewpoint as the center point p and preset a radius r (the radius is determined by the number of points to remove, not by a specific length in the mathematical sense; e.g. if the number of removed points is set to 25% of the original point cloud, then with p as the center the 25% of points nearest to p are removed);
step (1.3), regarding a three-dimensional point cloud model s, removing points within a preset radius r by taking a randomly selected viewpoint p as a center to obtain an incomplete point cloud model; the set of points removed is the missing portion of the point cloud corresponding to the incomplete point cloud model.
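Steps (1.1) to (1.3) can be sketched as follows; the function name, the uniform sample cloud, and the 25% fraction used for the demonstration are illustrative assumptions (the patent leaves the removal fraction configurable):

```python
import numpy as np

def split_by_viewpoint(cloud, viewpoint, frac=0.25):
    """Remove the fraction of points nearest to the viewpoint.

    Returns (incomplete, missing); the radius r is defined implicitly
    by the point count, not as a metric length.
    """
    d = np.linalg.norm(cloud - viewpoint, axis=1)
    k = int(len(cloud) * frac)
    order = np.argsort(d)
    missing = cloud[order[:k]]     # k nearest points: the missing part
    incomplete = cloud[order[k:]]  # remaining points: incomplete model
    return incomplete, missing

rng = np.random.default_rng(4)
cloud = rng.uniform(-1, 1, size=(2048, 3))
viewpoints = np.array([(1, 0, 0), (0, 0, 1), (1, 0, 1),
                       (-1, 0, 0), (-1, 1, 0)], dtype=float)
vp = viewpoints[rng.integers(len(viewpoints))]   # random preset viewpoint
incomplete, missing = split_by_viewpoint(cloud, vp)
print(incomplete.shape, missing.shape)  # (1536, 3) (512, 3)
```

The missing set doubles as the ground truth G used for supervised training in step 2.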
Step (2), combine the self-attention (Self-Attention) mechanism with the multi-layer perceptron MLP to obtain a long-distance dependency extraction network, map the input point cloud into a global feature vector with this network, and generate the missing part of the incomplete point cloud with a decoder of the topological root tree structure;
step (2.1), the input three-dimensional point cloud model data set S = {S_Train, S_Test} is divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i is the i-th three-dimensional point cloud model of the training set and s_{n+j} the j-th three-dimensional point cloud model of the test set; i ranges from 1 to n and j from 1 to m;
step (2.2), for the training set S_Train, acquire the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under a random viewpoint and the corresponding missing-part point cloud models G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and use them as input of the whole network for training, obtaining the trained long-distance dependency extraction network and the decoder with the topological root tree structure, where p_i is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i of the training set S_Train and g_i is the missing-part point cloud model corresponding to s_i;
Steps (2.2.1) to (2.2.9) are identical to steps 2-2-1 to 2-2-9 described above.
Step (2.3), for test set STestAcquiring an incomplete point cloud model P of each three-dimensional point cloud model under a random viewpointTest={pn+1,pn+2,...,pn+j,...,pn+mAnd inputting the data into a trained network to obtain a missing part point cloud corresponding to the incomplete point cloud input model, wherein p isn+jRefers to test set STestTo (1)j three-dimensional point cloud models sn+jA corresponding incomplete point cloud model. The test process mainly comprises the following steps:
step (2.3.1), input the incomplete point cloud models P_Test of the test model set under random viewpoints into the generator network trained together with the long-distance dependency extraction network;
step (2.3.2), output the missing-part point clouds corresponding to the incomplete models of the test model set under the random viewpoints.
And (3) synthesizing the incomplete point cloud and the generated missing part point cloud together to obtain a finally repaired complete point cloud model.
Analysis of results
The experimental environmental parameters of the method of the invention are as follows:
(1) The experimental platform for collecting the model data has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60 GHz, 32 GB memory; the Python programming language is used, with PyCharm 2019 as the programming development environment;
(2) The experimental platform for training and testing the long-distance dependency extraction network based on the Self-Attention mechanism has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60 GHz, 32 GB memory, TITAN RTX GPU with 24 GB of video memory; the Python programming language is used, implemented with the TensorFlow third-party open-source library.
The results of the comparative experiments (shown in Tables 1 and 2) of the method of the present invention against TopNet, Folding, PCN, AtlasNet, and PointNetFCAE are analyzed as follows:
The experiments were performed on a subset of the accepted benchmark data set ShapeNet containing 8 different model categories; the category names Airplane, Lamp, Cabinet, Car, Chair, Couch, Table, and Watercraft are shown in the first column of Table 1, and the division into training and test sets in the second column of Table 1.
The final metric is the average Chamfer Distance (CD) of the repaired complete model. The CD comparisons are shown in Tables 1 and 2 (Table 1 compares the CD of the method of the present invention with the other methods on the 8 model categories of the ShapeNet data set; Table 2 compares the category-average CD over all classes of the ShapeNet data set). All CD values shown in the tables are multiplied by 10^5 after calculation. As can be seen from Tables 1 and 2, all per-category CD values and the category-average CD value of the method of the present invention are lower than those of the other methods. Fig. 5 compares the repair results of the method of the present invention with the other methods; the method of the present invention significantly reduces noise and deformation in the repaired missing part of the incomplete point cloud and significantly improves the repair effect.
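A brute-force Chamfer Distance for small point sets might look as follows (the use of squared nearest-neighbour distances is an assumption; both the squared and unsquared conventions appear in the literature, and the patent does not specify which is used):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Averages, for each point, the squared distance to its nearest
    neighbour in the other set, in both directions.
    """
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # (N, M) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

a = np.array([[0.0, 0, 0], [1, 0, 0]])
b = np.array([[0.0, 0, 0], [1, 0, 0]])
print(chamfer_distance(a, b))  # 0.0 for identical sets
```

The O(N·M) pairwise matrix is fine for 2048-point clouds; larger sets would need a KD-tree nearest-neighbour query instead.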
TABLE 1
TABLE 2
AtlasNet | Folding | PCN | TopNet | PointNetFCAE | The method of the invention |
Category Avg. | 94.4 | 74.6 | 67.1 | 63.9 | 97.6 | 55.8 |
In the self-comparison experiment, the self-attention module that extracts the long-distance dependency information was removed; the CD comparison of the final results is shown in Table 3, which shows that the optimization of extracting long-distance dependency information significantly reduces the final CD value of the repaired complete model.
In addition, the method of the present invention visualizes the learned long-distance dependency information; the visualization results are shown in Fig. 4. Each point of the incomplete 3D point cloud has a corresponding long-distance dependency relationship. In each row of Fig. 4, the first picture marks three representative points, and the other three pictures show the attention score maps corresponding to these points. Because a more targeted method is used to extract the long-distance dependency information, instead of only a fully connected shared multi-layer perceptron, sufficient long-distance dependency information can be learned, thereby significantly reducing the final CD value of the repaired complete model. Table 3 compares the final results of the method of the present invention with the results obtained without the self-attention optimization for long-distance dependency extraction.
TABLE 3
The present invention provides a three-dimensional point cloud repairing method for graphic processing, and there are many methods and approaches for implementing this technical solution. The above description is only a preferred embodiment of the present invention; it should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as within the protection scope of the present invention. All components not specified in the embodiment can be realized by the prior art.
Claims (10)
1. A three-dimensional point cloud repairing method for graphic processing is characterized by comprising the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining the self-attention (Self-Attention) mechanism with the multi-layer perceptron MLP to obtain a long-distance dependency extraction network, mapping the input point cloud into a global feature vector with this network, and generating the missing part of the incomplete point cloud with a decoder of the topological root tree structure;
and 3, synthesizing the incomplete point clouds and the generated missing part point clouds together to obtain a finally repaired complete point cloud model.
2. The method of claim 1, wherein step 1 comprises the steps of:
step 1-1, setting and inputting a single three-dimensional point cloud model s, and presetting 5 viewpoints which are (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0) and (-1, 1, 0);
step 1-2, randomly selecting a viewpoint as a central point p, and presetting a radius r;
step 1-3, regarding a three-dimensional point cloud model s, removing points within a preset radius r by taking a randomly selected viewpoint p as a center to obtain an incomplete point cloud model; the set of points removed is the missing portion of the point cloud corresponding to the incomplete point cloud model.
3. The method of claim 2, wherein step 2 comprises the steps of:
step 2-1, the input three-dimensional point cloud model data set S = {S_Train, S_Test} is divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i is the i-th three-dimensional point cloud model of the training set and s_{n+j} the j-th three-dimensional point cloud model of the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, acquire the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model at a random viewpoint and the corresponding missing-part point cloud models G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and use them as input of the whole network for training, obtaining the trained long-distance dependency extraction network and the decoder with the topological root tree structure, where p_i is the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i of the training set S_Train and g_i is the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, collect the incomplete point cloud model P_Test = {p_{n+1}, p_{n+2}, ..., p_{n+j}, ..., p_{n+m}} of each three-dimensional point cloud model under a random viewpoint and input it into the trained network to obtain the missing-part point cloud corresponding to the incomplete input model, where p_{n+j} is the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_{n+j} of the test set S_Test.
4. A method according to claim 3, characterized in that step 2-2 comprises the steps of:
step 2-2-1, take the incomplete point clouds P_Train in the training set S_Train as input and perform supervised training with the corresponding missing-part point clouds G_Train; after forward propagation through the first-stage shared multi-layer perceptron, each point in the incomplete point cloud is mapped to a 256-dimensional feature vector, wherein the first-stage shared MLP consists of two shared fully connected layers: the first layer maps each point to a 128-dimensional feature vector, the second layer maps each point to a 256-dimensional feature vector, and the whole input point cloud is mapped to a matrix of dimension 2048 × 256;
step 2-2-2, denote the 2048 × 256 matrix obtained in step 2-2-1 by x = (x_1, x_2, x_3, ..., x_N); it is the input of the self-attention module, where x_i is the feature vector of the i-th point of the input point cloud; the attention scores of the input point cloud are calculated by mapping x onto two feature spaces Q and K through two 1 × 1 convolutional networks, given by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_N)) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_N) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_N)) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_N); w_h and w_v are the weight matrices to be learned for h(x) and v(x) respectively, implemented as 1 × 1 convolutions, and both have dimension 32 × 256; Q is the query matrix with dimension 2048 × 32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e. the query value of each point has length 32; K is the key matrix of the input point cloud, also of dimension 2048 × 32, i.e. the key value of each point is 32-dimensional; Q and K are used to calculate the attention score values of the input point cloud;
step 2-2-3, define a score function s(i, j) = h(x_i) · v(x_j) that calculates a scalar representing the dependency of each point in the input point cloud on the other points, where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2; for each point in Q, the key values of all points, including the point itself, are multiplied one by one, i.e. the point is multiplied with the 32-dimensional vector of every point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points gives a 2048 × 2048 matrix, the attention score map of the input point cloud;
step 2-2-4, map each point in the input point cloud into a value matrix V to calculate the input signal at point j;
Step 2-2-5, performing Softmax operation on the attention score map obtained in the step 2-2-3;
step 2-2-6, setting the output of the attention module as y, and mapping the input x to the output y by using a function phi;
step 2-2-7, performing maximum pooling;
step 2-2-8, finally generating a missing point cloud corresponding to the incomplete point cloud model;
and 2-2-9, obtaining the trained long-distance dependency relationship extraction network and the topology root tree generator.
5. The method of claim 4, wherein step 2-2-4 comprises: defining a function f(x_j) = w_f x_j that maps each point of the input point cloud into a value matrix V to calculate the input signal at point j, where w_f is a weight matrix to be learned, implemented as a 1 × 1 convolution; the points of the value matrix V correspond one-to-one to the key values of the key matrix K, i.e. the input signal at point j corresponds one-to-one to the key value at point j; V = (f(x_1), f(x_2), f(x_3), ..., f(x_N)) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_N), with dimension 2048 × 128.
7. The method of claim 6, wherein step 2-2-6 comprises: defining y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)), where N is the number of points of the input point cloud and x_N the feature vector of the N-th point; the i-th output point y_i is computed as y_i = g(Σ_{j=1}^{N} q_{i,j} f(x_j)), where g(x_i) = w_g x_i and w_g is a weight matrix to be learned, implemented as a 1 × 1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e. 2048 × 256.
8. The method of claim 7, wherein step 2-2-7 comprises: applying max pooling to the 2048 × 256 matrix obtained in step 2-2-6, i.e. taking the maximum value over all points in each dimension to form a 256-dimensional feature vector; stacking this feature vector into a 2048 × 256 feature matrix of the same shape, and concatenating it with the 2048 × 256 matrix from step 2-2-6 to form the 2048 × 512 feature matrix fused with the long-distance dependency information.
9. The method of claim 8, wherein step 2-2-8 comprises: propagating the 2048 × 512 feature matrix fused with the long-distance dependency information forward through the second-stage shared multi-layer perceptron, mapping each point of the input incomplete point cloud model to a 1024-dimensional vector, so the whole incomplete point cloud is mapped to a 2048 × 1024 matrix; then applying max pooling to the 2048 × 1024 matrix, i.e. taking the maximum of each dimension over all points, to obtain a 1024-dimensional global feature vector; inputting the global feature vector into the decoder of the topological root tree structure, which finally generates the missing point cloud corresponding to the incomplete point cloud model.
10. The method of claim 9, wherein step 2-2-9 comprises: comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, performing back propagation, and finally obtaining the trained long-distance dependency extraction network and topological root tree generator.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116229.1A CN112785526B (en) | 2021-01-28 | 2021-01-28 | Three-dimensional point cloud restoration method for graphic processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110116229.1A CN112785526B (en) | 2021-01-28 | 2021-01-28 | Three-dimensional point cloud restoration method for graphic processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112785526A true CN112785526A (en) | 2021-05-11 |
CN112785526B CN112785526B (en) | 2023-12-05 |
Family
ID=75759307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110116229.1A Active CN112785526B (en) | 2021-01-28 | 2021-01-28 | Three-dimensional point cloud restoration method for graphic processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112785526B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298952A (en) * | 2021-06-11 | 2021-08-24 | 哈尔滨工程大学 | Incomplete point cloud classification network based on data expansion and similarity measurement |
CN113379646A (en) * | 2021-07-07 | 2021-09-10 | 厦门大学 | Algorithm for performing dense point cloud completion by using generated countermeasure network |
CN113486988A (en) * | 2021-08-04 | 2021-10-08 | 广东工业大学 | Point cloud completion device and method based on adaptive self-attention transformation network |
CN114663619A (en) * | 2022-02-24 | 2022-06-24 | 清华大学 | Three-dimensional point cloud object prediction method and device based on self-attention mechanism |
CN116051633A (en) * | 2022-12-15 | 2023-05-02 | 清华大学 | 3D point cloud target detection method and device based on weighted relation perception |
CN117671131A (en) * | 2023-10-20 | 2024-03-08 | 南京邮电大学 | Industrial part three-dimensional point cloud repairing method and device based on deep learning |
Application Events

- 2021-01-28: Application filed in China as CN202110116229.1A; published as CN112785526A and granted as CN112785526B (status: Active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190147245A1 (en) * | 2017-11-14 | 2019-05-16 | Nuro, Inc. | Three-dimensional object detection for autonomous robotic systems using image proposals |
CN109389671A (en) * | 2018-09-25 | 2019-02-26 | 南京大学 | A kind of single image three-dimensional rebuilding method based on multistage neural network |
EP3767521A1 (en) * | 2019-07-15 | 2021-01-20 | Promaton Holding B.V. | Object detection and instance segmentation of 3d point clouds based on deep learning |
CN112241997A (en) * | 2020-09-14 | 2021-01-19 | 西北大学 | Three-dimensional model establishing and repairing method and system based on multi-scale point cloud up-sampling |
CN112070054A (en) * | 2020-09-17 | 2020-12-11 | 福州大学 | Vehicle-mounted laser point cloud marking classification method based on graph structure and attention mechanism |
CN112257637A (en) * | 2020-10-30 | 2021-01-22 | 福州大学 | Vehicle-mounted laser point cloud multi-target identification method integrating point cloud and multiple views |
Non-Patent Citations (4)
Title |
---|
FENGGEN YU et al.: "PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation", IEEE * |
卿都 et al.: "Research Progress on Neural-Network-Based 3D Point Cloud Generation Models", Robot Technology and Application, no. 06 * |
牛辰庚 et al.: "3D Object Recognition and Model Segmentation Method Based on Point Cloud Data", Journal of Graphics, no. 02 * |
贝子勒 et al.: "A Deep-Learning-Based Point Cloud Repair Model", Wireless Communication Technology, no. 02 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298952A (en) * | 2021-06-11 | 2021-08-24 | 哈尔滨工程大学 | Incomplete point cloud classification network based on data expansion and similarity measurement |
CN113298952B (en) * | 2021-06-11 | 2022-07-15 | 哈尔滨工程大学 | Incomplete point cloud classification method based on data expansion and similarity measurement |
CN113379646A (en) * | 2021-07-07 | 2021-09-10 | 厦门大学 | Algorithm for performing dense point cloud completion by using generated countermeasure network |
CN113379646B (en) * | 2021-07-07 | 2022-06-21 | 厦门大学 | Algorithm for performing dense point cloud completion by using generated countermeasure network |
CN113486988A (en) * | 2021-08-04 | 2021-10-08 | 广东工业大学 | Point cloud completion device and method based on adaptive self-attention transformation network |
CN113486988B (en) * | 2021-08-04 | 2022-02-15 | 广东工业大学 | Point cloud completion device and method based on adaptive self-attention transformation network |
CN114663619A (en) * | 2022-02-24 | 2022-06-24 | 清华大学 | Three-dimensional point cloud object prediction method and device based on self-attention mechanism |
CN116051633A (en) * | 2022-12-15 | 2023-05-02 | 清华大学 | 3D point cloud target detection method and device based on weighted relation perception |
CN116051633B (en) * | 2022-12-15 | 2024-02-13 | 清华大学 | 3D point cloud target detection method and device based on weighted relation perception |
CN117671131A (en) * | 2023-10-20 | 2024-03-08 | 南京邮电大学 | Industrial part three-dimensional point cloud repairing method and device based on deep learning |
Also Published As
Publication number | Publication date |
---|---|
CN112785526B (en) | 2023-12-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lee et al. | Context-aware synthesis and placement of object instances | |
CN112785526B (en) | Three-dimensional point cloud repairing method for graphic processing | |
CN109377448B (en) | Face image restoration method based on generation countermeasure network | |
CN111063021B (en) | Method and device for establishing three-dimensional reconstruction model of space moving target | |
Chen et al. | The face image super-resolution algorithm based on combined representation learning | |
US11328172B2 (en) | Method for fine-grained sketch-based scene image retrieval | |
CN110032925B (en) | Gesture image segmentation and recognition method based on improved capsule network and algorithm | |
Lee et al. | Deep architecture with cross guidance between single image and sparse lidar data for depth completion | |
CN112991350B (en) | RGB-T image semantic segmentation method based on modal difference reduction | |
CN111553869B (en) | Method for complementing generated confrontation network image under space-based view angle | |
CN111612008A (en) | Image segmentation method based on convolution network | |
CN115690522B (en) | Target detection method based on multi-pooling fusion channel attention and application thereof | |
CN111696196B (en) | Three-dimensional face model reconstruction method and device | |
CN109740539B (en) | 3D object identification method based on ultralimit learning machine and fusion convolution network | |
CN111768415A (en) | Image instance segmentation method without quantization pooling | |
CN112767478B (en) | Appearance guidance-based six-degree-of-freedom pose estimation method | |
Goncalves et al. | Deepdive: An end-to-end dehazing method using deep learning | |
CN112801945A (en) | Depth Gaussian mixture model skull registration method based on dual attention mechanism feature extraction | |
CN114782417A (en) | Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation | |
CN112668662B (en) | Outdoor mountain forest environment target detection method based on improved YOLOv3 network | |
Yin et al. | [Retracted] Virtual Reconstruction Method of Regional 3D Image Based on Visual Transmission Effect | |
CN112149528A (en) | Panorama target detection method, system, medium and equipment | |
Dinh et al. | Feature engineering and deep learning for stereo matching under adverse driving conditions | |
Ogura et al. | Improving the visibility of nighttime images for pedestrian recognition using in-vehicle camera | |
CN112767539B (en) | Image three-dimensional reconstruction method and system based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||