CN112785526A - Three-dimensional point cloud repairing method for graphic processing - Google Patents

Three-dimensional point cloud repairing method for graphic processing

Info

Publication number: CN112785526A
Application number: CN202110116229.1A
Granted publication: CN112785526B
Authority: CN (China)
Original language: Chinese (zh)
Priority/filing date: 2021-01-28
Inventors: 朱佩浪, 张岩, 刘琨
Applicant and assignee: Nanjing University
Legal status: Granted; Active


Classifications

    • G06T 5/00 Image enhancement or restoration
    • G06T 5/77 Retouching; Inpainting; Scratch removal
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • Architecture (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a three-dimensional point cloud repairing method for graphic processing, comprising the following steps: step 1, collecting data from an input point cloud model dataset; step 2, combining a self-attention (Self-Attention) mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, using this network to map the input point cloud to a global feature vector, and generating the missing part of the incomplete point cloud with a decoder of a topological rooted tree structure; and step 3, merging the incomplete point cloud and the generated missing-part point cloud to obtain the final repaired complete point cloud model.

Description

Three-dimensional point cloud repairing method for graphic processing
Technical Field
The invention belongs to the field of computer three-dimensional model processing and computer graphics, and particularly relates to a three-dimensional point cloud repairing method for graphic processing.
Background
In recent years, large amounts of three-dimensional data have been acquired directly from the real world using LiDAR scanners, depth sensors such as the Kinect, stereo cameras, and similar devices.
However, the 3D data obtained with these instruments is often incomplete, mainly because the scanning view angle of the scanner is limited and the scan is affected by occlusion from non-target objects and by light refraction and reflection. As a result, the geometric and semantic information of the target object is frequently lost. Studying how to repair incomplete 3D models so as to enable more subsequent applications is therefore a necessary research topic. In addition, 3D models come in a number of representations, such as point clouds, voxels, patches, and distance fields. Representing and processing 3D data as point clouds has received increasing attention because a point cloud has a lower storage cost than other representations (e.g., 3D voxel grids) while still enabling a more refined representation of the 3D model. The appearance of Document 1 (C. R. Qi, H. Su, K. Mo, and L. J. Guibas. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Conference on Computer Vision and Pattern Recognition 2017) made it possible to process unordered point sets directly, which greatly promoted the development of deep learning architectures for point clouds and of other related research, such as 3D scene reconstruction, 3D model segmentation, and 3D model repair.
Document 2 (W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert. PCN: Point Completion Network. International Conference on 3D Vision 2018), Document 3 (Z. Huang, Y. Yu, J. Xu, F. Ni, and X. Le. PF-Net: Point Fractal Network for 3D Point Cloud Completion. Conference on Computer Vision and Pattern Recognition 2020), and Document 4 (L. P. Tchapmi, V. Kosaraju, S. H. Rezatofighi, I. Reid, and S. Savarese. TopNet: Structural Point Cloud Decoder. Conference on Computer Vision and Pattern Recognition 2019) all use shared multi-layer perceptrons of different dimensions to extract a final global feature vector from the input incomplete point cloud. Meanwhile, since there is currently no generally accepted way to define the local neighborhood of a point cloud, it is difficult to extract features through convolution operations as on a 2D image. These methods therefore rely heavily on multiple fully connected layers with similar architectures to capture the features of the input model and the dependencies between different points in the input point cloud. Furthermore, PF-Net points out that the low and middle layers of an MLP mostly extract local information, and that this local information cannot be exploited to form global features simply by passing it to higher layers through a shared fully connected layer. This means that such methods cannot efficiently extract enough long-distance dependency information and embed it into the final global feature vector. Another problem is that even when limited long-distance dependency information can be captured, several fully connected layers are usually needed to learn it. This is detrimental to the efficient capture of long-range dependencies, for several reasons:
(1) more targeted models may be needed to represent these long-range dependencies;
(2) it may be difficult for the optimization algorithm to find parameter values that coordinate multiple layers so as to capture these long-range dependencies;
(3) these parameter settings may be statistically fragile when applied to new models not seen by the network.
In recent years, the attention mechanism has been combined with various methods (such as RNN methods and GAN methods) to capture long-distance dependency information. It first arose in the field of computer vision and was then developed considerably in natural language processing (NLP). Document 5 (V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent Models of Visual Attention. Conference on Neural Information Processing Systems 2014) combines this mechanism with RNN methods for image classification and obtains excellent performance. Document 6 (D. Bahdanau, K. Cho, and Y. Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. International Conference on Learning Representations 2015) applies the attention mechanism to NLP, using it to perform translation and alignment simultaneously in a machine translation task. Self-attention allows the input elements of a set to interact with each other to compute weights or responses, and to find out which elements a given element should pay more attention to. Document 7 (A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention Is All You Need. Conference on Neural Information Processing Systems 2017) shows that applying the self-attention mechanism to machine translation achieved the best performance at the time. Document 8 (H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena. Self-Attention Generative Adversarial Networks. International Conference on Machine Learning 2019) integrates the self-attention mechanism into the GAN framework and achieved the best class-conditional image generation performance on ImageNet at the time.
Disclosure of Invention
Purpose of the invention: the invention aims to solve the technical problem of repairing incomplete 3D models, addressing the deficiencies of the prior art by providing a three-dimensional point cloud repairing method for graphic processing; in particular, it discloses a three-dimensional point cloud repairing method based on a self-attention mechanism for long-distance dependency extraction, comprising the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining a self-attention (Self-Attention) mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, using this network to map the input point cloud to a global feature vector, and generating the missing part of the incomplete point cloud with a decoder of a topological rooted tree structure;
and step 3, merging the incomplete point cloud and the generated missing-part point cloud to obtain the final repaired complete point cloud model.
The step 1 comprises the following steps:
step 1-1, setting and inputting a single three-dimensional point cloud model s and presetting 5 viewpoints, namely (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), and (-1, 1, 0); using several different viewpoints ensures that the missing part of the incomplete model is random when training and test data are collected;
step 1-2, randomly selecting a viewpoint as the center point p and presetting a radius r (the radius is determined by the number of points to be removed and is not a specific length in the mathematical sense; for example, if the number of removed points is set to 25% of the original point cloud, then the 25% of points nearest to p are removed, with p as the center);
step 1-3, for the three-dimensional point cloud model s, removing the points within the preset radius r around the randomly selected viewpoint p to obtain an incomplete point cloud model; the set of removed points is the missing-part point cloud corresponding to the incomplete point cloud model (this collection procedure is sketched below).
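As a concrete illustration of steps 1-1 to 1-3, the following sketch generates an incomplete point cloud and its ground-truth missing part under a random preset viewpoint. It is a minimal NumPy rendering of the procedure described above, not the authors' code; the function name and the (N, 3) array layout are assumptions.

```python
import numpy as np

# The 5 preset viewpoints of step 1-1.
VIEWPOINTS = np.array([(1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), (-1, 1, 0)],
                      dtype=np.float32)

def make_incomplete(points: np.ndarray, missing_ratio: float = 0.25):
    """points: (N, 3) point cloud. Returns (incomplete, missing) point sets."""
    p = VIEWPOINTS[np.random.randint(len(VIEWPOINTS))]  # random center point p
    dist = np.linalg.norm(points - p, axis=1)           # distance of every point to p
    k = int(missing_ratio * len(points))                # "radius" fixed by point count
    order = np.argsort(dist)
    missing = points[order[:k]]       # the k points nearest to p: the missing part
    incomplete = points[order[k:]]    # the remaining points: the incomplete model
    return incomplete, missing
```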
The step 2 comprises the following steps:
step 2-1, dividing the input three-dimensional point cloud model data set S = {S_Train, S_Test} into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_{n+j} denotes the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, collecting the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under a random viewpoint together with the corresponding missing-part point cloud model G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and using them as the input of the whole network for training to obtain the trained long-distance dependency extraction network and the decoder with a topological rooted tree structure, where p_i denotes the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in the training set S_Train, and g_i denotes the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, collecting the incomplete point cloud model P_Test = {p_{n+1}, p_{n+2}, ..., p_{n+j}, ..., p_{n+m}} of each three-dimensional point cloud model under a random viewpoint and inputting it into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud, where p_{n+j} denotes the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_{n+j} in the test set S_Test.
Step 2-2 comprises the following steps:
step 2-2-1, taking the incomplete point clouds P_Train of the training set S_Train as input and using the corresponding missing-part point clouds G_Train for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point in the incomplete point cloud is mapped to a 256-dimensional feature vector; the first-stage shared MLP consists of two shared fully connected layers, of which the first maps each point to a 128-dimensional feature vector and the second maps each point to a 256-dimensional feature vector, so the whole input point cloud is mapped to a matrix of dimension 2048 × 256 (a sketch of this stage follows);
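The first-stage shared MLP amounts to two 1 × 1 convolutions applied over the point axis. The sketch below re-expresses it in PyTorch purely for illustration (the experiments in this document use TensorFlow), and the ReLU activations are an assumption, since the text does not name the nonlinearity.

```python
import torch
import torch.nn as nn

# Shared fully connected layers over points = 1x1 convolutions: 3 -> 128 -> 256.
stage1_mlp = nn.Sequential(
    nn.Conv1d(3, 128, kernel_size=1),    # first layer: each point -> 128-dim vector
    nn.ReLU(inplace=True),               # activation assumed, not stated in the text
    nn.Conv1d(128, 256, kernel_size=1),  # second layer: each point -> 256-dim vector
    nn.ReLU(inplace=True),
)

cloud = torch.randn(1, 3, 2048)          # one incomplete cloud of 2048 xyz points
x = stage1_mlp(cloud)                    # (1, 256, 2048): the 2048 x 256 matrix x
```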
step 2-2-2, denoting the 2048 × 256 matrix obtained in step 2-2-1 by x = (x_1, x_2, x_3, ..., x_i, ..., x_N), which is the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; the attention scores of the input point cloud are calculated by mapping x into two feature spaces Q and K through two 1 × 1 convolution networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_i), ...) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_i, ...) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_i), ...) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_i, ...); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolutions, and both have dimension 32 × 256; Q is a query matrix of dimension 2048 × 32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key matrix corresponding to the input point cloud, also of dimension 2048 × 32, i.e., the key of each point is 32-dimensional; Q and K are used to calculate the attention score values of the input point cloud;
step 2-2-3, defining a function q(x_i, x_j) to calculate a scalar representing the dependency of each point in the input point cloud on the other points; based on the mappings of step 2-2-2, the function q(x_i, x_j) is defined as

q_{i,j} = q(x_i, x_j) = h(x_i)^T v(x_j),

where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2; each point in Q (represented by a 32-dimensional vector) is multiplied with the keys of all points including itself, i.e., with the 32-dimensional vector of each point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points finally gives a matrix of dimension 2048 × 2048, which is the attention score map corresponding to the input point cloud;
step 2-2-4, mapping each point in the input point cloud to a value matrix V to calculate the input signal at point j;
step 2-2-5, performing a Softmax operation on the attention score map obtained in step 2-2-3;
step 2-2-6, setting the output of the attention module to y and mapping the input x to the output y with a function φ;
step 2-2-7, performing maximum pooling;
step 2-2-8, finally generating the missing-part point cloud corresponding to the incomplete point cloud model;
and step 2-2-9, obtaining the trained long-distance dependency extraction network and the topological rooted tree generator.
Step 2-2-4 comprises: defining a function f(x_j) = w_f x_j that maps each point in the input point cloud to a value matrix V to calculate the input signal at point j, where w_f is a weight matrix to be learned, realized by a 1 × 1 convolution; the points in the value matrix V correspond one-to-one to the keys in the key matrix K, i.e., the input signal at point j corresponds one-to-one to the key at point j; V = (f(x_1), f(x_2), f(x_3), ..., f(x_j), ...) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_j, ...), with dimension 2048 × 128.
The steps 2-2-5 comprise: defining the formula

α_{i,j} = exp(q_{i,j}) / Σ_{k=1}^{N} exp(q_{i,k}),

and performing the Softmax operation on the attention score map obtained in step 2-2-3, where q_{i,j} represents the attention score of point i in the query matrix Q with respect to point j in the key matrix K.
The steps 2-2-6 comprise: defining y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)), where N is the number of points in the input point cloud and x_N is the feature vector corresponding to the N-th point of the input point cloud; y_i, the i-th point of the output, is calculated with the formula

y_i = φ(x_i) = g(Σ_{j=1}^{N} α_{i,j} f(x_j)),

where g(x_i) = w_g x_i and w_g is a weight matrix to be learned, realized by a 1 × 1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of y is 2048 × 256.
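Steps 2-2-2 through 2-2-6 together form one self-attention block. The PyTorch sketch below follows the dimensions given in the text (32-dimensional queries and keys, 128-dimensional values, 256-dimensional output); the residual-free output y_i = g(Σ_j α_{i,j} f(x_j)) is the reading reconstructed above and may differ in detail from the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Sketch of the attention module; all projections are 1x1 convolutions."""
    def __init__(self, channels=256, qk_dim=32, v_dim=128):
        super().__init__()
        self.h = nn.Conv1d(channels, qk_dim, 1)  # w_h: query projection
        self.v = nn.Conv1d(channels, qk_dim, 1)  # w_v: key projection
        self.f = nn.Conv1d(channels, v_dim, 1)   # w_f: value projection
        self.g = nn.Conv1d(v_dim, channels, 1)   # w_g: maps weighted values to 256-dim

    def forward(self, x):                        # x: (B, 256, N), here N = 2048
        q = self.h(x)                            # (B, 32, N)  query matrix Q
        k = self.v(x)                            # (B, 32, N)  key matrix K
        val = self.f(x)                          # (B, 128, N) value matrix V
        scores = torch.bmm(q.transpose(1, 2), k)     # (B, N, N) attention score map
        alpha = F.softmax(scores, dim=-1)            # step 2-2-5: rows sum to 1
        out = torch.bmm(val, alpha.transpose(1, 2))  # sum_j alpha_{i,j} f(x_j)
        return self.g(out)                           # (B, 256, N) output matrix y
```

With the matrix x from the previous sketch, y = SelfAttention()(x) gives the 2048 × 256 output matrix used in step 2-2-7.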
The steps 2-2-7 comprise: performing maximum pooling on the 2048 × 256 matrix obtained in step 2-2-6, i.e., taking the maximum over all points in each dimension to form a 256-dimensional feature vector; this feature vector is stacked to obtain a 2048 × 256 feature matrix of the same shape, which is concatenated with the 2048 × 256 matrix obtained in step 2-2-6 to form a 2048 × 512 feature matrix fused with the long-distance dependency information.
The steps 2-2-8 comprise: propagating the 2048 × 512 feature matrix fused with the long-distance dependency information forward through the second-stage shared multi-layer perceptron, which maps each point of the input incomplete point cloud model to a 1024-dimensional vector, so that the whole incomplete point cloud is mapped to a 2048 × 1024 matrix; maximum pooling is then performed on the 2048 × 1024 matrix, i.e., the maximum over all points is taken in each dimension, to obtain a 1024-dimensional global feature vector; the global feature vector is input into the decoder with a topological rooted tree structure, which finally generates the missing-part point cloud corresponding to the incomplete point cloud model.
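Steps 2-2-7 and 2-2-8 then compress the attention output into the 1024-dimensional global feature vector. In the sketch below, the single 512 → 1024 layer standing in for the second-stage shared MLP is an assumption: the text only fixes the 512-dimensional input and the 1024-dimensional per-point output.

```python
import torch
import torch.nn as nn

second_stage_mlp = nn.Conv1d(512, 1024, kernel_size=1)  # assumed single-layer MLP

def global_feature(y: torch.Tensor) -> torch.Tensor:
    """y: (B, 256, N) output of the self-attention module."""
    pooled = y.max(dim=2, keepdim=True).values   # (B, 256, 1): max over points
    stacked = pooled.expand(-1, -1, y.size(2))   # (B, 256, N): stacked copies
    fused = torch.cat([y, stacked], dim=1)       # (B, 512, N): fused feature matrix
    per_point = second_stage_mlp(fused)          # (B, 1024, N)
    return per_point.max(dim=2).values           # (B, 1024): global feature vector
```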
The steps 2-2-9 comprise: comparing the generated missing-part point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and performing back-propagation, finally obtaining the trained long-distance dependency extraction network and the topological rooted tree generator. Step 3 comprises: merging the missing-part point cloud obtained in step 2 with the incomplete point cloud to obtain the final repaired complete point cloud model.
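The text does not spell out the loss used in step 2-2-9, but the evaluation metric reported in the results is the Chamfer Distance (CD), which is also the usual completion loss; a minimal sketch, with the squared-distance variant assumed:

```python
import torch

def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """a: (B, N, 3) generated points; b: (B, M, 3) ground-truth missing points."""
    d = torch.cdist(a, b) ** 2                 # (B, N, M) squared pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()
```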
the method of the present invention addresses the problem of three-dimensional model repair. Sensors can be used to acquire a large amount of three-dimensional data quickly, but it is often difficult to acquire complete three-dimensional data. Repairing and conjecturing the complete model based on the partial incomplete model are also widely applied to the fields of computer vision, robots, virtual reality and the like, such as mixed model analysis, target detection and tracking, 3D reconstruction, style migration, robot roaming and grabbing and the like, and the work is made to be very meaningful.
Beneficial effects: the method introduces a self-attention mechanism into the problem of three-dimensional point cloud repair instead of relying only on shared fully connected layers for feature extraction, which helps model the long-distance dependencies among the points of the input point cloud. As the visualization of the self-attention maps in FIG. 4 and the comparison results in FIG. 5 show, with the self-attention mechanism the points of the missing-part point cloud generated by the network model of the invention can be finely coordinated with other distant points, and the feature extractor can build global features from information at distant points rather than only local positions, so the prediction results show less noise and deformation and the 3D model repair effect is improved. The whole method system is efficient and practical. Meanwhile, as can be seen from Tables 1 and 2, compared with other methods for repairing three-dimensional point cloud models, the method of the invention significantly reduces the CD (Chamfer Distance) value and significantly improves repair performance.
Drawings
The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1a is an incomplete point cloud model before repair.
FIG. 1b is the repaired point cloud model.
FIG. 2 is a block diagram of a self-attention module of the method of the present invention.
FIG. 3 is a block diagram of a feature extraction module of the method of the present invention.
FIG. 4 is a visualization of the attention scores corresponding to the input point cloud model in the method of the present invention.
FIG. 5 is a comparison of the repair effect of the method of the present invention with other methods.
FIG. 6 is a flow chart of the present invention.
Detailed Description
As shown in FIG. 6, the invention discloses a three-dimensional point cloud repairing method that extracts long-distance dependencies based on a self-attention mechanism. A viewpoint is randomly selected from several preset viewpoints as the center point, and all points within a preset radius are removed to obtain the incomplete model under that viewpoint. The incomplete models of the model training set and their corresponding missing parts are input into the network of the method for training, and the incomplete models of the model test set are input into the trained network to obtain the missing parts corresponding to them; the incomplete model and its missing part are then merged to obtain the final repaired model.
For a given set S = {S_Train, S_Test} of 3D models of a certain class, divided into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i denotes the i-th model in the training set and s_{n+j} denotes the j-th model in the test set, the invention repairs the models of the test set S_Test through the following steps. The target task is shown in FIG. 1a and FIG. 1b, and the flow is shown in FIG. 2, FIG. 3, and FIG. 6.
the method specifically comprises the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining a self-attention (Self-Attention) mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, using this network to map the input point cloud to a global feature vector, and generating the missing part of the incomplete point cloud with a decoder of a topological rooted tree structure;
and step 3, merging the incomplete point cloud and the generated missing-part point cloud to obtain the final repaired complete point cloud model.
The step 1 comprises the following steps:
step 1-1, setting and inputting a single three-dimensional point cloud model s and presetting 5 viewpoints, namely (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), and (-1, 1, 0); using several different viewpoints ensures that the missing part of the incomplete model is random when training and test data are collected;
step 1-2, randomly selecting a viewpoint as the center point p and presetting a radius r (the radius is determined by the number of points to be removed and is not a specific length in the mathematical sense; for example, if the number of removed points is set to 25% of the original point cloud, then the 25% of points nearest to p are removed, with p as the center);
step 1-3, for the three-dimensional point cloud model s, removing the points within the preset radius r around the randomly selected viewpoint p to obtain an incomplete point cloud model; the set of removed points is the missing-part point cloud corresponding to the incomplete point cloud model.
The step 2 comprises the following steps:
step 2-1, dividing the input three-dimensional point cloud model data set S = {S_Train, S_Test} into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_{n+j} denotes the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, collecting the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under a random viewpoint together with the corresponding missing-part point cloud model G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and using them as the input of the whole network for training to obtain the trained long-distance dependency extraction network and the decoder with a topological rooted tree structure, where p_i denotes the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in the training set S_Train, and g_i denotes the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, collecting the incomplete point cloud model P_Test = {p_{n+1}, p_{n+2}, ..., p_{n+j}, ..., p_{n+m}} of each three-dimensional point cloud model under a random viewpoint and inputting it into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud, where p_{n+j} denotes the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_{n+j} in the test set S_Test.
Step 2-2 comprises the following steps:
step 2-2-1, taking the incomplete point clouds P_Train of the training set S_Train as input and using the corresponding missing-part point clouds G_Train for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point in the incomplete point cloud is mapped to a 256-dimensional feature vector; the first-stage shared MLP consists of two shared fully connected layers, of which the first maps each point to a 128-dimensional feature vector and the second maps each point to a 256-dimensional feature vector, so the whole input point cloud is mapped to a matrix of dimension 2048 × 256;
step 2-2-2, denoting the 2048 × 256 matrix obtained in step 2-2-1 by x = (x_1, x_2, x_3, ..., x_i, ..., x_N), which is the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; the attention scores of the input point cloud are calculated by mapping x into two feature spaces Q and K through two 1 × 1 convolution networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_i), ...) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_i, ...) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_i), ...) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_i, ...); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolutions, and both have dimension 32 × 256; Q is a query matrix of dimension 2048 × 32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key matrix corresponding to the input point cloud, also of dimension 2048 × 32, i.e., the key of each point is 32-dimensional; Q and K are used to calculate the attention score values of the input point cloud;
step 2-2-3, defining a function q(x_i, x_j) to calculate a scalar representing the dependency of each point in the input point cloud on the other points; based on the mappings of step 2-2-2, the function q(x_i, x_j) is defined as

q_{i,j} = q(x_i, x_j) = h(x_i)^T v(x_j),

where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2; each point in Q (represented by a 32-dimensional vector) is multiplied with the keys of all points including itself, i.e., with the 32-dimensional vector of each point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points finally gives a matrix of dimension 2048 × 2048, which is the attention score map corresponding to the input point cloud;
step 2-2-4, defining a function f(x_j) = w_f x_j that maps each point in the input point cloud to a value matrix V to calculate the input signal at point j (the value matrix will be multiplied with the attention score map obtained in step 2-2-3 to obtain weighted vectors), where w_f is a weight matrix to be learned, realized by a 1 × 1 convolution; the points in the value matrix V correspond one-to-one to the keys in the key matrix K, i.e., the input signal at point j corresponds to the key at point j; V = (f(x_1), f(x_2), f(x_3), ..., f(x_j), ...) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_j, ...), with dimension 2048 × 128;
step 2-2-5, defining the formula

α_{i,j} = exp(q_{i,j}) / Σ_{k=1}^{N} exp(q_{i,k})

and performing the Softmax operation on the attention score map obtained in step 2-2-3 (i.e., the Softmax operation is applied to all attention score values of each point, so that the attention scores of each point with respect to all other points sum to 1), where q_{i,j} represents the attention score of point i in the query matrix Q with respect to point j in the key matrix K;
step 2-2-6, setting the output of the attention module to y and mapping the input x to the output y with a function φ; y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)) is defined, where N is the number of points in the input point cloud and x_N is the feature vector corresponding to the N-th point of the input point cloud; y_i, the i-th point of the output, is calculated with the formula

y_i = φ(x_i) = g(Σ_{j=1}^{N} α_{i,j} f(x_j)),

where g(x_i) = w_g x_i and w_g is a weight matrix to be learned, realized by a 1 × 1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of y is 2048 × 256.
Step 2-2-7, performing maximum pooling on the 2048 × 256 matrix obtained in step 2-2-6, i.e., taking the maximum over all points in each dimension to form a 256-dimensional feature vector; this feature vector is stacked to obtain a 2048 × 256 feature matrix of the same shape, which is concatenated with the 2048 × 256 matrix obtained in step 2-2-6 to form a 2048 × 512 feature matrix fused with the long-distance dependency information;
step 2-2-8, propagating the 2048 × 512 feature matrix fused with the long-distance dependency information forward through the second-stage shared multi-layer perceptron (shared MLP), mapping each point of the input incomplete point cloud model to a 1024-dimensional vector so that the whole incomplete point cloud is mapped to a 2048 × 1024 matrix, and then performing maximum pooling on the 2048 × 1024 matrix, i.e., taking the maximum over all points in each dimension, to obtain a 1024-dimensional global feature vector; the global feature vector is input into the decoder with a topological rooted tree structure, which finally generates the missing-part point cloud corresponding to the incomplete point cloud model;
and step 2-2-9, comparing the generated missing-part point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and performing back-propagation, finally obtaining the trained long-distance dependency extraction network and the topological rooted tree generator.
The step 3 comprises the following steps:
Merging the missing-part point cloud obtained in step 2 with the incomplete point cloud to obtain the final repaired complete point cloud model, as sketched below.
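The synthesis of step 3 is a plain set union of the two point sets, for example:

```python
import numpy as np

def synthesize(incomplete: np.ndarray, missing: np.ndarray) -> np.ndarray:
    """Concatenate the incomplete cloud and the generated missing part."""
    return np.concatenate([incomplete, missing], axis=0)
```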
Examples
The target task of the present invention is shown in FIG. 1a and FIG. 1b: FIG. 1a is the original model to be repaired, and FIG. 1b is the repaired model. The architecture of the self-attention module of the method is shown in FIG. 2, and the architecture of the whole global feature extractor is shown in FIG. 3. The steps of the invention are described below through an example.
Step (1), collecting data of an input point cloud model data set;
step (1.1), setting and inputting a single three-dimensional point cloud model s and presetting 5 viewpoints, namely (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0), and (-1, 1, 0); using multiple different viewpoints ensures that the missing part of the incomplete model is random when training and test data are collected;
step (1.2), randomly selecting a viewpoint as the center point p and presetting a radius r (the radius is determined by the number of points to be removed and is not a specific length in the mathematical sense; for example, if the number of removed points is set to 25% of the original point cloud, then the 25% of points nearest to p are removed, with p as the center);
step (1.3), for the three-dimensional point cloud model s, removing the points within the preset radius r around the randomly selected viewpoint p to obtain an incomplete point cloud model; the set of removed points is the missing-part point cloud corresponding to the incomplete point cloud model.
Step (2), combining a self-attention (Self-Attention) mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, using this network to map the input point cloud to a global feature vector, and generating the missing part of the incomplete point cloud with a decoder of a topological rooted tree structure;
step (2.1), dividing the input three-dimensional point cloud model data set S = {S_Train, S_Test} into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_{n+j} denotes the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step (2.2), for the training set S_Train, collecting the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under a random viewpoint together with the corresponding missing-part point cloud model G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and using them as the input of the whole network for training to obtain the trained long-distance dependency extraction network and the decoder with a topological rooted tree structure, where p_i denotes the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in the training set S_Train, and g_i denotes the missing-part point cloud model corresponding to s_i;
step (2.2.1), taking the incomplete point clouds P_Train of the training set S_Train as input and using the corresponding missing-part point clouds G_Train for supervised training; after forward propagation through the first-stage shared multi-layer perceptron (shared MLP), each point in the incomplete point cloud is mapped to a 256-dimensional feature vector; the first-stage shared MLP consists of two shared fully connected layers, of which the first maps each point to a 128-dimensional feature vector and the second maps each point to a 256-dimensional feature vector, so the whole input point cloud is mapped to a matrix of dimension 2048 × 256;
step (2.2.2), denoting the 2048 × 256 matrix obtained in step (2.2.1) by x = (x_1, x_2, x_3, ..., x_i, ..., x_N), which is the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; the attention scores of the input point cloud are calculated by mapping x into two feature spaces Q and K through two 1 × 1 convolution networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_i), ...) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_i, ...) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_i), ...) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_i, ...); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolutions, and both have dimension 32 × 256; Q is a query matrix of dimension 2048 × 32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key matrix corresponding to the input point cloud, also of dimension 2048 × 32, i.e., the key of each point is 32-dimensional; Q and K are used to calculate the attention score values of the input point cloud;
step (2.2.3), defining a function q(x_i, x_j) to calculate a scalar representing the dependency of each point in the input point cloud on the other points; based on the mappings of step (2.2.2), the function q(x_i, x_j) is defined as

q_{i,j} = q(x_i, x_j) = h(x_i)^T v(x_j),

where i is the point index in the matrix Q obtained in step (2.2.2) and j is the point index in the matrix K obtained in step (2.2.2); each point in Q (represented by a 32-dimensional vector) is multiplied with the keys of all points including itself, i.e., with the 32-dimensional vector of each point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points finally gives a matrix of dimension 2048 × 2048, which is the attention score map corresponding to the input point cloud;
step (2.2.4), defining a function f(x_j) = w_f x_j that maps each point in the input point cloud to a value matrix V to calculate the input signal at point j (the value matrix will be multiplied with the attention score map obtained in step (2.2.3) to obtain weighted vectors), where w_f is a weight matrix to be learned, realized by a 1 × 1 convolution; the points in the value matrix V correspond one-to-one to the keys in the key matrix K, i.e., the input signal at point j corresponds to the key at point j; V = (f(x_1), f(x_2), f(x_3), ..., f(x_j), ...) = (w_f x_1, w_f x_2, w_f x_3, ..., w_f x_j, ...), with dimension 2048 × 128;
step (2.2.5), defining the formula

α_{i,j} = exp(q_{i,j}) / Σ_{k=1}^{N} exp(q_{i,k})

and performing the Softmax operation on the attention score map obtained in step (2.2.3) (i.e., the Softmax operation is applied to all attention score values of each point, so that the attention scores of each point with respect to all other points sum to 1), where q_{i,j} represents the attention score of point i in the query matrix Q with respect to point j in the key matrix K;
step (2.2.6), setting the output of the attention module to y and mapping the input x to the output y with a function φ; y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)) is defined, where N is the number of points in the input point cloud and x_N is the feature vector corresponding to the N-th point of the input point cloud; y_i, the i-th point of the output, is calculated with the formula

y_i = φ(x_i) = g(Σ_{j=1}^{N} α_{i,j} f(x_j)),

where g(x_i) = w_g x_i and w_g is a weight matrix to be learned, realized by a 1 × 1 convolution; the output matrix y finally obtained has the same dimension as the input feature matrix x, i.e., the dimension of y is 2048 × 256.
Step (2.2.7), performing maximum pooling on the 2048 × 256 matrix obtained in step (2.2.6), i.e., taking the maximum over all points in each dimension to form a 256-dimensional feature vector; this feature vector is stacked to obtain a 2048 × 256 feature matrix of the same shape, which is concatenated with the 2048 × 256 matrix obtained in step (2.2.6) to form a 2048 × 512 feature matrix fused with the long-distance dependency information;
step (2.2.8), propagating the 2048 × 512 feature matrix fused with the long-distance dependency information forward through the second-stage shared multi-layer perceptron (shared MLP), mapping each point of the input incomplete point cloud model to a 1024-dimensional vector so that the whole incomplete point cloud is mapped to a 2048 × 1024 matrix, and then performing maximum pooling on the 2048 × 1024 matrix, i.e., taking the maximum over all points in each dimension, to obtain a 1024-dimensional global feature vector; the global feature vector is input into the decoder with a topological rooted tree structure, which finally generates the missing-part point cloud corresponding to the incomplete point cloud model;
and step (2.2.9), comparing the generated missing-part point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the Loss function, and performing back-propagation, finally obtaining the trained long-distance dependency extraction network and the topological rooted tree generator.
Step (2.3), for the test set S_Test, collecting the incomplete point cloud model P_Test = {p_{n+1}, p_{n+2}, ..., p_{n+j}, ..., p_{n+m}} of each three-dimensional point cloud model under a random viewpoint and inputting it into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud, where p_{n+j} denotes the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_{n+j} in the test set S_Test. The test process mainly comprises the following steps:
step (2.3.1), collecting the incomplete point cloud models P_Test of the test model set under random viewpoints and inputting them into the generator network trained together with the long-distance dependency extraction network;
and step (2.3.2), outputting the missing-part point clouds corresponding to the incomplete models of the test model set under the random viewpoints.
And step (3), merging the incomplete point cloud and the generated missing-part point cloud to obtain the final repaired complete point cloud model.
Analysis of results
The experimental environment of the method of the invention is as follows:
(1) the experimental platform for collecting the model data has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz, and 32GB of memory; the Python programming language is used, with PyCharm 2019 as the development environment;
(2) the experimental platform for the training and testing of the Self-Attention-based long-distance dependency extraction network has the following parameters: Ubuntu 16.04.4 LTS operating system, Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz, 32GB of memory, and a TITAN RTX GPU with 24GB of video memory; the Python programming language is used, and the implementation is based on the TensorFlow third-party open-source library.
The results of the comparative experiments (shown in Tables 1 and 2) between the method of the invention and TopNet, Folding, PCN, AtlasNet, and PointNetFCAE are analyzed as follows:
experiments were performed on a subset of the accepted reference data set ShapeNet, which is a subset of 8 different models, each class of data set having the category names, meaning airplan, Lamp, Cabinet, Car, Chair, Couch, Table, Watercraft, as shown in the first column of Table 1; the division of the training set and test set is shown in the second column of table 1.
The final metric is the average Chamfer Distance (CD) of the repaired complete model. The CD comparisons are shown in Tables 1 and 2 (Table 1 compares the CD of the method of the invention with the other methods on the 8 model categories of the ShapeNet dataset, and Table 2 compares the average CD over all categories of the ShapeNet dataset). All CD values shown in the tables were multiplied by 10^5 after calculation. As can be seen from Tables 1 and 2, all per-category CD values and the category-average CD value of the method of the invention are lower than those of the other methods. FIG. 5 shows a comparison of the repair results of the method of the invention with the other methods; the method of the invention significantly reduces noise and deformation in the repaired missing part of the incomplete point cloud and significantly improves the repair effect.
TABLE 1
[Table 1 is provided as an image in the original document: it lists, for each of the 8 ShapeNet categories, the training/test split and the per-category CD (× 10^5) of AtlasNet, Folding, PCN, TopNet, PointNetFCAE, and the method of the invention.]
TABLE 2
Method         AtlasNet  Folding  PCN   TopNet  PointNetFCAE  Method of the invention
Category Avg.  94.4      74.6     67.1  63.9    97.6          55.8
In the self-comparison experiment, the self-attention module for extracting long-distance dependency information was removed; the CD comparison of the final experimental results is shown in Table 3, which shows that the optimization of extracting long-distance dependency information significantly reduces the final CD value of the repaired complete model.
In addition, the long-distance dependency information learned by the method of the invention is visualized; the visualization results are shown in FIG. 4. Each point in the incomplete 3D point cloud has a corresponding long-distance dependency relationship. In each row of FIG. 4, the first picture marks three representative points, and the other three pictures show the attention score maps corresponding to these points. Because a targeted method is used to extract the long-distance dependency information, rather than only fully connected shared multi-layer perceptron layers, enough long-distance dependency information can be learned, thereby significantly reducing the final CD value of the repaired complete model. Table 3 compares the final results of the method of the invention with the results obtained without the self-attention optimization for long-distance dependency extraction.
TABLE 3
[Table 3 is provided as an image in the original document: it compares the CD of the full method with that of the variant without the self-attention module for long-distance dependency extraction.]
The present invention provides a three-dimensional point cloud repairing method for graphic processing. There are many specific methods and ways to implement this technical solution, and the above description is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the invention, and these improvements and refinements should also be regarded as falling within the protection scope of the invention. All components not specified in the embodiments can be implemented with existing technology.

Claims (10)

1. A three-dimensional point cloud repairing method for graphic processing, characterized by comprising the following steps:
step 1, inputting a point cloud model data set and collecting data;
step 2, combining a self-attention (Self-Attention) mechanism with a multi-layer perceptron (MLP) to obtain a long-distance dependency extraction network, using this network to map the input point cloud to a global feature vector, and generating the missing part of the incomplete point cloud with a decoder of a topological rooted tree structure;
and step 3, merging the incomplete point cloud and the generated missing-part point cloud to obtain the final repaired complete point cloud model.
2. The method of claim 1, wherein step 1 comprises the steps of:
step 1-1, setting and inputting a single three-dimensional point cloud model s, and presetting 5 viewpoints which are (1, 0, 0), (0, 0, 1), (1, 0, 1), (-1, 0, 0) and (-1, 1, 0);
step 1-2, randomly selecting a viewpoint as a central point p, and presetting a radius r;
step 1-3, regarding a three-dimensional point cloud model s, removing points within a preset radius r by taking a randomly selected viewpoint p as a center to obtain an incomplete point cloud model; the set of points removed is the missing portion of the point cloud corresponding to the incomplete point cloud model.
3. The method of claim 2, wherein step 2 comprises the steps of:
step 2-1, dividing the input three-dimensional point cloud model data set S = {S_Train, S_Test} into a training set S_Train = {s_1, s_2, ..., s_i, ..., s_n} and a test set S_Test = {s_{n+1}, s_{n+2}, ..., s_{n+j}, ..., s_{n+m}}, where s_i denotes the i-th three-dimensional point cloud model in the training set and s_{n+j} denotes the j-th three-dimensional point cloud model in the test set; i ranges from 1 to n and j from 1 to m;
step 2-2, for the training set S_Train, collecting the incomplete point cloud model P_Train = {p_1, p_2, ..., p_i, ..., p_n} of each three-dimensional point cloud model under a random viewpoint together with the corresponding missing-part point cloud model G_Train = {g_1, g_2, ..., g_i, ..., g_n}, and using them as the input of the whole network for training to obtain the trained long-distance dependency extraction network and the decoder with a topological rooted tree structure, where p_i denotes the incomplete point cloud model corresponding to the i-th three-dimensional point cloud model s_i in the training set S_Train, and g_i denotes the missing-part point cloud model corresponding to s_i;
step 2-3, for the test set S_Test, collecting the incomplete point cloud model P_Test = {p_{n+1}, p_{n+2}, ..., p_{n+j}, ..., p_{n+m}} of each three-dimensional point cloud model under a random viewpoint and inputting it into the trained network to obtain the missing-part point cloud corresponding to each incomplete input point cloud, where p_{n+j} denotes the incomplete point cloud model corresponding to the j-th three-dimensional point cloud model s_{n+j} in the test set S_Test.
4. A method according to claim 3, characterized in that step 2-2 comprises the steps of:
step 2-2-1, taking the incomplete point clouds P_Train of the training set S_Train as input and using the corresponding missing-part point clouds G_Train for supervised training; after forward propagation through the first-stage shared multi-layer perceptron, each point in the incomplete point cloud is mapped to a 256-dimensional feature vector, wherein the first-stage shared MLP consists of two shared fully connected layers, of which the first maps each point to a 128-dimensional feature vector and the second maps each point to a 256-dimensional feature vector, so that the whole input point cloud is mapped to a matrix of dimension 2048 × 256;
step 2-2-2, denoting the 2048 × 256 matrix obtained in step 2-2-1 by x = (x_1, x_2, x_3, ..., x_i, ..., x_N), which is the input of the self-attention module, where x_i is the feature vector corresponding to the i-th point of the input point cloud; the attention scores of the input point cloud are calculated by mapping x into two feature spaces Q and K through two 1 × 1 convolution networks, obtained by the functions h(x) and v(x) respectively, where Q = (h(x_1), h(x_2), h(x_3), ..., h(x_i), ...) = (w_h x_1, w_h x_2, w_h x_3, ..., w_h x_i, ...) and K = (v(x_1), v(x_2), v(x_3), ..., v(x_i), ...) = (w_v x_1, w_v x_2, w_v x_3, ..., w_v x_i, ...); w_h and w_v are weight matrices to be learned, corresponding to h(x) and v(x) respectively and realized by 1 × 1 convolutions, and both have dimension 32 × 256; Q is a query matrix of dimension 2048 × 32: the input point cloud has 2048 points and each point is represented by a 32-dimensional feature vector, i.e., the query value of each point has length 32; K is the key matrix corresponding to the input point cloud, also of dimension 2048 × 32, i.e., the key of each point is 32-dimensional; Q and K are used to calculate the attention score values of the input point cloud;
step 2-2-3, defining a function q(x_i, x_j) to calculate a scalar representing the dependency of each point in the input point cloud on the other points; based on the mappings of step 2-2-2, the function q(x_i, x_j) is defined as

q_{i,j} = q(x_i, x_j) = h(x_i)^T v(x_j),

where i is the point index in the matrix Q obtained in step 2-2-2 and j is the point index in the matrix K obtained in step 2-2-2; each point in Q is multiplied with the keys of all points including itself, i.e., with the 32-dimensional vector of each point in the matrix K; since the input point cloud has 2048 points, each point yields 2048 scalars, and combining the scalars of all points finally gives a matrix of dimension 2048 × 2048, which is the attention score map corresponding to the input point cloud;
step 2-2-4, mapping each point in the input point cloud to a value matrix V to calculate the input signal at point j;
step 2-2-5, performing a Softmax operation on the attention score map obtained in step 2-2-3;
step 2-2-6, setting the output of the attention module to y and mapping the input x to the output y with a function φ;
step 2-2-7, performing maximum pooling;
step 2-2-8, finally generating the missing-part point cloud corresponding to the incomplete point cloud model;
and step 2-2-9, obtaining the trained long-distance dependency extraction network and the topological rooted tree generator.
5. The method of claim 4, wherein step 2-2-4 comprises: defining a function f(x_j) = w_f·x_j that maps each point of the input point cloud into a value matrix V, computing the input signal at point j, where w_f is a weight matrix to be learned, realized by a 1 × 1 convolution; the points of the value matrix V correspond one-to-one to the keys of the key matrix K, i.e. the input signal at point j corresponds one-to-one to the key at point j; V = (f(x_1), f(x_2), f(x_3), ..., f(x_j), ...) = (w_f·x_1, w_f·x_2, w_f·x_3, ..., w_f·x_j, ...), with dimensions 2048 × 128.
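
Illustration (not part of the claims): a sketch of step 2-2-4 under the same assumed PyTorch setting as above.

import torch
import torch.nn as nn

f = nn.Conv1d(256, 128, 1)       # w_f: maps each 256-d point feature to a 128-d value
x = torch.randn(1, 256, 2048)    # feature matrix x from step 2-2-1
V = f(x).transpose(1, 2)         # (1, 2048, 128): one value per point, aligned
                                 # one-to-one with the keys in K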
6. The method of claim 5, wherein step 2-2-5 comprises: defining the formula

q_{i,j} = exp(s(x_i, x_j)) / Σ_{j=1..N} exp(s(x_i, x_j)), with N = 2048,

and using it to perform the Softmax operation on the attention score map obtained in step 2-2-3, wherein q_{i,j} represents the attention score value of point i in the query matrix Q with respect to point j in the key matrix K.
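
Illustration (not part of the claims): the row-wise Softmax of step 2-2-5, sketched with assumed names.

import torch
import torch.nn.functional as F

scores = torch.randn(1, 2048, 2048)   # attention score map from step 2-2-3
attn = F.softmax(scores, dim=-1)      # q_{i,j}: each row now sums to 1 over j
assert torch.allclose(attn.sum(dim=-1), torch.ones(1, 2048))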
7. The method of claim 6, wherein step 2-2-6 comprises: defining y = φ(x) = (y_1, y_2, ..., y_i, ..., y_N) = (φ(x_1), φ(x_2), ..., φ(x_i), ..., φ(x_N)), where N is the number of points in the input point cloud, x_N is the feature vector corresponding to the N-th point of the input point cloud, and y_i, the i-th point of the output point cloud, is calculated by the formula

y_i = φ(x_i) = g( Σ_{j=1..N} q_{i,j} · f(x_j) ),

wherein g(x_i) = w_g·x_i and w_g is a weight matrix to be learned, realized by a 1 × 1 convolution; the resulting output matrix y has the same dimensions as the input feature matrix x, namely 2048 × 256.
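
Illustration (not part of the claims): step 2-2-6 sketched under the same assumptions; g is taken here as a 1 × 1 convolution mapping the 128-dimensional weighted values back to 256 dimensions so that y matches the shape of x.

import torch
import torch.nn as nn

g = nn.Conv1d(128, 256, 1)                                 # w_g, learned
attn = torch.softmax(torch.randn(1, 2048, 2048), dim=-1)   # q_{i,j} from step 2-2-5
V = torch.randn(1, 2048, 128)                              # value matrix from step 2-2-4
weighted = torch.bmm(attn, V)                              # (1, 2048, 128): sum_j q_{i,j} f(x_j)
y = g(weighted.transpose(1, 2))                            # (1, 256, 2048): same shape as x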
8. The method of claim 7, wherein step 2-2-7 comprises: performing maximum pooling on the 2048 × 256 matrix obtained in step 2-2-6, i.e. selecting, in each of the 256 dimensions, the maximum value over all points to form a 256-dimensional feature vector; stacking this feature vector into a 2048 × 256 feature matrix of the same shape; and concatenating this feature matrix with the 2048 × 256 matrix obtained in step 2-2-6 to form a 2048 × 512 feature matrix fused with the long-distance dependency relationship information.
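
Illustration (not part of the claims): the pooling, stacking and concatenation of step 2-2-7.

import torch

y = torch.randn(1, 256, 2048)               # attention output from step 2-2-6
pooled = y.max(dim=2, keepdim=True).values  # (1, 256, 1): per-dimension max over points
tiled = pooled.expand(-1, -1, 2048)         # (1, 256, 2048): stacked to the same shape
fused = torch.cat([y, tiled], dim=1)        # (1, 512, 2048): the 2048 x 512 fused matrix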
9. The method of claim 8, wherein step 2-2-8 comprises: forward-propagating the 2048 × 512 feature matrix fused with the long-distance dependency relationship information through the second-stage shared multilayer perceptron, whereby each point of the input incomplete point cloud model is mapped to a 1024-dimensional vector and the whole incomplete point cloud is mapped to a 2048 × 1024 matrix; performing maximum pooling on the 2048 × 1024 matrix, i.e. selecting in each dimension the maximum value over all points, to obtain a 1024-dimensional global feature vector; and inputting the global feature vector into the decoder of topological root tree structure, which finally generates the missing point cloud corresponding to the incomplete point cloud model.
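
Illustration (not part of the claims): step 2-2-8 with the topological root tree decoder replaced by a plain fully-connected stand-in, since the claims do not specify its internals; the 512-point output size is likewise an assumption of this sketch.

import torch
import torch.nn as nn

mlp2 = nn.Sequential(nn.Conv1d(512, 1024, 1), nn.ReLU())   # second-stage shared MLP
decoder = nn.Sequential(              # stand-in for the root tree decoder: maps the
    nn.Linear(1024, 1024), nn.ReLU(), # global feature to a missing part of 512 points
    nn.Linear(1024, 512 * 3),
)

fused = torch.randn(1, 512, 2048)               # 2048 x 512 matrix from step 2-2-7
feat = mlp2(fused)                              # (1, 1024, 2048)
global_feat = feat.max(dim=2).values            # (1, 1024): global feature vector
missing = decoder(global_feat).view(1, -1, 3)   # (1, 512, 3): generated missing cloud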
10. The method of claim 9, wherein step 2-2-9 comprises: comparing the generated missing point cloud with the real missing point cloud corresponding to the incomplete point cloud, calculating the loss function Loss, performing back-propagation, and finally obtaining the trained long-distance dependency relationship extraction network and topology root tree generator.
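
Illustration (not part of the claims): a sketch of the training step in claim 10. The claims do not name the loss function; a symmetric Chamfer distance, common in point cloud completion, is assumed here purely for illustration.

import torch

def chamfer_distance(a, b):          # a: (B, N, 3) predicted, b: (B, M, 3) ground truth
    d = torch.cdist(a, b)            # (B, N, M): pairwise Euclidean distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

pred = torch.randn(1, 512, 3, requires_grad=True)  # generated missing point cloud
gt = torch.randn(1, 512, 3)                        # real missing point cloud G_Train
loss = chamfer_distance(pred, gt)
loss.backward()                                    # back-propagation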
CN202110116229.1A 2021-01-28 2021-01-28 Three-dimensional point cloud restoration method for graphic processing Active CN112785526B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110116229.1A CN112785526B (en) 2021-01-28 2021-01-28 Three-dimensional point cloud restoration method for graphic processing

Publications (2)

Publication Number Publication Date
CN112785526A true CN112785526A (en) 2021-05-11
CN112785526B CN112785526B (en) 2023-12-05

Family

ID=75759307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110116229.1A Active CN112785526B (en) 2021-01-28 2021-01-28 Three-dimensional point cloud restoration method for graphic processing

Country Status (1)

Country Link
CN (1) CN112785526B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN109389671A (en) * 2018-09-25 2019-02-26 南京大学 A kind of single image three-dimensional rebuilding method based on multistage neural network
EP3767521A1 (en) * 2019-07-15 2021-01-20 Promaton Holding B.V. Object detection and instance segmentation of 3d point clouds based on deep learning
CN112241997A (en) * 2020-09-14 2021-01-19 西北大学 Three-dimensional model establishing and repairing method and system based on multi-scale point cloud up-sampling
CN112070054A (en) * 2020-09-17 2020-12-11 福州大学 Vehicle-mounted laser point cloud marking classification method based on graph structure and attention mechanism
CN112257637A (en) * 2020-10-30 2021-01-22 福州大学 Vehicle-mounted laser point cloud multi-target identification method integrating point cloud and multiple views

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
FENGGEN YU et al.: "PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation", IEEE
卿都 et al.: "Research Progress on Neural-Network-Based 3D Point Cloud Generation Models", Robot Technique and Application, no. 06
牛辰庚 et al.: "3D Object Recognition and Model Segmentation Methods Based on Point Cloud Data", Journal of Graphics, no. 02
贝子勒 et al.: "A Point Cloud Inpainting Model Based on Deep Learning", Wireless Communication Technology, no. 02

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113298952A (en) * 2021-06-11 2021-08-24 哈尔滨工程大学 Incomplete point cloud classification network based on data expansion and similarity measurement
CN113298952B (en) * 2021-06-11 2022-07-15 哈尔滨工程大学 Incomplete point cloud classification method based on data expansion and similarity measurement
CN113379646A (en) * 2021-07-07 2021-09-10 厦门大学 Algorithm for performing dense point cloud completion by using generated countermeasure network
CN113379646B (en) * 2021-07-07 2022-06-21 厦门大学 Algorithm for performing dense point cloud completion by using generated countermeasure network
CN113486988A (en) * 2021-08-04 2021-10-08 广东工业大学 Point cloud completion device and method based on adaptive self-attention transformation network
CN113486988B (en) * 2021-08-04 2022-02-15 广东工业大学 Point cloud completion device and method based on adaptive self-attention transformation network
CN114663619A (en) * 2022-02-24 2022-06-24 清华大学 Three-dimensional point cloud object prediction method and device based on self-attention mechanism
CN116051633A (en) * 2022-12-15 2023-05-02 清华大学 3D point cloud target detection method and device based on weighted relation perception
CN116051633B (en) * 2022-12-15 2024-02-13 清华大学 3D point cloud target detection method and device based on weighted relation perception
CN117671131A (en) * 2023-10-20 2024-03-08 南京邮电大学 Industrial part three-dimensional point cloud repairing method and device based on deep learning

Also Published As

Publication number Publication date
CN112785526B (en) 2023-12-05

Similar Documents

Publication Publication Date Title
Lee et al. Context-aware synthesis and placement of object instances
CN112785526B (en) Three-dimensional point cloud restoration method for graphic processing
CN109377448B (en) Face image restoration method based on generation countermeasure network
CN111063021B (en) Method and device for establishing three-dimensional reconstruction model of space moving target
Chen et al. The face image super-resolution algorithm based on combined representation learning
US11328172B2 (en) Method for fine-grained sketch-based scene image retrieval
CN110032925B (en) Gesture image segmentation and recognition method based on improved capsule network and algorithm
Lee et al. Deep architecture with cross guidance between single image and sparse lidar data for depth completion
CN112991350B (en) RGB-T image semantic segmentation method based on modal difference reduction
CN111553869B (en) Method for complementing generated confrontation network image under space-based view angle
CN111612008A (en) Image segmentation method based on convolution network
CN115690522B (en) Target detection method based on multi-pooling fusion channel attention and application thereof
CN111696196B (en) Three-dimensional face model reconstruction method and device
CN109740539B (en) 3D object identification method based on ultralimit learning machine and fusion convolution network
CN111768415A (en) Image instance segmentation method without quantization pooling
CN112767478B (en) Appearance guidance-based six-degree-of-freedom pose estimation method
Goncalves et al. Deepdive: An end-to-end dehazing method using deep learning
CN112801945A (en) Depth Gaussian mixture model skull registration method based on dual attention mechanism feature extraction
CN114782417A (en) Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
Yin et al. [Retracted] Virtual Reconstruction Method of Regional 3D Image Based on Visual Transmission Effect
CN112149528A (en) Panorama target detection method, system, medium and equipment
Dinh et al. Feature engineering and deep learning for stereo matching under adverse driving conditions
Ogura et al. Improving the visibility of nighttime images for pedestrian recognition using in‐vehicle camera
CN112767539B (en) Image three-dimensional reconstruction method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant