WO2022165876A1 - An unsupervised multi-view 3D point cloud joint registration method based on WGAN - Google Patents

An unsupervised multi-view 3D point cloud joint registration method based on WGAN

Info

Publication number
WO2022165876A1
Authority
WO
WIPO (PCT)
Prior art keywords
point
point cloud
matrix
matching
dimensional
Prior art date
Application number
PCT/CN2021/077770
Other languages
English (en)
French (fr)
Inventor
王耀南
彭伟星
张辉
毛建旭
朱青
刘敏
赵佳文
江一鸣
吴昊天
Original Assignee
湖南大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202110165409.9A external-priority patent/CN112837356B/zh
Application filed by 湖南大学 filed Critical 湖南大学
Publication of WO2022165876A1 publication Critical patent/WO2022165876A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/344Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the invention relates to the technical field of machine vision, in particular to an unsupervised multi-view three-dimensional point cloud joint registration method based on WGAN (Wasserstein Generative Adversarial Networks, a generative adversarial network deep learning model).
  • Intelligent manufacturing technology is the driving force for the integration of manufacturing industrialization and informatization.
  • Today, the aviation manufacturing industry is also facing the transformation to intelligence.
  • As one of the carriers of intelligent manufacturing technology, robots have attracted extensive attention in the field of aviation manufacturing.
  • The aero-engine is the "heart" of the aircraft, and its performance is mainly limited by the manufacturing level of aero-engine blades.
  • Blade three-dimensional dimension measurement technology is of great significance to blade machining and quality inspection.
  • To meet the increasingly complex measurement requirements of blades, it is urgent to develop a 3D measurement robot and realize automatic measurement.
  • With the development of optical measurement technology, a feasible robot measurement scheme is to use a laser scanner mounted on the end of an industrial robot to acquire point clouds and reconstruct a 3D model, from which the 3D dimensional data of the blade are measured.
  • In this measurement scheme, accurately and completely reconstructing the 3D model of the blade is a necessary prerequisite for precise blade measurement, and registering the 3D point clouds of multiple viewpoints is the main problem to be solved in the reconstruction process. Point cloud registration refers to transforming point clouds in different coordinate systems into a unified coordinate system, and is generally divided into three categories: coarse registration, fine registration and global registration. Coarse registration is generally used for two point clouds whose poses differ greatly; fine registration is used to improve the accuracy of coarsely registered point clouds, and relatively mature methods include the ICP (Iterative Closest Point) registration algorithm and improved algorithms based on ICP. In the reconstruction process, frame-by-frame registration of point cloud data often suffers from serious cumulative errors, which affect the accuracy of the reconstructed model.
  • The global registration algorithm is expected to spread the accumulated error over every frame of data, thereby reducing the overall error. Whether for fine registration or global registration, a good coarse registration result is required as the initialization parameter.
  • Coarse registration depends on the size of the overlapping area of the point clouds, the saliency of the features in the overlapping region, and the symmetry of the model itself.
  • To satisfy its aerodynamic performance, the aero-engine blade is designed as a smooth, texture-less, hyperboloid thin-walled special-shaped structure. When acquiring 3D point clouds, such a structure leads to insufficient overlapping area between adjacent point clouds and weak texture features, making it difficult to obtain a good coarse registration result; the error of global registration is therefore large, and the 3D model of the blade cannot be reconstructed precisely.
  • the invention provides an unsupervised multi-view three-dimensional point cloud joint registration method based on WGAN.
  • The method can train and optimize the pose of every viewpoint on the WGAN framework, so that the difference between the probability distribution of the optimized overall point cloud and that of the prior model point cloud is minimized, which means that the registered point cloud model is as close as possible to the theoretical model.
  • Fig. 1 is a flow chart of the algorithm implementation of the WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method of the present invention;
  • FIG. 2 is a schematic diagram of the overall network structure of the WGAN for joint registration in a preferred embodiment of the unsupervised multi-view three-dimensional point cloud joint registration method based on WGAN of the present invention
  • FIG. 3 is a schematic diagram of the generator network structure of WGAN in a preferred embodiment of a WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method
  • FIG. 4 is a schematic diagram of a network structure of a high-dimensional feature extraction layer involved in a generator in a preferred embodiment of a WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method;
  • 5(a) is a schematic diagram of the EdgeConv layer involved in the high-dimensional feature extraction layer in a preferred embodiment of a WGAN-based unsupervised multi-view 3D point cloud joint registration method;
  • Fig. 5(b) is a schematic diagram of the graph constructed by K-adjacent in Fig. 5(a);
  • FIG. 6 is a schematic diagram of the transformer network structure of the matching point generation layer involved in the generator in a preferred embodiment of a WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method;
  • FIG. 7(a) is a schematic diagram of the attention involved in the transformer network in a preferred embodiment of a WGAN-based unsupervised multi-view 3D point cloud joint registration method
  • Figure 7(b) is a schematic diagram of the Multi-head attention sublayer involved in the transformer network of Figure 7(a).
  • the engine blades are processed based on the theoretical design model, so the processed blades should conform to the design model as much as possible.
  • the overall probability distribution of the point cloud after registration should also be as close as possible to the probability distribution of the theoretical model point cloud.
  • the present invention provides an unsupervised multi-view three-dimensional point cloud joint registration method based on WGAN.
  • the point cloud joint registration method includes the following steps:
  • Step S1 acquire point clouds from different viewpoints: scanning is performed from different viewpoints, and I point clouds P = {P 1 ,...,P i ,...,P I } are obtained, where P i denotes the i-th point cloud, N i denotes the number of points contained in the i-th point cloud, and P ij denotes the j-th point in the i-th point cloud;
  • Step S2 down-sampling the point clouds of all viewpoints: each point cloud P i handles its relationship with the two adjacent viewpoints before and after it separately, so there are two adjacent point clouds and the down-sampling is performed twice, i.e. for each point cloud, the point clouds of the preceding and following adjacent viewpoints are down-sampled:
  • for P i-1 , P i and P i-1 are randomly sampled, with the sampling number N iL = min{N i-1 /s, N i /s}  (1)
  • for P i+1 , P i and P i+1 are randomly sampled, with the sampling number N iR = min{N i /s, N i+1 /s}  (2)
  • where N i-1 , N i and N i+1 denote the numbers of points contained in the (i-1)-th, i-th and (i+1)-th point clouds respectively, and s is a manually set sampling parameter;
  • Step S3 sampling from the standard model; sampling m samples from the standard model point set P s , and denoting them as standard samples
  • Step S4 train the generator network of the multi-view point cloud joint registration WGAN: convert the point cloud of each viewpoint to a unified coordinate system one by one, fuse all the converted point clouds into a complete point cloud model P', sample P' uniformly, and draw m points from P' as the generated samples; this specifically includes the following steps:
  • Step S41 design generator
  • Step S42 the generator network is trained
  • Step S5 train the discriminator network of the multi-view point cloud joint registration WGAN: discriminate between the generated samples and the standard samples; this specifically includes the following steps:
  • Step S51 designing a discriminator
  • Step S52 the discriminator network is trained
  • Step S6 determine whether to terminate training: the numbers of training iterations of the generator and the discriminator are both set to M; if M iterations are reached, training terminates, otherwise the method returns to step S4 (a minimal training-loop sketch is given below).
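  • The following is a minimal PyTorch sketch of the alternating training described in steps S4-S6. Here `generator`, `critic` and `sample_standard` are hypothetical placeholders for the components detailed later in this document (the pose-optimizing generator of steps S41-S42, the discriminator f ω of steps S51-S52, and the standard-model sampling of step S3); the learning rate and clipping constant are illustrative values, not taken from the patent.

```python
import torch

def train_joint_registration(generator, critic, sample_standard, M, m, alpha=5e-5, c=0.01):
    """Alternate generator and discriminator updates for M rounds (steps S4-S6).

    generator()        -> (m, 3) points sampled from the fused model P' (step S4)
    critic(points)     -> per-point Wasserstein critic scores f_w (step S5)
    sample_standard(m) -> (m, 3) points sampled from the standard model (step S3)
    """
    opt_g = torch.optim.RMSprop(generator.parameters(), lr=alpha)
    opt_c = torch.optim.RMSprop(critic.parameters(), lr=alpha)

    for _ in range(M):                          # Step S6: stop after M rounds
        # Step S4: update the generator so its generated samples score higher.
        v = generator()
        loss_g = -critic(v).mean()
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

        # Step S5: update the critic to separate standard from generated samples,
        # then clip its weights to keep it approximately 1-Lipschitz.
        u = sample_standard(m)
        loss_c = critic(v.detach()).mean() - critic(u).mean()
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        with torch.no_grad():
            for p in critic.parameters():
                p.clamp_(-c, c)
    return generator
```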
  • step S41 specifically includes the following steps:
  • Step S411 construct a feature vector conversion network layer: for the point cloud P i ∈ R N i ×3 (an N i ×3 matrix), generate a high-dimensional feature matrix F i ∈R N×D point by point, where D denotes the dimension of the feature vector extracted for each point and R N×D denotes an N×D matrix;
  • Step S412 build a matching point calculation network layer and calculate matching points point by point: extract the feature matrices F (i-1)R and F (i+1)L corresponding to the adjacent point clouds P i-1 and P i+1 through the high-dimensional feature vector conversion; calculate the matching probabilities of P i with P i-1 and with P i+1 respectively, and obtain the corresponding matching point pair sets;
  • Step S413 filtering out the outliers based on the attention mechanism: calculating the correlation measure sim ij between the transposition of the i-th posture obtained in the previous iteration and the matching point pair C ij , where j represents the index;
  • Step S414 joint registration to obtain the closed-form solution T of the attitude: calculate the relative attitude and constraint conditions of the point cloud according to the current matching point pair and its weight, and obtain the only optimal solution of the relative attitude optimization of the point cloud, that is, the optimal attitude;
  • Step S415 generating a point cloud model and sampling: according to the optimal posture, convert the point clouds of each viewpoint into a unified coordinate system one by one, fuse them into a complete point cloud model P', and perform uniform sampling on P'.
  • step S411 is specifically:
  • the network consists of 4 EdgeConv (edge convolution) layers and one convolutional layer Conv (vector convolution): each input feature vector is taken as a vertex, the K nearest neighbours (KNN) of each point are computed and connected to it as edges, and a graph structure is constructed, where D in denotes the dimension of the input feature vector and a D in -dimensional real vector is associated with each vertex;
  • each edge is fed into a multi-layer perceptron MLP (Multilayer Perceptron), and a D out -dimensional feature is output after a ReLU (Rectified Linear Unit) activation function; the features of all edges of a vertex are then passed through a max-pooling layer to give the D out -dimensional feature of that vertex, so an N×D in input feature matrix is mapped to an N×D out output feature matrix;
  • the feature dimension output by the first EdgeConv layer is 64, by the second 64, by the third 128 and by the fourth 256; the N×512-dimensional feature obtained by concatenating the features extracted by the four EdgeConv layers is used as the input of Conv, and after the ReLU activation function the feature matrix F i ∈ R N×1024 is output, where R N×1024 denotes an N×1024-dimensional real matrix.
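  • A minimal PyTorch sketch of this high-dimensional feature extraction layer (KNN graph, per-edge MLP with ReLU, max-pooling, four EdgeConv layers of widths 64/64/128/256 concatenated and mapped to 1024 dimensions); the exact edge feature (here the concatenation of the centre feature with the neighbour difference, as in DGCNN) and the neighbourhood size k are assumptions, since the text does not specify them.

```python
import torch
import torch.nn as nn

class EdgeConv(nn.Module):
    """One EdgeConv layer: build a KNN graph, apply an MLP to every edge, max-pool over edges."""
    def __init__(self, d_in, d_out, k=20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * d_in, d_out), nn.ReLU())

    def forward(self, x):                                   # x: (N, d_in) per-point features
        dist = torch.cdist(x.unsqueeze(0), x.unsqueeze(0)).squeeze(0)     # (N, N) pairwise distances
        idx = dist.topk(self.k + 1, largest=False).indices[:, 1:]         # K nearest neighbours (drop self)
        neighbors = x[idx]                                  # (N, k, d_in)
        center = x.unsqueeze(1).expand_as(neighbors)
        edges = torch.cat([center, neighbors - center], dim=-1)           # assumed edge feature
        return self.mlp(edges).max(dim=1).values            # (N, d_out) vertex features

class FeatureExtractor(nn.Module):
    """Four EdgeConv layers (64, 64, 128, 256), concatenated (512 dims) and mapped to 1024 dims."""
    def __init__(self, k=20):
        super().__init__()
        self.e1, self.e2 = EdgeConv(3, 64, k), EdgeConv(64, 64, k)
        self.e3, self.e4 = EdgeConv(64, 128, k), EdgeConv(128, 256, k)
        self.conv = nn.Sequential(nn.Linear(64 + 64 + 128 + 256, 1024), nn.ReLU())

    def forward(self, pts):                                 # pts: (N, 3) point cloud
        f1 = self.e1(pts); f2 = self.e2(f1); f3 = self.e3(f2); f4 = self.e4(f3)
        return self.conv(torch.cat([f1, f2, f3, f4], dim=-1))             # (N, 1024) feature matrix
```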
  • the step S412 is specifically:
  • P i is paired separately with the two adjacent viewpoints before and after it, so it is down-sampled twice and, correspondingly, two different feature matrices are extracted by the high-dimensional feature layer: F iL ∈ R N iL ×1024 , an N iL ×1024-dimensional real matrix, and F iR ∈ R N iR ×1024 , an N iR ×1024-dimensional real matrix;
  • the matching points between P i and P i+1 are computed as follows: the inputs are the feature matrices F iR and F (i+1)L , and the outputs are
  • Φ iR = F iR + φ(F iR , F (i+1)L ), and
  • Φ (i+1)L = F (i+1)L + φ(F (i+1)L , F iR ),
  • where φ(F iR , F (i+1)L ) is the residual change by which the Transformer adjusts the feature F iR to the "condition" F (i+1)L through learning, and φ(F (i+1)L , F iR ) is the residual change by which the Transformer adjusts the feature F (i+1)L to the "condition" F iR through learning;
  • the Transformer is a model based on the encoder-decoder structure:
  • the Encoder comprises 6 encoders stacked in sequence; each encoder contains a Multi-head attention sublayer and a feed-forward sublayer, with residual connections between the sublayers; the output matrix of each encoder is the input of the next; the input of the first encoder is F iR , and the output of the last encoder is the encoding matrix; the Multi-head attention sublayer performs a weighted summation of the matrices obtained from 8 self-attention computations;
  • the Decoder comprises 6 decoders stacked in sequence; each decoder contains two Multi-head attention sublayers and a feed-forward sublayer, with residual connections between the sublayers; the output matrix of each decoder is the input of the next; the input of the first decoder is F (i+1)L , and the output of the last decoder is the decoding matrix; the first Multi-head attention performs a weighted summation of the matrices obtained from 8 self-attention computations, and the second performs a weighted summation of the matrices obtained from 8 encoder-decoder attention computations; the encoder-decoder attention creates the Queries matrix from the output of the first sublayer, and the Keys and Values matrices from the output of the encoder;
  • ⁇ iR(j) represents the jth row of ⁇ iR , that is, the eigenvector corresponding to the point p ij ;
  • T represents the matrix transposition, and softmax is a probability normalization processing function;
  • according to the above matching probabilities, an average matching point cp ij is generated for each p ij ∈ P i as the probability-weighted combination of the points of P i+1 ;
  • the set of matching points obtained by the point cloud P i in P i+1 is denoted CP i , the matching point pair (p ij , cp ij ) is denoted C ij , and the matching point pairs constitute the set C iR ;
  • the matching points of P i and P i-1 are obtained in the same way, giving the matching point pair set C iL ; C iR and C iL together constitute the matching point pair set C i ; the matching points of every pair of adjacent viewpoints can be found by the same process.
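  • A small sketch of this soft matching, assuming the matching probabilities are the row-wise softmax of the inner products of the Transformer-conditioned features and that the average matching point is the probability-weighted sum of the neighbouring cloud's points:

```python
import torch

def soft_matching_points(phi_iR, phi_next_L, pts_next):
    """Average matching points cp_ij of P_i inside P_{i+1} (step S412).

    phi_iR     : (N_iR, 1024) conditioned features of P_i
    phi_next_L : (N_(i+1)L, 1024) conditioned features of P_{i+1}
    pts_next   : (N_(i+1)L, 3) points of P_{i+1}
    returns    : (N_iR, 3) one average matching point per point of P_i
    """
    # Row j holds the probability of every point of P_{i+1} being the match of p_ij.
    prob = torch.softmax(phi_iR @ phi_next_L.T, dim=-1)     # (N_iR, N_(i+1)L)
    # Average matching point: probability-weighted combination of P_{i+1}'s points.
    return prob @ pts_next                                   # (N_iR, 3)
```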
  • the step S413 is specifically:
  • the correlation measure sim ij between the transpose of the i-th pose obtained in the previous iteration and the matching point pair C ij is computed, where ||.|| F denotes the Frobenius norm and σ is a positive real number that prevents sim ij from tending to infinity;
  • the softmax function is introduced to normalize sim ij so that the sum of the weights of all matching point pairs is 1: w ij = exp(sim ij ) / Σ j exp(sim ij ), where w ij denotes the weight of the matching point pair and exp(sim ij ) denotes the exponential function with variable sim ij .
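  • A one-function NumPy sketch of the weight normalization above; the subtraction of the maximum is a standard numerical-stability trick and is not part of the patent:

```python
import numpy as np

def normalize_match_weights(sim):
    """Softmax over the correlation measures sim_ij so that all pair weights sum to 1 (step S413)."""
    e = np.exp(sim - np.max(sim))      # max-subtraction only for numerical stability
    return e / e.sum()
```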
  • the step S414 is specifically:
  • R i ⁇ SO(3) is the transpose of the rotation matrix
  • t i ⁇ R 1 ⁇ 3 is the transpose of the translation
  • R l ⁇ 3 represents the L ⁇ 3 dimension real number matrix
  • Constraint 1 is expressed as:
  • I represents the identity matrix, and det represents the determinant
  • s.t. represents the constraint condition
  • the Lagrangian multiplier method is used to deal with the equality constraint problem, and the augmented Lagrangian function is
  • represents the artificially set parameter, take 0.001
  • is the adjustable parameter of this layer of neural network, Take the result of the previous iteration, Y represents the Lagrange multiplier;
  • SVD SingleValue Decomposition, singular value decomposition
  • the subproblem of T is a quadratic convex optimization problem, and its derivative is 0 to find its minimum value, that is
  • is an artificially set parameter (take 0.001)
  • is an adjustable parameter of this layer of neural network, Take the result of the previous iteration.
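  • The rotation sub-problem mentioned above is solved by SVD; below is a minimal NumPy sketch of the standard SVD projection of a 3×3 matrix onto SO(3) that such a sub-problem reduces to; the matrix M to be projected (built from the weighted matching pairs and the alternating-multiplier variables) is defined by the patent's own formulas and is simply passed in here.

```python
import numpy as np

def project_to_so3(M):
    """Project a 3x3 matrix M onto SO(3) via SVD (rotation sub-problem of step S414).

    Returns the rotation R closest to M in the Frobenius norm that satisfies
    R^T R = I and det(R) = 1.
    """
    U, _, Vt = np.linalg.svd(M)
    # Flip the sign of the last singular direction if needed so that det(R) = +1.
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt
```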
  • the step S415 is specifically:
  • the point clouds of each viewpoint are converted into the unified coordinate system one by one: p' ij = p ij T i ;
  • P' is sampled uniformly: the sampling point set is denoted S 2 and initialized as an empty set; a seed point seed is randomly sampled and put into S 2 ; in the set P'-S 2 , the point farthest from the set S 2 is found and added to S 2 ; this is repeated until m points have finally been sampled from P' as the samples.
  • the step S416 is specifically:
  • the down-sampled point clouds are input one by one into the shared-weight high-dimensional feature extraction layer to obtain the feature matrix F i ∈R N×1024 of each point cloud P i ; the matching point pair generation network is used to obtain the matching point set CP i of each point cloud P i ; the points of all viewpoints and their matching points are used as input, and the closed-form solution T of the poses is obtained by joint registration; all point clouds are converted to the unified coordinate system through the obtained T and fused into the point cloud model P'; m points are sampled from P' as the generated samples; let p' ij ∈P' obey the probability distribution P g ; keeping the network parameters of the discriminator f ω unchanged, the loss of the generator is constructed from the discriminator's output on the generated samples.
  • the step S3 specifically includes the following steps:
  • Step S31 denote the standard model point set as P s , the sampling point set as S 1 , and S 1 is initialized as an empty set;
  • Step S32 randomly sample a seed point seed and put it into S 1 ;
  • Step S33 in the set P s -S 1 , find a point farthest from the set S 1 , wherein the distance from the point to the set S 1 is the minimum point distance from the point to S 1 ;
  • Step S34 repeat step S33 until m samples are sampled, which are recorded as standard samples
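  • Steps S31-S34 (and the sampling of P' in step S415) describe farthest point sampling; a minimal NumPy sketch:

```python
import numpy as np

def farthest_point_sampling(points, m, seed=None):
    """Sample m points from an (N, 3) array by farthest point sampling (steps S3 / S415)."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    selected = [rng.integers(n)]                    # random seed point
    # Minimum distance from every point to the currently selected set.
    min_dist = np.linalg.norm(points - points[selected[0]], axis=1)
    while len(selected) < m:
        idx = int(np.argmax(min_dist))              # point farthest from the selected set
        selected.append(idx)
        min_dist = np.minimum(min_dist, np.linalg.norm(points - points[idx], axis=1))
    return points[selected]
```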
  • the step S42 specifically includes the following steps:
  • Step S421 input the down-sampled point clouds one by one into the high-dimensional feature extraction layer with shared weights, obtaining the feature matrix F i ∈R N×1024 corresponding to each point cloud P i ;
  • Step S422 the feature matrices F iR and F (i+1)L of adjacent viewing angles are input to the matching point pair generation network pair by pair to obtain the matching point set CP i of the point cloud P i ;
  • Step S423, using the points of all viewing angles and their matching points as input, and using joint registration to find the closed-form solution T of the attitude;
  • Step S424 converting all point clouds to a unified coordinate system through the obtained T, and merging them into a point cloud model P';
  • Step S425, sample m points from P' as generated samples
  • Step S426, adjust the generator network parameters:
  • g ⁇ represents the gradient with respect to ⁇
  • represents the network parameters of the generator
  • f ⁇ represents the discriminator
  • represents the network parameters of the discriminator
  • v (i) represents the ith generated sample
  • represents the step size
  • RMSProp represents a Momentum-based optimization algorithm.
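  • A schematic Python sketch of the generator forward pass of steps S421-S425; `feat_net`, `match_net` and `solve_joint_poses` are hypothetical placeholders for the feature extraction layer, the matching point pair generation network and the closed-form pose solver of step S414, the row-vector homogeneous transform follows p' ij = p ij T i , and the final uniform subsampling is a simplification of the sampling described in step S415.

```python
import torch

def generator_forward(clouds, feat_net, match_net, solve_joint_poses, m):
    """Steps S421-S425: features -> matching points -> joint poses -> fused model P' -> m samples.

    clouds                  : list of (N_i, 3) tensors, one per viewpoint (already down-sampled)
    feat_net(c)             : shared-weight high-dimensional feature extractor (step S421)
    match_net(f1, f2, pts2) : average matching points of cloud 1 inside cloud 2 (step S422)
    solve_joint_poses(...)  : closed-form joint pose solver standing in for step S414
    """
    # S421: a 1024-dimensional feature per point for every cloud.
    feats = [feat_net(c) for c in clouds]

    # S422: matching point set CP_i for every pair of adjacent viewpoints.
    matches = [match_net(feats[i], feats[i + 1], clouds[i + 1]) for i in range(len(clouds) - 1)]

    # S423: closed-form solution T = [T_1, ..., T_I], one transform per viewpoint.
    poses = solve_joint_poses(clouds, matches)

    # S424: p'_ij = p_ij T_i (row-vector convention), then fuse all clouds into P'.
    fused = []
    for c, T in zip(clouds, poses):
        homo = torch.cat([c, torch.ones(c.shape[0], 1)], dim=1)   # homogeneous coordinates
        fused.append((homo @ T)[:, :3])
    fused = torch.cat(fused, dim=0)

    # S425: draw m generated samples from P' (the patent samples P' uniformly / by FPS).
    idx = torch.randperm(fused.shape[0])[:m]
    return fused[idx]
```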
  • the step S51 is specifically:
  • the WGAN network trains a discriminator network f ω with parameters ω whose last layer is not a nonlinear activation layer, and makes L as large as possible under the condition that ω does not exceed a certain range.
  • the expression of L is as follows:
  • L approximates the Wasserstein distance between the real distribution P r and the generated distribution P g , i.e. the Wasserstein distance is used to quantitatively measure the difference between the two distributions; p denotes a sample, E p~P r [·] denotes the expectation under the real distribution P r , and E p~P g [·] denotes the expectation under the generated distribution P g ;
  • the discriminator uses a fully connected multi-layer perceptron with a four-layer fully connected structure with 3 ReLU activation functions; the input is the coordinates of a point, that is, the input dimension is 3 and the output dimension is 1.
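  • A minimal PyTorch sketch of such a discriminator (four fully connected layers, three ReLU activations, 3-dimensional point input, 1-dimensional score output, no activation on the last layer); the hidden widths are assumptions, since the text does not give them.

```python
import torch.nn as nn

class Critic(nn.Module):
    """WGAN discriminator f_w of step S51: 4 fully connected layers, 3 ReLUs, no final activation."""
    def __init__(self, hidden=(64, 128, 64)):
        super().__init__()
        h1, h2, h3 = hidden                     # hidden widths are assumed, not from the patent
        self.net = nn.Sequential(
            nn.Linear(3, h1), nn.ReLU(),        # input: the 3D coordinates of one point
            nn.Linear(h1, h2), nn.ReLU(),
            nn.Linear(h2, h3), nn.ReLU(),
            nn.Linear(h3, 1),                   # output: a single unbounded score
        )

    def forward(self, x):                       # x: (batch, 3)
        return self.net(x).squeeze(-1)
```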
  • the step S52 specifically includes the following steps:
  • Step S521 input, one by one, the generated samples of m points uniformly sampled from the generated point cloud model into the discriminator network f ω ;
  • Step S532 input, one by one, the standard samples of m points uniformly sampled from the standard model into the discriminator network f ω ;
  • Step S533 adjust the parameters of the discriminator network so as to discriminate between the generated samples and the standard samples;
  • the discriminator network parameters are updated as follows: ω ← ω + α·RMSProp(ω, g ω ) and then ω ← clip(ω, -c, c), where g ω denotes the gradient with respect to ω, u (i) denotes the i-th standard sample, f ω denotes the discriminator and ω its network parameters, α denotes the step size, RMSProp denotes a momentum-based optimization algorithm, and clip() truncates the absolute value of the parameter ω so that it does not exceed a fixed constant c.
  • The present invention can achieve the following beneficial effects: (1) it is robust to the initialization of the viewpoint poses; (2) compared with fully supervised neural networks, the neural network involved in the present invention is unsupervised: only the theoretical model of the object to be modelled needs to be known in advance, no large amount of annotation information or samples is required, and training is simple and fast; (3) the network can run in real time, and its generalization ability does not need to be considered; (4) compared with traditional multi-view registration methods, the designed network directly solves the transformation of every viewpoint relative to the same reference coordinate system, so there is neither a bias toward any particular viewpoint nor a cumulative error; (5) the result after training can be used as the initial value of fine registration, and the registration accuracy is high.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An unsupervised multi-view 3D point cloud joint registration method based on WGAN comprises the following steps: step S1, acquiring point clouds from different viewpoints; step S2, down-sampling the point clouds of all viewpoints; step S3, sampling from the standard model; step S4, training the generator network of the multi-view point cloud joint registration WGAN; step S5, training the discriminator network of the multi-view point cloud joint registration WGAN; step S6, deciding whether to terminate training: the numbers of training iterations of the generator and the discriminator are both set to M; if M is reached, training is terminated, otherwise the method returns to step S4. A multi-view point cloud registration network is designed to generate the point cloud model; compared with traditional registration methods it is more robust to initialization, is suitable for running in real time on a production line, and has neither viewpoint bias nor cumulative error.

Description

An unsupervised multi-view 3D point cloud joint registration method based on WGAN
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 6, 2021, with application number 202110165409.9 and the invention title "An unsupervised multi-view 3D point cloud joint registration method based on WGAN", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of machine vision, and in particular to an unsupervised multi-view 3D point cloud joint registration method based on WGAN (Wasserstein Generative Adversarial Networks, a generative adversarial deep learning model).
Background Art
Intelligent manufacturing technology is the driving force for integrating industrialization and informatization in the manufacturing industry. Today, the aviation manufacturing industry is also facing a transformation toward intelligence. As one of the carriers of intelligent manufacturing technology, robots have attracted extensive attention in the field of aviation manufacturing. The aero-engine is the "heart" of the aircraft, and its performance is mainly limited by the manufacturing level of aero-engine blades. Blade 3D dimension measurement technology is of great significance for blade machining and quality inspection. To meet the increasingly complex measurement requirements of blades, it is urgent to develop a 3D measurement robot and realize automatic measurement. With the development of optical measurement technology, a feasible robot measurement scheme is to use a laser scanner mounted on the end of an industrial robot to acquire point clouds and reconstruct a 3D model, from which the 3D dimensional data of the blade are measured.
In this measurement scheme, accurately and completely reconstructing the 3D model of the blade is a necessary prerequisite for precise blade measurement, and registering the 3D point clouds of multiple viewpoints is the main problem to be solved in the reconstruction process. Point cloud registration refers to transforming point clouds in different coordinate systems into a unified coordinate system, and is generally divided into three categories: coarse registration, fine registration and global registration. Coarse registration is generally used for two point clouds whose poses differ greatly; fine registration is used to improve the accuracy of coarsely registered point clouds, and relatively mature methods include the ICP (Iterative Closest Point) registration algorithm and improved algorithms based on ICP. In the reconstruction process, frame-by-frame registration of point cloud data often suffers from serious cumulative errors, which affect the accuracy of the reconstructed model. The global registration algorithm is expected to spread the accumulated error over every frame of data, thereby reducing the overall error. Whether for fine registration or global registration, a good coarse registration result is required as the initialization parameter. Coarse registration depends on the size of the overlapping area of the point clouds, the saliency of the features in the overlapping region, and the symmetry of the model itself.
To satisfy its aerodynamic performance, the aero-engine blade is designed as a smooth, texture-less, hyperboloid thin-walled special-shaped structure. When 3D point clouds are acquired, such a structure leads to insufficient overlapping area between adjacent point clouds and weak texture features, making it difficult to obtain a good coarse registration result; the error of global registration is therefore large, and the 3D model of the blade cannot be reconstructed precisely.
Summary of the Invention
The present invention provides an unsupervised multi-view 3D point cloud joint registration method based on WGAN. On the WGAN framework, the method can train and optimize the pose of every viewpoint, so that the difference between the probability distribution of the optimized overall point cloud and the probability distribution of the prior model point cloud is minimized, which means that the registered point cloud model is as close as possible to the theoretical model.
为了达到上述目的,本发明提供的一种基于WGAN的无监督多视角三维点云联合配准方法,包括如下步骤:
步骤S1、获取不同视角的点云:从不同视角进行扫描,扫描后获得I个点云P={P 1,...,P i,...,P I},
Figure PCTCN2021077770-appb-000001
表示第i个点云;N i表示第i个点云所包含的点的个数,P ij表示第i个点云中的第j个点,p n=R 3,R表示实数,R 3表示笛卡尔三维坐标系;
步骤S2、对所有视角的点云进行下采样:点云P i为分别处理与前后相邻两个视角的关系,存在两个相邻的点云,处理不同相邻点云时,共进行 两次点云下采样,即对于每个点云,分别对前后相邻视角点云下采样:
对于P i-1,对P i和P i-1进行随机采样,采样数量N iL为:
N iL=min{N i-1/s,N i/s}    (1)
对于P i+1,对P i和P i+1进行随机采样,采样数量N iR为:
N iR=min{N i/s,N i+1/s}    (2)
式中,N i-1表示第i-1个点云所包含的点的个数,N i表示第i个点云所包含的点的个数,N i+1表示第i+1个点云所包含的点的个数,s为人为设定的采样参数;
步骤S3、从标准模型中采样;从标准模型点集P s中采样出m个样本,记为标准样本
Figure PCTCN2021077770-appb-000002
步骤S4、对多视角点云联合配准WGAN的生成器网络进行训练:将各个视角点云逐一转换到统一的坐标系下,将所有转换后的点云融合成一个完整的点云模型P',并对P'进行均匀采样,从P'采样m个点作为生成样本
Figure PCTCN2021077770-appb-000003
具体包括如下步骤:
步骤S41、设计生成器;
步骤S42、生成器网络进行训练;
步骤S5、对多视角点云联合配准WGAN的判别器网络进行训练:对生成样本
Figure PCTCN2021077770-appb-000004
与标准样本
Figure PCTCN2021077770-appb-000005
进行判别;具体包括如下步骤:
步骤S51、设计判别器;
步骤S52、判别器网络进行训练;
步骤6:判断是否终止训练:设定生成器和判别器训练的次数均为M 次,若达到M次则终止训练,若未达到M次则回到步骤S4。
优选地,所述步骤S41具体包括如下步骤:
步骤S411、构建特征向量转换网络层,对点云
Figure PCTCN2021077770-appb-000006
表示N i×3矩阵,逐点生成高维特征向量F i∈R N×D,D表示每个点提取的D维的特征向量,R N×D表示N×D矩阵;
步骤S412、构建匹配点计算网络层,逐点计算匹配点:提取相邻点云P i-1与P i+1对应的经过高维特征向量转换的特征矩阵F (i-1)R和F (i+1)L;分别计算P i与P i-1及P i+1的匹配概率,分别得到匹配点对集合
Figure PCTCN2021077770-appb-000007
步骤S413、滤除基于注意力机制的外点:计算上一次迭代得到的第i个姿态的转置与匹配点对C ij之间的相关性度量sim ij,j表示索引;
步骤S414、联合配准求姿态的闭式解T:根据当前匹配点对及其权重计算点云的相对姿态及约束条件,获得点云的相对姿态优化唯一最优解,即最优姿态;
步骤S415、生成点云模型并进行采样:根据最优姿态,将各个视角点云逐一转换到统一的坐标系下,融合成一个完整的点云模型P',并对P'进行均匀采样。
优选地,所述步骤S411具体为:
网络由4个EdgeConv层和一个卷积层Conv构成,用每一个特征
Figure PCTCN2021077770-appb-000008
作为顶点,对每个点计算K-最近邻KNN,连接其K近邻作为边,构建图结构,D in表示输入特征向量的维数,
Figure PCTCN2021077770-appb-000009
表示D in维实数向量;
对于顶点
Figure PCTCN2021077770-appb-000010
其与某个邻近点
Figure PCTCN2021077770-appb-000011
所构成的边为
Figure PCTCN2021077770-appb-000012
将每一条边作为多层感知机MLP的输入,经过ReLU激活函数后输出D out维特征;
将所有边的特征通过最大池化层,得到对应于顶点
Figure PCTCN2021077770-appb-000013
的特征
Figure PCTCN2021077770-appb-000014
表示D out维实数向量;
输入特征矩阵
Figure PCTCN2021077770-appb-000015
表示N×D in维实数矩阵,输出特征矩阵
Figure PCTCN2021077770-appb-000016
表示N×D out维实数矩阵;
其中,第一个EdgeConv层输出的特征维数为64,第二个EdgeConv层输出的特征维数为64,第三个EdgeConv层输出的特征维数为128,第四个EdgeConv层输出的特征维数为256;将四个EdgeConv层提取的特征拼接得到的N×512维特征作为Conv的输入,过ReLU激活函数后输出特征矩阵F i∈R N×1024,R N×1024表示N×1024维实数矩阵。
优选地,所述步骤S412具体为:
P i为分别处理与前后相邻两个视角的关系,进行了两次点云下采样,对应地经过高维特征层提取两个不同的特征矩阵,即
Figure PCTCN2021077770-appb-000017
Figure PCTCN2021077770-appb-000018
表示N iL×1024维实数矩阵,
Figure PCTCN2021077770-appb-000019
表示N iR×1024维实数矩阵;
P i与P i+1的匹配点具体为:输入为
Figure PCTCN2021077770-appb-000020
输出为
Φ iR=F iR+φ(F iR,F (i+1)L),
Figure PCTCN2021077770-appb-000021
Φ (i+1)L=F (i+1)L+φ(F (i+1)L,F iR),
Figure PCTCN2021077770-appb-000022
其中,φ(F iR,F (i+1)L)为Transformer将特征F iR通过学习调整到一个“条件”F (i+1)L的残差变化量,φ(F (i+)L,F iR)为Transformer将特征F (i+1)L通过学习调整到一个“条件”F iR的残差变化量;
对于点p ij∈P i,P i+1的每一个点与p ij成为匹配点的概率所构成矩阵为
Figure PCTCN2021077770-appb-000023
Φ iR(j)表示Φ iR的第j行,即对应于点p ij的特征向量,T表示矩阵转置,softmax是一种概率归一化处理函数;
根据上述匹配点概率,为p ij∈P i生成一个平均匹配点cp ij
Figure PCTCN2021077770-appb-000024
点云P i在P i+1中得到的匹配点集合记为CP i,匹配点对(p ij,cp ij)记作C ij,匹配点对构成集合C iR
P i与P i-1的匹配点均可按照上述过程实现,得到匹配点对集合C iL;C iR与C iL构成匹配点对构成集合C i;每对相邻视角寻找匹配点的过程均可按照上述过程实现。
优选地,所述步骤S413具体为:
计算
Figure PCTCN2021077770-appb-000025
与匹配点对C ij之间的相关性度量sim ij
Figure PCTCN2021077770-appb-000026
其中
Figure PCTCN2021077770-appb-000027
表示上一次迭代得到的第i个姿态的转置,||.|| F表示Frobenius范数,σ是一个正实数,防止sim ij趋向于无穷大;
引入softmax函数对sim ij进行归一化,使所有匹配点对权重之和为1:
Figure PCTCN2021077770-appb-000028
式中,w ij表示匹配点权重,
Figure PCTCN2021077770-appb-000029
表示变量为sim ij的指数函数。
优选地,所述步骤S414具体为:
根据当前匹配点对及其权重计算点云的相对姿态,所有匹配点对欧式距离之和d为:
Figure PCTCN2021077770-appb-000030
其中,
Figure PCTCN2021077770-appb-000031
为第i个视角姿态转换矩阵的转置,R i∈SO(3)为旋转矩阵的转置,t i∈R 1×3为平移量的转置,R l×3表示L×3维实数矩阵;
构造矩阵
Figure PCTCN2021077770-appb-000032
将式(7)表示成
Figure PCTCN2021077770-appb-000033
Figure PCTCN2021077770-appb-000034
T=[T 1,...,T I] T,将式(8)转化成矩阵函数表达式:
Figure PCTCN2021077770-appb-000035
所求得的姿态T=[T 1,...,T I] T需要一个固定的初始坐标系,以保证优化问题仅存在唯一的最优解;
为式(9)添加约束条件T 1=T 0,T 0是任意的满足R 0∈SO(3)的姿态;为了简化网络结构,取T 0为标准3D模型的坐标系;由于T=[T 1,...,T I] T,构造矩阵A=[I 4 0 4×4(I-1)],I 4表示 4×4的单位矩阵,0 4×4(I-1)表示4×4(I-1)的零矩阵;
约束条件1表示成:
T 1=AT=T 0    (10)
同时,旋转矩阵
Figure PCTCN2021077770-appb-000036
约束条件2表示成:
Figure PCTCN2021077770-appb-000037
式中,I表示单位矩阵,det表示行列式;
令b=[I 3 0 3×1],则
R i=bT i    (12)
令R=[R 1...R i...R I],则
R=BT    (13)
其中,
Figure PCTCN2021077770-appb-000038
Figure PCTCN2021077770-appb-000039
将式(9)的等式约束最优问题表示成:
Figure PCTCN2021077770-appb-000040
式中,s.t.表示约束条件;
采用拉格朗日乘子法处理等式约束问题,增广的拉格朗日函数为
Figure PCTCN2021077770-appb-000041
式中,λ表示人为设定的参数,取0.001,μ作为该层神经网络的可调参数,
Figure PCTCN2021077770-appb-000042
取上一次迭代的结果,Y表示拉格朗日乘子;
采用交替乘子法求解上述问题的最优解,得到如下迭代关系
Figure PCTCN2021077770-appb-000043
关于
Figure PCTCN2021077770-appb-000044
的子问题可以用下式求解:
Figure PCTCN2021077770-appb-000045
Figure PCTCN2021077770-appb-000046
SVD表示奇异值分解;
关于T的子问题是一个二次凸优化问题,令其导数为0求其最小值,即
Figure PCTCN2021077770-appb-000047
则有
Figure PCTCN2021077770-appb-000048
优选地,所述步骤S415具体为:
根据上个步骤求得的姿态T,将各个视角点云逐一转换到统一的坐标系下:
p' ij=p ijT i    (21)
将所有转换后的点云融合成一个完整的点云模型P';
对P'进行均匀采样:记采样点集为S 2,S 2初始化为空集;随机采样一个种子点seed,放入S 2;在集合P'-S 2里,找一个距离集合S 2最远的点;最终从P'中采样m个点作为样本
Figure PCTCN2021077770-appb-000049
所述步骤S3具体包括入下步骤:
步骤S31、记标准模型点集为P s,采样点集为S 1,S 1初始化为空集;
步骤S32、随机采样一个种子点seed,放入S 1
步骤S33、在集合P s-S 1里,找一个距离集合S 1最远的点,其中点到集合S 1的距离为该点到S 1最小的点距;
步骤S34、重复步骤S33,直到采样出m个样本,记为标准样本
Figure PCTCN2021077770-appb-000050
优选地,所述步骤S42具体包括如下步骤:
步骤S421、逐一将下采样的点云
Figure PCTCN2021077770-appb-000051
输入到共享权值的高维特征提取层,得到对应点云P i的特征矩阵F i∈R N×1024
步骤S422、将相邻视角的特征矩阵F iR和F (i+1)L逐对输入到匹配点对生成网络,得到点云P i的匹配点集CP i
步骤S423、将所有视角的点及其匹配点作为输入,利用联合配准求姿态的闭式解T;
步骤S424、将所有点云通过求得的T转换到统一坐标系下,融合成点云模型P';
步骤S425、从P'采样m个点作为生成样本
Figure PCTCN2021077770-appb-000052
步骤S426、调节生成器网络参数:
Figure PCTCN2021077770-appb-000053
θ←θ-α·RMSProp(θ,g θ)    (23)
g θ表示关于θ的梯度,θ表示生成器的网络参数,f ω表示判别器,ω表示判别器的网络参数,v (i)表示第i个生成样本,α表示步长,RMSProp表示一种基于动量的优化算法。
优选地,所述步骤S51具体为:
WGAN网络通过训练含参数ω、最后一层不是非线性激活层的判别器网络f ω,在ω不超过某个范围的条件下,使得L尽可能最大,L表达式如下:
Figure PCTCN2021077770-appb-000054
式中,L近似真实分布P r和生成分布P g之间的Wasserstein距离,即用Wasserstein距离定量的衡量两个分布的差异度,p表示样本,
Figure PCTCN2021077770-appb-000055
表示真实分布P r的期望,
Figure PCTCN2021077770-appb-000056
表示生成分布P g
判别器采用全连接实现的多层感知机,结构为四层全连接,伴有3个ReLU激活函数;输入为点的坐标,即输入维度为3,输出维度为1。
优选地,所述步骤S52具体包括如下步骤:
步骤S521、逐一将从生成点云模型均匀采样的m个点的生成样本
Figure PCTCN2021077770-appb-000057
输入到判别器网络f ω中;
步骤S532、逐一将从标准模型均匀采样的m个点的标准样本
Figure PCTCN2021077770-appb-000058
输入到判别器网络f ω中;
步骤S533、调节判别器网络参数,对生成样本
Figure PCTCN2021077770-appb-000059
与标准样本
Figure PCTCN2021077770-appb-000060
进行判别;判别器网络参数具体为:
Figure PCTCN2021077770-appb-000061
ω←ω+α·RMSProp(ω,g ω)    (26)
ω←clip(ω,-c,c)    (27)
g ω表示关于ω的梯度,u (i)表示第i个标准样本,f ω表示判别器,ω表示判别器的网络参数,RMSProp表示一种基于动量的优化算法,clip()表示参数ω的绝对值截断到不超过一个固定的常数c。
本发明能够取得下列有益效果:
(1)对视角姿态的初始化鲁棒;(2)相比于全监督神经网络,本发明所涉及的神经网络为无监督神经网络,只需要预先知道建模对象的理论模型即可,不需要大量的标注信息和大量样本,训练简单快速;(3)无需考虑网络的泛化能力,可实时运行;(4)相比于传统的多视角配准方法,所设计的网络直接求每一个视角相对于同一参考坐标系的转换关系,既不存在对某个视角的偏置,也不存在累计误差;(5)训练后的结果可作为精配准的初始值,配准精度高。
附图说明
图1为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的算法实现流程图;
图2为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法中的一较佳实施例的联合配准的WGAN总体网络结构示意图;
图3为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的一较佳实施例中WGAN的生成器网络结构的示意图;
图4为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的一较佳实施例中生成器所涉及的高维特征提取层网络结构的示意图;
图5(a)为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的一较佳实施例中高维特征提取层所涉及的EdgeConv层的示意图;
图5(b)为图5(a)中通过K-邻近构造的图的示意图;
图6为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的一较佳实施例中生成器所涉及的匹配点生成层的transformer网络结构的示意图;
图7(a)为本发明的一种基于WGAN的无监督多视角三维点云联合配准方法的一较佳实施例中transformer网络所涉及的attention的示意图;
图7(b)为图7(a)的transformer网络所涉及的Multi-head attention子层的示意图。
具体实施方式
为使本发明要解决的技术问题、技术方案和优点更加清楚,下面将结合附图及具体实施例进行详细描述。
发动机叶片是以理论设计模型为参考加工的,所以加工成型的叶片应尽可能的符合设计模型,理论上配准后点云的整体概率分布也应该尽可能的接近理论模型点云的概率分布。
本发明针对现有的问题,提供了一种基于WGAN的无监督多视角三维点云联合配准方法,如图1及图2所示,本发明的一种基于WGAN的无监督多视角三维点云联合配准方法包括如下步骤:
步骤S1、获取不同视角的点云:从不同视角进行扫描,扫描后获得I个点云P={P 1,...,P i,...,P I},
Figure PCTCN2021077770-appb-000062
表示第i个点云;N i表示第i个点云所包含的点的个数,P ij表示第i个点云中的第j个点,p n=R 3,R表示实数,R 3表示笛卡尔三维坐标系;
步骤S2、对所有视角的点云进行下采样:点云P i为分别处理与前后相邻两个视角的关系,存在两个相邻的点云,处理不同相邻点云时,共进行两次点云下采样,即对于每个点云,分别对前后相邻视角点云下采样:
对于P i-1,对P i和P i-1进行随机采样,采样数量N iL为:
N iL=min{N i-1/s,N i/s}    (1)
对于P i+1,对P i和P i+1进行随机采样,采样数量N iR为:
N iR=min{N i/s,N i+1/s}    (2)
式中,N i-1表示第i-1个点云所包含的点的个数,N i表示第i个点云所包含的点的个数,N i+1表示第i+1个点云所包含的点的个数,s为人为设定的采样参数;
步骤S3、从标准模型中采样;从标准模型点集P s中采样出m个样本,记为标准样本
Figure PCTCN2021077770-appb-000063
步骤S4、对多视角点云联合配准WGAN的生成器网络进行训练:将各个视角点云逐一转换到统一的坐标系下,将所有转换后的点云融合成一个完整的点云模型P',并对P'进行均匀采样,从P'采样m个点作为生成样本
Figure PCTCN2021077770-appb-000064
具体包括如下步骤:
步骤S41、设计生成器;
步骤S42、生成器网络进行训练;
步骤S5、对多视角点云联合配准WGAN的判别器网络进行训练:对生成样本
Figure PCTCN2021077770-appb-000065
与标准样本
Figure PCTCN2021077770-appb-000066
进行判别;具体包括如下步骤:
步骤S51、设计判别器;
步骤S52、判别器网络进行训练;
步骤6:判断是否终止训练:设定生成器和判别器训练的次数均为M次,若达到M次则终止训练,若未达到M次则回到步骤S4。
参考图3中WGAN的生成器网络结构的示意图,其中,所述步骤S41具体包括如下步骤:
步骤S411、构建特征向量转换网络层,对点云
Figure PCTCN2021077770-appb-000067
表示N i×3矩阵,逐点生成高维特征向量F i∈R N×D,D表示每个点提取的D维的特征向量,R N×D表示N×D矩阵;
步骤S412、构建匹配点计算网络层,逐点计算匹配点:提取相邻点云P i-1与P i+1对应的经过高维特征向量转换的特征矩阵F (i-1)R和F (i+1)L;分别计算P i与P i-1及P i+1的匹配概率,分别得到匹配点对集合
Figure PCTCN2021077770-appb-000068
步骤S413、滤除基于注意力机制的外点:计算上一次迭代得到的第i个姿态的转置与匹配点对C ij之间的相关性度量sim ij,j表示索引;
步骤S414、联合配准求姿态的闭式解T:根据当前匹配点对及其权重计算点云的相对姿态及约束条件,获得点云的相对姿态优化唯一最优解,即最优姿态;
步骤S415、生成点云模型并进行采样:根据最优姿态,将各个视角点云逐一转换到统一的坐标系下,融合成一个完整的点云模型P',并对P'进行均匀采样。
参考图4、图5(a)及图5(b),所述步骤S411具体为:
网络由4个EdgeConv(一种边卷积操作)层和一个卷积层Conv(向量卷积运算)构成,用每一个特征
Figure PCTCN2021077770-appb-000069
作为顶点,对每个点计算K-最近邻KNN,连接其K近邻作为边,构建图结构,D in表示输入特征向量的维数,
Figure PCTCN2021077770-appb-000070
表示D in维实数向量;
对于顶点
Figure PCTCN2021077770-appb-000071
其与某个邻近点
Figure PCTCN2021077770-appb-000072
所构成的边为
Figure PCTCN2021077770-appb-000073
将每一条边作为多层感知机MLP(MultilayerPerceptron)的输入,经过ReLU(线性整流函数,Rectified Linear Unit)激活函数后输出D out维特征;
将所有边的特征通过最大池化层,得到对应于顶点
Figure PCTCN2021077770-appb-000074
的特征
Figure PCTCN2021077770-appb-000075
表示D out维实数向量;
输入特征矩阵
Figure PCTCN2021077770-appb-000076
表示N×D in维实数矩阵,输出特征矩阵
Figure PCTCN2021077770-appb-000077
表示N×D out维实数矩阵;
其中,第一个EdgeConv层输出的特征维数为64,第二个EdgeConv层输出的特征维数为64,第三个EdgeConv层输出的特征维数为128,第四个EdgeConv层输出的特征维数为256;将四个EdgeConv层提取的特征拼接得到的N×512维特征作为Conv的输入,过ReLU激活函数后输出特征矩阵F i∈R N×1024,R N×1024表示N×1024维实数矩阵。
所述步骤S412具体为:
P i为分别处理与前后相邻两个视角的关系,进行了两次点云下采样,对应地经过高维特征层提取两个不同的特征矩阵,即
Figure PCTCN2021077770-appb-000078
Figure PCTCN2021077770-appb-000079
表示N iL×1024维实数矩阵,
Figure PCTCN2021077770-appb-000080
表示N iR×1024维实数矩阵;
P i与P i+1的匹配点具体为:输入为
Figure PCTCN2021077770-appb-000081
输出为
Φ iR=F iR+φ(F iR,F (i+1)L),
Figure PCTCN2021077770-appb-000082
Φ (i+1)L=F (i+1)L+φ(F (i+1)L,F iR),
Figure PCTCN2021077770-appb-000083
其中,φ(F iR,F (i+1)L)为Transformer将特征F iR通过学习调整到一个“条件”F (i+1)L的残差变化量,φ(F (i+)L,F iR)为Transformer将特征F (i+1)L通过学习调整到一个“条件”F iR的残差变化量;
参考图6、图7(a)及7(b),Transformer为基于encoder-decoder(编码器-解码器)结构的模型:
Encoder(编码器)包括6个编码器,6个编码器依次叠加,每个编码器包含一个Multi-headattention(多头注意力)子层和一个feed-forward(前馈)子层,每个子层之间有残差连接;每个编码器输出矩阵作为下一个编码器的输入;第一个编码器的输入为F iR,最后一个编码器的输出为
Figure PCTCN2021077770-appb-000084
的编码矩阵;Multi-headattention子层,将8次self-attention(自注意力)计算得到的矩阵进行加权求和;
Decoder(解码器)包括6个解码器,6个解码器依次叠加,每个解码器包含两个Multi-headattention子层和一个feed-forward子层,每个子层之间有残差连接;每个解码器输出矩阵作为下一个解码器的输入;第一个解码器的输入为F (i+1)L,最后一个编码器的输出为
Figure PCTCN2021077770-appb-000085
的解码矩阵;第一个Multi-headattention将8次self-attention计算得到的矩阵进行加权求和,第二个Multi-headattention将8次encoder-decoder-attention(编码器-解码器注意力)计算得到的矩阵进行加权求和;encoder-decoder-attention用第一个子层的输出创建Queries矩阵(查询矩阵),用encoder的输出创建Keys(关键字)和Values(值)矩阵;
对于点p ij∈P i,P i+1的每一个点与p ij成为匹配点的概率所构成矩阵为
Figure PCTCN2021077770-appb-000086
Φ iR(j)表示Φ iR的第j行,即对应于点p ij的特征向量;T表示矩阵转置,softmax是一种概率归一化处理函数;
根据上述匹配点概率,为p ij∈P i生成一个平均匹配点cp ij
Figure PCTCN2021077770-appb-000087
点云P i在P i+1中得到的匹配点集合记为CP i,匹配点对(p ij,cp ij)记作C ij,匹配点对构成集合C iR
P i与P i-1的匹配点均可按照上述过程实现,得到匹配点对集合C iL;C iR与C iL构成匹配点对构成集合C i;每对相邻视角寻找匹配点的过程均可按照上述过程实现。
所述步骤S413具体为:
计算
Figure PCTCN2021077770-appb-000088
与匹配点对C ij之间的相关性度量sim ij
Figure PCTCN2021077770-appb-000089
其中
Figure PCTCN2021077770-appb-000090
表示上一次迭代得到的第i个姿态的转置,||.|| F表示Frobenius(一种矩阵范数)范数,σ是一个正实数,防止sim ij趋向于无穷大;
引入softmax函数对sim ij进行归一化,使所有匹配点对权重之和为1:
Figure PCTCN2021077770-appb-000091
式中,w ij表示匹配点权重,
Figure PCTCN2021077770-appb-000092
表示变量为sim ij的指数函数。
所述步骤S414具体为:
根据当前匹配点对及其权重计算点云的相对姿态,所有匹配点对欧式距离之和d为:
Figure PCTCN2021077770-appb-000093
其中,
Figure PCTCN2021077770-appb-000094
为第i个视角姿态转换矩阵的转置,R i∈SO(3)为旋转矩阵的转置,t i∈R 1×3为平移量的转置,R l×3表示L×3维实数矩阵;
构造矩阵
Figure PCTCN2021077770-appb-000095
将式(7)表示成
Figure PCTCN2021077770-appb-000096
Figure PCTCN2021077770-appb-000097
T=[T 1,...,T I] T,将式(8)转化成矩阵函数表达式:
Figure PCTCN2021077770-appb-000098
所求得的姿态T=[T 1,...,T I] T需要一个固定的初始坐标系,以保证优化问 题仅存在唯一的最优解;
为式(9)添加约束条件T 1=T 0,T 0是任意的满足R 0∈SO(3)的姿态;为了简化网络结构,取T 0为标准3D模型的坐标系;由于T=[T 1,...,T I] T,构造矩阵A=[I 4 0 4×4(I-1)],I 4表示4×4的单位矩阵,0 4×4(I-1)表示4×4(I-1)的零矩阵;
约束条件1表示成:
T 1=AT=T 0    (10)
同时,旋转矩阵
Figure PCTCN2021077770-appb-000099
约束条件2表示成:
Figure PCTCN2021077770-appb-000100
式中,I表示单位矩阵,det表示行列式;
令b=[I 3 0 3×1],则
R i=bT i,    (12)
令R=[R 1...R i...R I],则
R=BT,    (13)
其中,
Figure PCTCN2021077770-appb-000101
Figure PCTCN2021077770-appb-000102
将式(9)的等式约束最优问题表示成:
Figure PCTCN2021077770-appb-000103
式中,s.t.表示约束条件;
采用拉格朗日乘子法处理等式约束问题,增广的拉格朗日函数为
Figure PCTCN2021077770-appb-000104
式中,λ表示人为设定的参数,取0.001,μ作为该层神经网络的可调参数,
Figure PCTCN2021077770-appb-000105
取上一次迭代的结果,Y表示拉格朗日乘子;
采用交替乘子法求解上述问题的最优解,得到如下迭代关系
Figure PCTCN2021077770-appb-000106
关于
Figure PCTCN2021077770-appb-000107
的子问题可以用下式求解:
Figure PCTCN2021077770-appb-000108
Figure PCTCN2021077770-appb-000109
SVD(SingularValue Decomposition,奇异值分解)表示奇异值分解;
关于T的子问题是一个二次凸优化问题,令其导数为0求其最小值,即
Figure PCTCN2021077770-appb-000110
则有
Figure PCTCN2021077770-appb-000111
上式中λ是人为设定的参数(取0.001),μ作为该层神经网络的可调参数,
Figure PCTCN2021077770-appb-000112
取上一次迭代的结果。
所述步骤S415具体为:
根据上个步骤求得的姿态T,将各个视角点云逐一转换到统一的坐标系下:
p' ij=p ijT i    (21)
将所有转换后的点云融合成一个完整的点云模型P';
对P'进行均匀采样:记采样点集为S 2,S 2初始化为空集;随机采样一个种子点seed(种子点),放入S 2;在集合P'-S 2里,找一个距离集合S 2最远的点;最终从P'中采样m个点作为样本
Figure PCTCN2021077770-appb-000113
所述步骤S416具体为:
逐一将下采样的点云
Figure PCTCN2021077770-appb-000114
输入到共享权值的高维特征提取层,得到对应点云P i的特征矩阵F i∈R N×1024;将相邻视角的特征矩阵F iR和F (i+1)L逐对输入到匹配点对生成网络,得到点云P i的匹配点集CP i;将所有视角的点及其匹配点作为输入,利用联合配准求姿态的闭式解T。将所有点云通过求得的T转换到统一坐标系下,融合成点云模型P';从P'采样m个点作为生成的样本
Figure PCTCN2021077770-appb-000115
设p' ij∈P'服从概率分布P g。保持判别器f ω的网络参数不变,构造生成器的loss为:
Figure PCTCN2021077770-appb-000116
所述步骤S3具体包括入下步骤:
步骤S31、记标准模型点集为P s,采样点集为S 1,S 1初始化为空集;
步骤S32、随机采样一个种子点seed,放入S 1
步骤S33、在集合P s-S 1里,找一个距离集合S 1最远的点,其中点到集合S 1的距离为该点到S 1最小的点距;
步骤S34、重复步骤S33,直到采样出m个样本,记为标准样本
Figure PCTCN2021077770-appb-000117
优选地,所述步骤S42具体包括如下步骤:
步骤S421、逐一将下采样的点云
Figure PCTCN2021077770-appb-000118
输入到共享权值的高维特征提取层,得到对应点云P i的特征矩阵F i∈R N×1024
步骤S422、将相邻视角的特征矩阵F iR和F (i+1)L逐对输入到匹配点对生成网络,得到点云P i的匹配点集CP i
步骤S423、将所有视角的点及其匹配点作为输入,利用联合配准求姿态的闭式解T;
步骤S424、将所有点云通过求得的T转换到统一坐标系下,融合成点云模型P';
步骤S425、从P'采样m个点作为生成样本
Figure PCTCN2021077770-appb-000119
步骤S426、调节生成器网络参数:
Figure PCTCN2021077770-appb-000120
θ←θ-α·RMSProp(θ,g θ)    (24)
g θ表示关于θ的梯度,θ表示生成器的网络参数,f ω表示判别器,ω表示判别器的网络参数,v (i)表示第i个生成样本,α表示步长,RMSProp表 示一种基于动量的优化算法。
所述步骤S51具体为:
WGAN网络通过训练含参数ω、最后一层不是非线性激活层的判别器网络f ω,在ω不超过某个范围的条件下,使得L尽可能最大,L表达式如下:
Figure PCTCN2021077770-appb-000121
式中,L近似真实分布P r和生成分布P g之间的Wasserstein距离,即用Wasserstein距离定量的衡量两个分布的差异度,p表示样本,
Figure PCTCN2021077770-appb-000122
表示真实分布P r的期望,
Figure PCTCN2021077770-appb-000123
表示生成分布P g
判别器采用全连接实现的多层感知机,结构为四层全连接,伴有3个ReLU激活函数;输入为点的坐标,即输入维度为3,输出维度为1。
所述步骤S52具体包括如下步骤:
步骤S521、逐一将从生成点云模型均匀采样的m个点的生成样本
Figure PCTCN2021077770-appb-000124
输入到判别器网络f ω中;
步骤S532、逐一将从标准模型均匀采样的m个点的标准样本
Figure PCTCN2021077770-appb-000125
输入到判别器网络f ω中;
步骤S533、调节判别器网络参数,对生成样本
Figure PCTCN2021077770-appb-000126
与标准样本
Figure PCTCN2021077770-appb-000127
进行判别;判别器网络参数具体为:
Figure PCTCN2021077770-appb-000128
ω←ω+α·RMSProp(ω,g ω)    (26)
ω←clip(ω,-c,c)    (27)
g ω表示关于ω的梯度,u (i)表示第i个标准样本,f ω表示判别器,ω表示判别器的网络参数,RMSProp表示一种基于动量的优化算法,clip()表示参数ω的绝对值截断到不超过一个固定的常数c。
本发明能够取得下列有益效果:
(1)对视角姿态的初始化鲁棒;(2)相比于全监督神经网络,本发明所涉及的神经网络为无监督神经网络,只需要预先知道建模对象的理论模型即可,不需要大量的标注信息和大量样本,训练简单快速;(3)无需考虑网络的泛化能力,可实时运行;(4)相比于传统的多视角配准方法,所设计的网络直接求每一个视角相对于同一参考坐标系的转换关系,既不存在对某个视角的偏置,也不存在累计误差;(5)训练后的结果可作为精配准的初始值,配准精度高。
以上所述是本发明的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明所述原理的前提下,还可以作出若干改进和润饰,这些改进和润饰也应视为本发明的保护范围。

Claims (10)

  1. 一种基于WGAN的无监督多视角三维点云联合配准方法,其特征在于,包括如下步骤:
    步骤S1、获取不同视角的点云:从不同视角进行扫描,扫描后获得I个点云P={P 1,...,P i,...,P I},
    Figure PCTCN2021077770-appb-100001
    表示第i个点云;N i表示第i个点云所包含的点的个数,P ij表示第i个点云中的第j个点,p n=R 3,R表示实数,R 3表示笛卡尔三维坐标系;
    步骤S2、对所有视角的点云进行下采样:点云P i为分别处理与前后相邻两个视角的关系,存在两个相邻的点云,处理不同相邻点云时,共进行两次点云下采样,即对于每个点云,分别对前后相邻视角点云下采样:
    对于P i-1,对P i和P i-1进行随机采样,采样数量N iL为:
    N iL=min{N i-1/s,N i/s}  (1)
    对于P i+1,对P i和P i+1进行随机采样,采样数量N iR为:
    N iR=min{N i/s,N i+1/s}  (2)
    式中,N i-1表示第i-1个点云所包含的点的个数,N i表示第i个点云所包含的点的个数,N i+1表示第i+1个点云所包含的点的个数,s为人为设定的采样参数;
    步骤S3、从标准模型中采样;从标准模型点集P s中采样出m个样本,记为标准样本
    Figure PCTCN2021077770-appb-100002
    步骤S4、对多视角点云联合配准WGAN的生成器网络进行训练:将各个视角点云逐一转换到统一的坐标系下,将所有转换后的点云融合成一 个完整的点云模型P',并对P'进行均匀采样,从P'采样m个点作为生成样本
    Figure PCTCN2021077770-appb-100003
    具体包括如下步骤:
    步骤S41、设计生成器;
    步骤S42、生成器网络进行训练;
    步骤S5、对多视角点云联合配准WGAN的判别器网络进行训练:对生成样本
    Figure PCTCN2021077770-appb-100004
    与标准样本
    Figure PCTCN2021077770-appb-100005
    进行判别;具体包括如下步骤:
    步骤S51、设计判别器;
    步骤S52、判别器网络进行训练;
    步骤6:判断是否终止训练:设定生成器和判别器训练的次数均为M次,若达到M次则终止训练,若未达到M次则回到步骤S4。
  2. 根据权利要求1所述的一种基于WGAN的无监督多视角三维点云联合配准方法,其特征在于,所述步骤S41具体包括如下步骤:
    步骤S411、构建特征向量转换网络层,对点云
    Figure PCTCN2021077770-appb-100006
    表示N i×3矩阵,逐点生成高维特征向量F i∈R N×D,D表示每个点提取的D维的特征向量,R N×D表示N×D矩阵;
    步骤S412、构建匹配点计算网络层,逐点计算匹配点:提取相邻点云P i-1与P i+1对应的经过高维特征向量转换的特征矩阵F (i-1)R和F (i+1)L;分别计算P i与P i-1及P i+1的匹配概率,分别得到匹配点对集合
    Figure PCTCN2021077770-appb-100007
    步骤S413、滤除基于注意力机制的外点:计算上一次迭代得到的第i个姿态的转置与匹配点对C ij之间的相关性度量sim ij,j表示索引;
    步骤S414、联合配准求姿态的闭式解T:根据当前匹配点对及其权重 计算点云的相对姿态及约束条件,获得点云的相对姿态优化唯一最优解,即最优姿态;
    步骤S415、生成点云模型并进行采样:根据最优姿态,将各个视角点云逐一转换到统一的坐标系下,融合成一个完整的点云模型P',并对P'进行均匀采样。
  3. 根据权利要求2所述的一种基于WGAN的无监督多视角三维点云联合配准方法,其特征在于,所述步骤S411具体为:
    网络由4个EdgeConv层和一个卷积层Conv构成,用每一个特征
    Figure PCTCN2021077770-appb-100008
    作为顶点,对每个点计算K-最近邻KNN,连接其K近邻作为边,构建图结构,D in表示输入特征向量的维数,
    Figure PCTCN2021077770-appb-100009
    表示D in维实数向量;
    对于顶点
    Figure PCTCN2021077770-appb-100010
    其与某个邻近点
    Figure PCTCN2021077770-appb-100011
    所构成的边为
    Figure PCTCN2021077770-appb-100012
    将每一条边作为多层感知机MLP的输入,经过ReLU激活函数后输出D out维特征;
    将所有边的特征通过最大池化层,得到对应于顶点
    Figure PCTCN2021077770-appb-100013
    的特征
    Figure PCTCN2021077770-appb-100014
    表示D out维实数向量;
    输入特征矩阵
    Figure PCTCN2021077770-appb-100015
    表示N×D in维实数矩阵,输出特征矩阵
    Figure PCTCN2021077770-appb-100016
    表示N×D out维实数矩阵;
    其中,第一个EdgeConv层输出的特征维数为64,第二个EdgeConv层输出的特征维数为64,第三个EdgeConv层输出的特征维数为128,第四个EdgeConv层输出的特征维数为256;将四个EdgeConv层提取的特征 拼接得到的N×512维特征作为Conv的输入,过ReLU激活函数后输出特征矩阵F i∈RN×1024,R N×1024表示N×1024维实数矩阵。
  4. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 3, characterized in that step S412 is specifically as follows:
    since P_i handles its relations with the preceding and the following neighboring views separately, it was downsampled twice, and correspondingly two different feature matrices are extracted by the high-dimensional feature layer, namely F_iL ∈ R^(N_iL×1024) and F_iR ∈ R^(N_iR×1024), where R^(N_iL×1024) denotes an N_iL×1024 real matrix and R^(N_iR×1024) an N_iR×1024 real matrix;
    the matching points of P_i and P_(i+1) are computed as follows: the input is F_iR and F_(i+1)L, and the output is
    Φ_iR = F_iR + φ(F_iR, F_(i+1)L)
    Φ_(i+1)L = F_(i+1)L + φ(F_(i+1)L, F_iR)
    where φ(F_iR, F_(i+1)L) is the residual change by which the Transformer adjusts the feature F_iR, through learning, toward a "condition" F_(i+1)L, and φ(F_(i+1)L, F_iR) is the residual change by which the Transformer adjusts the feature F_(i+1)L, through learning, toward a "condition" F_iR;
    for a point p_ij ∈ P_i, the matrix formed by the probabilities that each point of P_(i+1) becomes a matching point of p_ij is softmax(Φ_(i+1)L · Φ_iR(j)^T), where Φ_iR(j) denotes the j-th row of Φ_iR, i.e. the feature vector corresponding to the point p_ij, T denotes matrix transposition, and softmax is a probability normalization function;
    according to the above matching probabilities, an average matching point cp_ij is generated for p_ij ∈ P_i as the matching-probability-weighted average of the points of P_(i+1);
    the set of matching points obtained in P_(i+1) for the point cloud P_i is denoted CP_i, the matching point pair (p_ij, cp_ij) is denoted C_ij, and the matching point pairs form the set C_iR;
    the matching points of P_i and P_(i-1) can all be obtained following the above procedure, yielding the matching point pair set C_iL; C_iR and C_iL form the matching point pair set C_i; the search for matching points of every pair of neighboring views can be carried out following the above procedure.
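A minimal sketch of the matching-probability and average-matching-point computation of step S412, assuming the Transformer-conditioned feature matrices Φ_iR and Φ_(i+1)L have already been computed; the function name is hypothetical.

```python
import torch

def average_matching_points(Phi_iR, Phi_next_L, P_next):
    """Phi_iR: (Ni, 1024) conditioned features of P_i, Phi_next_L: (Nn, 1024) of P_{i+1},
    P_next: (Nn, 3) points of P_{i+1}. Returns (Ni, 3) average matching points cp_ij."""
    prob = torch.softmax(Phi_iR @ Phi_next_L.T, dim=1)   # row j: matching probabilities of p_ij
    return prob @ P_next                                  # probability-weighted average of P_{i+1}
```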
  5. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 4, characterized in that step S413 is specifically as follows:
    the correlation measure sim_ij between the transpose of the i-th pose obtained in the previous iteration and the matching point pair C_ij is computed, where ||·||_F denotes the Frobenius norm and σ is a positive real number that prevents sim_ij from tending to infinity;
    a softmax function is introduced to normalize sim_ij so that the weights of all matching point pairs sum to 1:
    w_ij = exp(sim_ij) / Σ_j exp(sim_ij)
    where w_ij denotes the matching point weight and exp(sim_ij) denotes the exponential function with argument sim_ij.
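A sketch of the attention-based weighting of step S413. Because the exact correlation formula appears only as an image in the source, the similarity below (the inverse norm of the pose-alignment residual, damped by σ) is an illustrative stand-in; only the softmax normalization so that the weights sum to 1 follows the claim directly.

```python
import torch

def match_weights(P_i, CP_i, R_prev, t_prev, sigma=1e-2):
    """Attention-style weights for matching pairs (p_ij, cp_ij).
    P_i, CP_i: (Ni, 3) points and their average matching points.
    R_prev: (3, 3) rotation applied on the right of row-vector points (the patent's row
    convention), t_prev: (3,) translation, both from the previous iteration.
    The inverse-residual similarity is a stand-in for the patent's correlation measure."""
    residual = (P_i @ R_prev + t_prev) - CP_i          # (Ni, 3) alignment residual per pair
    sim = 1.0 / (residual.norm(dim=1) + sigma)         # large when the pair agrees with the pose
    return torch.softmax(sim, dim=0)                   # weights sum to 1
```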
  6. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 2, characterized in that step S414 is specifically as follows:
    the relative poses of the point clouds are computed from the current matching point pairs and their weights; the sum d of the Euclidean distances of all matching point pairs is given by formula (7), in which T_i is the transpose of the pose transformation matrix of the i-th view, R_i ∈ SO(3) is the transpose of the rotation matrix, t_i ∈ R^(1×3) is the transpose of the translation, and R^(1×3) denotes a 1×3 real matrix;
    a matrix is constructed so that formula (7) can be written in the form of formula (8); with T = [T_1, ..., T_I]^T, formula (8) is converted into the matrix-function expression of formula (9);
    the poses T = [T_1, ..., T_I]^T to be solved require a fixed initial coordinate frame, so that the optimization problem has only one unique optimal solution;
    the constraint T_1 = T_0 is added to formula (9), T_0 being an arbitrary pose satisfying R_0 ∈ SO(3); to simplify the network structure, T_0 is taken as the coordinate frame of the standard 3D model; since T = [T_1, ..., T_I]^T, the matrix A = [I_4  0_(4×4(I-1))] is constructed, I_4 denoting the 4×4 identity matrix and 0_(4×4(I-1)) the 4×4(I-1) zero matrix;
    constraint 1 is expressed as
    T_1 = AT = T_0  (10)
    meanwhile, the rotation matrices R_i ∈ SO(3), so constraint 2 is expressed as
    R_i^T R_i = I, det(R_i) = 1  (11)
    where I denotes the identity matrix and det the determinant;
    letting b = [I_3  0_(3×1)], then
    R_i = bT_i  (12)
    letting R = [R_1 ... R_i ... R_I], then
    R = BT  (13)
    where B is the block-diagonal matrix built from I copies of b;
    the equality-constrained optimization problem of formula (9) is expressed as: minimize the objective of formula (9) subject to the rotation constraints of formula (11) and to
    AT = T_0.  (14)
    where s.t. denotes the constraint conditions;
    the Lagrange-multiplier method is adopted to handle the equality-constrained problem, giving the augmented Lagrangian function of formula (15), in which λ denotes a manually set parameter, taken as 0.001, μ serves as a tunable parameter of this neural network layer, the auxiliary rotation variable takes the result of the previous iteration, and Y denotes the Lagrange multiplier;
    the alternating direction method of multipliers is used to solve for the optimal solution of the above problem, giving the corresponding iterative relations;
    the subproblem with respect to the auxiliary rotation variable can be solved in closed form by means of SVD, SVD denoting singular value decomposition;
    the subproblem with respect to T is a quadratic convex optimization problem, whose minimum is found by setting its derivative to 0, formula (19), which then yields the closed-form solution of formula (20).
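The rotation subproblem in step S414 is closed with an SVD. As an illustration of that ingredient only, the NumPy sketch below projects an arbitrary 3×3 matrix onto SO(3) using a standard Procrustes-style projection; it is not the claim's full alternating scheme, which also alternates with the quadratic subproblem in T and the multiplier update.

```python
import numpy as np

def project_to_so3(M):
    """Project a 3x3 matrix onto SO(3) via SVD, enforcing det(R) = +1."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # flip the last axis if needed
    return U @ D @ Vt
```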
  7. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 6, characterized in that step S415 is specifically as follows:
    according to the poses T obtained in the previous step, the point cloud of every view is transformed one by one into a unified coordinate frame:
    p'_ij = p_ij T_i  (21)
    all transformed point clouds are fused into one complete point cloud model P';
    P' is uniformly sampled: the sample point set is denoted S_2 and initialized as the empty set; a seed point seed is randomly sampled and put into S_2; in the set P'−S_2, the point farthest from the set S_2 is found; finally m points are sampled from P' as the samples {v^(1), ..., v^(m)};
    step S3 specifically comprises the following steps:
    Step S31, the standard model point set is denoted P_s, the sample point set is denoted S_1, and S_1 is initialized as the empty set;
    Step S32, a seed point seed is randomly sampled and put into S_1;
    Step S33, in the set P_s−S_1, the point farthest from the set S_1 is found, the distance from a point to the set S_1 being the minimum distance from that point to the points of S_1;
    Step S34, step S33 is repeated until m samples have been drawn, recorded as the standard samples {u^(1), ..., u^(m)}.
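A minimal NumPy sketch of the farthest-point (uniform) sampling used in steps S31-S34 and S415; the helper name and the seed handling are illustrative assumptions.

```python
import numpy as np

def farthest_point_sampling(points, m, seed=0):
    """Start from a random seed point, then repeatedly add the point whose distance to the
    already-selected set (the minimum distance to any selected point) is largest."""
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(points)))]                 # random seed point
    dist = np.linalg.norm(points - points[selected[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(dist))                              # farthest from the selected set
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[selected]
```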
  8. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 7, characterized in that step S42 specifically comprises the following steps:
    Step S421, the downsampled point clouds are input one by one into the shared-weight high-dimensional feature extraction layer, giving the feature matrix F_i ∈ R^(N×1024) of the corresponding point cloud P_i;
    Step S422, the feature matrices F_iR and F_(i+1)L of neighboring views are input pair by pair into the matching-point-pair generation network, giving the matching point set CP_i of the point cloud P_i;
    Step S423, with the points of all views and their matching points as input, the closed-form pose solution T of the joint registration is obtained;
    Step S424, all point clouds are transformed into the unified coordinate frame through the obtained T and fused into the point cloud model P';
    Step S425, m points are sampled from P' as the generated samples {v^(1), ..., v^(m)};
    Step S426, the generator network parameters are adjusted:
    g_θ ← ∇_θ[−(1/m)·Σ_(i=1..m) f_ω(v^(i))]  (22)
    θ ← θ − α·RMSProp(θ, g_θ)  (23)
    where g_θ denotes the gradient with respect to θ, θ denotes the generator's network parameters, f_ω denotes the discriminator, ω denotes the discriminator's network parameters, v^(i) denotes the i-th generated sample, α denotes the step size, and RMSProp denotes a momentum-based optimization algorithm.
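A minimal PyTorch sketch of the generator update of Eqs. (22)-(23). It assumes the generated samples come from a differentiable forward pass of the registration generator and that an RMSProp optimizer over the generator parameters θ has been created elsewhere.

```python
import torch

def generator_step(opt_theta, critic, generated_samples):
    """One generator update. opt_theta: torch.optim.RMSprop over the generator parameters;
    generated_samples: (m, 3) points produced by a differentiable generator forward pass."""
    opt_theta.zero_grad()
    loss = -critic(generated_samples).mean()   # Eq. (22): gradient of -(1/m) * sum f_w(v^(i))
    loss.backward()
    opt_theta.step()                           # Eq. (23): theta <- theta - alpha * RMSProp(...)
    return loss.item()
```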
  9. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 8, characterized in that step S51 is specifically as follows:
    the WGAN network trains a discriminator network f_ω with parameters ω, whose last layer is not a nonlinear activation layer, so as to maximize L as far as possible under the condition that ω does not exceed a certain range, L being expressed as
    L = E_(p∼P_r)[f_ω(p)] − E_(p∼P_g)[f_ω(p)]  (24)
    where L approximates the Wasserstein distance between the real distribution P_r and the generated distribution P_g, i.e. the Wasserstein distance quantitatively measures the difference between the two distributions, p denotes a sample, E_(p∼P_r)[·] denotes the expectation under the real distribution P_r, and E_(p∼P_g)[·] denotes the expectation under the generated distribution P_g;
    the discriminator is a multilayer perceptron implemented with fully connected layers, structured as four fully connected layers accompanied by three ReLU activation functions; the input is the coordinates of a point, i.e. the input dimension is 3, and the output dimension is 1.
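A minimal PyTorch sketch of the critic architecture described in step S51: four fully connected layers with three ReLUs, input dimension 3, output dimension 1, and no activation after the last layer. The hidden widths are illustrative assumptions, since the claim does not fix them.

```python
import torch.nn as nn

def make_critic(hidden=(64, 128, 64)):
    """Four fully connected layers, three ReLUs, input dim 3, output dim 1."""
    dims = (3, *hidden, 1)
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:
            layers.append(nn.ReLU())   # no activation after the final layer
    return nn.Sequential(*layers)
```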
  10. The WGAN-based unsupervised multi-view three-dimensional point cloud joint registration method according to claim 9, characterized in that step S52 specifically comprises the following steps:
    Step S521, the generated samples {v^(1), ..., v^(m)}, m points uniformly sampled from the generated point cloud model, are input one by one into the discriminator network f_ω;
    Step S522, the standard samples {u^(1), ..., u^(m)}, m points uniformly sampled from the standard model, are input one by one into the discriminator network f_ω;
    Step S523, the discriminator network parameters are adjusted to discriminate the generated samples {v^(1), ..., v^(m)} from the standard samples {u^(1), ..., u^(m)}; the discriminator network parameter update is specifically
    g_ω ← ∇_ω[(1/m)·Σ_(i=1..m) f_ω(u^(i)) − (1/m)·Σ_(i=1..m) f_ω(v^(i))]  (25)
    ω ← ω + α·RMSProp(ω, g_ω)  (26)
    ω ← clip(ω, −c, c)  (27)
    where g_ω denotes the gradient with respect to ω, u^(i) denotes the i-th standard sample, f_ω denotes the discriminator, ω denotes the discriminator's network parameters, RMSProp denotes a momentum-based optimization algorithm, and clip() truncates the absolute value of the parameters ω so that they do not exceed a fixed constant c.
PCT/CN2021/077770 2021-02-06 2021-02-25 一种基于wgan的无监督多视角三维点云联合配准方法 WO2022165876A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110165409.9 2021-02-06
CN202110165409.9A CN112837356B (zh) 2021-02-06 一种基于wgan的无监督多视角三维点云联合配准方法

Publications (1)

Publication Number Publication Date
WO2022165876A1 true WO2022165876A1 (zh) 2022-08-11

Family

ID=75932553

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/077770 WO2022165876A1 (zh) 2021-02-06 2021-02-25 一种基于wgan的无监督多视角三维点云联合配准方法

Country Status (1)

Country Link
WO (1) WO2022165876A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190122378A1 (en) * 2017-04-17 2019-04-25 The United States Of America, As Represented By The Secretary Of The Navy Apparatuses and methods for machine vision systems including creation of a point cloud model and/or three dimensional model based on multiple images from different perspectives and combination of depth cues from camera motion and defocus with various applications including navigation systems, and pattern matching systems as well as estimating relative blur between images for use in depth from defocus or autofocusing applications
CN109872354A (zh) * 2019-01-28 2019-06-11 深圳市易尚展示股份有限公司 基于非线性优化的多视角点云配准方法及系统
CN111210466A (zh) * 2020-01-14 2020-05-29 华志微创医疗科技(北京)有限公司 多视角点云配准方法、装置以及计算机设备
CN111899353A (zh) * 2020-08-11 2020-11-06 长春工业大学 一种基于生成对抗网络的三维扫描点云孔洞填补方法

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116310401A (zh) * 2022-12-19 2023-06-23 南京航空航天大学 一种基于单演特征联合稀疏表示的跨视角sar识别方法
CN115795579A (zh) * 2022-12-23 2023-03-14 岭南师范学院 一种无特征复杂曲面误差分析的快速坐标对齐方法
CN115908517A (zh) * 2023-01-06 2023-04-04 广东工业大学 一种基于对应点匹配矩阵优化的低重叠点云配准方法
CN115908517B (zh) * 2023-01-06 2023-05-12 广东工业大学 一种基于对应点匹配矩阵优化的低重叠点云配准方法
CN116258817B (zh) * 2023-02-16 2024-01-30 浙江大学 一种基于多视图三维重建的自动驾驶数字孪生场景构建方法和系统
CN116258817A (zh) * 2023-02-16 2023-06-13 浙江大学 一种基于多视图三维重建的自动驾驶数字孪生场景构建方法和系统
CN116299367A (zh) * 2023-05-18 2023-06-23 中国测绘科学研究院 一种多激光空间标定方法
CN116299367B (zh) * 2023-05-18 2024-01-26 中国测绘科学研究院 一种多激光空间标定方法
CN117456001A (zh) * 2023-12-21 2024-01-26 广州泽亨实业有限公司 一种基于点云配准的工件姿态检测方法
CN117456001B (zh) * 2023-12-21 2024-04-09 广州泽亨实业有限公司 一种基于点云配准的工件姿态检测方法
CN117495932A (zh) * 2023-12-25 2024-02-02 国网山东省电力公司滨州供电公司 一种电力设备异源点云配准方法及系统
CN117495932B (zh) * 2023-12-25 2024-04-16 国网山东省电力公司滨州供电公司 一种电力设备异源点云配准方法及系统
CN117557733A (zh) * 2024-01-11 2024-02-13 江西啄木蜂科技有限公司 基于超分辨率的自然保护区三维重建方法
CN117557733B (zh) * 2024-01-11 2024-05-24 江西啄木蜂科技有限公司 基于超分辨率的自然保护区三维重建方法

Also Published As

Publication number Publication date
CN112837356A (zh) 2021-05-25

Similar Documents

Publication Publication Date Title
WO2022165876A1 (zh) 一种基于wgan的无监督多视角三维点云联合配准方法
Yang et al. Graduated non-convexity for robust spatial perception: From non-minimal solvers to global outlier rejection
CN111080627B (zh) 一种基于深度学习的2d+3d大飞机外形缺陷检测与分析方法
CN109410321B (zh) 基于卷积神经网络的三维重建方法
CN109800648B (zh) 基于人脸关键点校正的人脸检测识别方法及装置
CN110427877B (zh) 一种基于结构信息的人体三维姿态估算的方法
Mahendran et al. A mixed classification-regression framework for 3d pose estimation from 2d images
Yue et al. Hierarchical probabilistic fusion framework for matching and merging of 3-d occupancy maps
CN107169117B (zh) 一种基于自动编码器和dtw的手绘图人体运动检索方法
CN113205466A (zh) 一种基于隐空间拓扑结构约束的残缺点云补全方法
CN110992427B (zh) 一种形变物体的三维位姿估计方法及定位抓取系统
WO2023015799A1 (zh) 基于人工智能导盲的多模态融合障碍物检测方法及装置
CN113160287B (zh) 一种基于特征融合的复杂构件点云拼接方法及系统
CN112581515A (zh) 基于图神经网络的户外场景点云配准方法
CN113592927B (zh) 一种结构信息引导的跨域图像几何配准方法
CN113012122B (zh) 一种类别级6d位姿与尺寸估计方法及装置
CN116401794B (zh) 基于注意力引导的深度点云配准的叶片三维精确重建方法
CN110197503A (zh) 基于增强型仿射变换的非刚性点集配准方法
WO2024060395A1 (zh) 一种基于深度学习的高精度点云补全方法及装置
CN112750198A (zh) 一种基于非刚性点云的稠密对应预测方法
CN114612660A (zh) 一种基于多特征融合点云分割的三维建模方法
CN111260702B (zh) 激光三维点云与ct三维点云配准方法
CN116958420A (zh) 一种数字人教师三维人脸的高精度建模方法
CN112837356B (zh) 一种基于wgan的无监督多视角三维点云联合配准方法
CN106055244B (zh) 一种基于Kinect和语音的人机交互方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21923886

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21923886

Country of ref document: EP

Kind code of ref document: A1