CN115731275A - Non-rigid three-dimensional point cloud registration method and system based on attention mechanism - Google Patents

Info

Publication number: CN115731275A
Application number: CN202211652660.9A
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Inventor: 汪金洋
Original and current assignee: Anhui University (the listed assignee may be inaccurate)
Original language: Chinese (zh)
Application filed by Anhui University; priority to CN202211652660.9A
Classification landscape: Processing Or Creating Images (AREA)

Abstract

The invention relates to a non-rigid three-dimensional point cloud registration method and system based on an attention mechanism. The method first uses adaptive instance normalization modules as the normalization layer in a standard Transformer network, then acquires raw point cloud data of a non-rigid target under two different actions as the source point cloud and the target point cloud. High-dimensional point-by-point features are extracted from both point clouds and superposed with positional encodings to obtain their initial feature embeddings. A multi-head cross-attention module then performs the linear-transformation matrix operations, and the multiple attention scores are concatenated to obtain an intermediate feature embedding of the source point cloud onto which the target point cloud information is mapped. Finally, an adaptive instance normalization module normalizes this intermediate feature embedding, so that the deformed point cloud stays as consistent as possible with the source point cloud in its identity features and has a smoother surface, further improving the registration result.

Description

Non-rigid three-dimensional point cloud registration method and system based on attention mechanism
Technical Field
The invention relates to the technical field of computer vision, in particular to a non-rigid three-dimensional point cloud registration method and system based on an attention mechanism.
Background
Because of the limited scanning range of three-dimensional scanning equipment, the variety of object shapes, and environmental influences, a complete data model of an object cannot be obtained in a single scan. Usually several point clouds must be captured from different angles; each is associated with a different coordinate system, and translational and rotational misalignment may occur. A complete data model is formed by computing a suitable coordinate transformation that unifies the point cloud data scanned from different perspectives into the same coordinate system, or two point clouds of the same shape and size are unified into one coordinate system by computing a transformation matrix. The key technology that realizes this process is point cloud registration.
Point cloud registration can be classified by its object into rigid registration and non-rigid registration. The transformation parameters of rigid registration are usually represented by a low-dimensional transformation matrix that performs a global rotation-and-translation coordinate transformation. Non-rigid registration takes a more complex form, involving local rotation, translation, and non-rigid deformation, and cannot be described with simple transformation parameters. In practical applications, non-rigid point cloud registration faces many difficulties. First, because of differences in scanning angle or movement of the scanned target, the overlapping portion of the two point clouds to be registered is unknown; under large-scale deformation the two point clouds differ more and overlap less. Second, unlike images, point clouds carry no color features, only spatial coordinates. For these reasons, existing non-rigid point cloud registration methods suffer from poor matching, poor registration results, and high algorithmic overhead under large-scale deformation, which greatly limits their use in industrial automation.
At present, non-rigid point cloud registration methods based on traditional techniques generally compute features of the input point cloud using handcrafted feature descriptors and establish reliable point-to-point correspondences by comparing the feature similarity of the source and target point clouds, from which a point-wise rotation-translation transformation matrix is obtained. However, this type of registration method is prone to produce wrong correspondence pairs under large deformation, which degrades the registration result. In recent years, non-rigid point cloud registration methods based on deep learning have emerged, such as the Coherent Point Drift Network (CPD-Net) and the Recurrent Multi-view Alignment Network. Their overall idea is similar to that of the traditional methods, generally comprising a feature extraction component and a correspondence search component. However, the larger number of degrees of freedom in non-rigid registration increases the difficulty of network training, and the lack of labeled data further limits training, so current deep-learning-based non-rigid registration methods are only suitable for small-scale non-rigid deformation. In addition, because of the disorder of point clouds, early deep learning methods solved the registration problem by converting the point cloud into voxels. Such methods divide the point cloud into cubic grids, but the grid voxel data are highly redundant, leading to a large amount of computation and high memory usage.
Disclosure of Invention
Based on this, the invention provides a non-rigid three-dimensional point cloud registration method and system based on an attention mechanism, aiming at the technical problem that the target three-dimensional point cloud registration effect under large-scale non-rigid deformation is poor in the prior art.
The invention discloses a non-rigid three-dimensional point cloud registration method based on an attention mechanism, which comprises the following steps of:
S1, using at least one group of adaptive instance normalization modules as the normalization layer in a standard Transformer network to obtain an improved Transformer network for executing the non-rigid three-dimensional point cloud registration task. The improved Transformer network further comprises a multi-head cross-attention module.
S2, acquiring original point cloud data of a non-rigid target under two different actions, and respectively using the original point cloud data as a source point cloud S and a target point cloud T.
S3, extracting high-dimensional point-by-point features of the source point cloud and the target point cloud, and superposing the output features with the position codes in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud.
S4, using the linear transformation matrices W_Q, W_K, W_V in the multi-head cross-attention module to perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud, obtaining the linearly transformed result matrices Q, K, V and computing the attention scores.
S5, obtaining the intermediate feature embedding Z of the source point cloud, onto which the target point cloud information is mapped, by concatenating the multiple attention score matrices.
S6, using an adaptive instance normalization module to normalize the intermediate feature embedding Z of the source point cloud, keeping the pose of the source point cloud close to the target point cloud without changing the original identity information, to obtain the normalized feature embedding Z′ of the source point cloud.
S7, setting a loss function, taking S4-S6 as the decoder part of the improved Transformer network, and using the initial feature embedding X_S of the source point cloud and the normalized feature embedding Z′ as the encoder input; after a preset number of cycles, the output optimized feature embedding Z″ is obtained.
S8, performing convolution and activation on the optimized feature embedding Z' by utilizing a multi-layer perceptron structure, and calculating the coordinates of the result point cloud by combining the following formula to complete point cloud registration. The calculation formula is as follows:
R=2*tanh[Conv(Z″)]
In the formula, R represents the result point cloud, and tanh is the hyperbolic tangent activation function.
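For illustration only (not the patent's implementation), the coordinate computation of S8 can be sketched in numpy: a per-point 1×1 convolution is equivalent to a linear map over the feature axis, followed by 2·tanh, which bounds every output coordinate in (−2, 2). The feature width and weights below are hypothetical.

```python
import numpy as np

def result_coordinates(z_opt, w, b):
    """Sketch of S8: a per-point 1x1 convolution (a linear map over features)
    followed by 2 * tanh, mapping optimized feature embeddings to 3-D
    coordinates bounded in (-2, 2)."""
    conv = z_opt @ w + b          # (n, d) @ (d, 3) -> (n, 3)
    return 2.0 * np.tanh(conv)

rng = np.random.default_rng(0)
n, d = 6000, 64                   # 6000 points, hypothetical feature width
z_opt = rng.standard_normal((n, d))
w = rng.standard_normal((d, 3)) * 0.1
b = np.zeros(3)
r = result_coordinates(z_opt, w, b)   # result point cloud R, shape (6000, 3)
```

Because tanh saturates at ±1, the scaling by 2 fixes the coordinate range of the result point cloud regardless of the magnitude of Z″.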
As a further improvement of the above scheme, in S2, the original point cloud data is sampled by using an equivalent down-sampling method, so that the number of points of the source point cloud is the same as that of the target point cloud.
As a further improvement of the above scheme, in S3, a dynamic graph convolution neural network is used to extract high-dimensional point-by-point features of the source point cloud and the target point cloud.
As a further improvement of the above scheme, in S4, the formula for calculating the attention score is:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where d_k is the number of columns of the matrix Q.
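As a minimal sketch of this attention-score computation (assuming, as the cross-attention described here suggests, that Q comes from the source embedding while K and V come from the target embedding):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_s, x_t, w_q, w_k, w_v):
    """Scaled dot-product cross attention: Q from the source embedding,
    K and V from the target embedding; scores scaled by sqrt(d_k)."""
    q, k, v = x_s @ w_q, x_t @ w_k, x_t @ w_v
    d_k = q.shape[-1]
    scores = softmax(q @ k.T / np.sqrt(d_k))  # each row sums to 1
    return scores @ v

rng = np.random.default_rng(1)
d = 8
x_s = rng.standard_normal((5, d))   # 5 source points (toy sizes)
x_t = rng.standard_normal((7, d))   # 7 target points
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
z = cross_attention(x_s, x_t, w_q, w_k, w_v)
```

The scaling by √d_k keeps the dot products from growing with the feature dimension, which would otherwise push the softmax into saturation.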
As a further improvement of the above scheme, the number of the adaptive instance normalization modules is set to 3 groups.
As a further improvement of the above scheme, in S6, residual connections are introduced into the normalization processing of the source point cloud feature embedding.
As a further improvement of the above scheme, in S6, the expression formula of embedding the normalized features of the source point cloud into Z' is as follows:
Z′=[Conv(AdaIN(Conv(AdaIN(Z))))+Conv(AdaIN(Z))]
In the formula, Conv(·) denotes a convolution of its argument, and AdaIN(·) denotes adaptive instance normalization of its argument.
As a further improvement of the above scheme, in S7, the point-wise mesh Euclidean distance PMD is used as the loss function:

PMD = (1/n) · Σ_{i=1}^{n} ‖M_i − M′_i‖₂

where M_i and M′_i respectively denote the three-dimensional coordinates of the i-th point of the target point cloud and of the result point cloud, i = 1, 2, 3, …, n, and n is the total number of points of the target point cloud.
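A minimal sketch of this loss, assuming (since the placeholder formula is unreadable in the source) that PMD is the mean Euclidean distance between corresponding points:

```python
import numpy as np

def pmd_loss(target, result):
    """Point-wise mesh Euclidean distance (assumed form): mean Euclidean
    distance between corresponding points of the target and result clouds."""
    return float(np.linalg.norm(target - result, axis=1).mean())

target = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
result = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
loss = pmd_loss(target, result)   # per-point distances 0 and 1 -> mean 0.5
```

Note that this loss presumes a fixed point-to-point correspondence between the two clouds, which holds here because both clouds are sampled to the same number of points.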
As a further improvement of the above solution, in S2, a laser radar scanner or a Kinect camera is used to capture different motions of the same non-rigid target at different time points.
The invention also discloses a non-rigid three-dimensional point cloud registration system based on the attention mechanism, which adopts any one of the non-rigid three-dimensional point cloud registration methods based on the attention mechanism. The registration system includes: the system comprises a network construction module, a point cloud acquisition module and a point cloud feature extraction module.
And the network construction module is used for using at least one group of adaptive instance normalization modules as a normalization layer in the standard Transformer network to obtain an improved Transformer network for executing a non-rigid three-dimensional point cloud registration task. The improved Transformer network also includes a multi-headed cross attention module.
The point cloud acquisition module is used for acquiring original point cloud data of a non-rigid target under two different actions as a source point cloud S and a target point cloud T respectively.
The point cloud feature extraction module is used for extracting high-dimensional point-by-point features of the source point cloud and the target point cloud, and superposing the output features with the position codes in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud.
The linear transformation matrices W_Q, W_K, W_V in the multi-head cross-attention module perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud to obtain the linearly transformed result matrices Q, K, V; the attention scores are calculated, and the intermediate feature embedding Z of the source point cloud, onto which the target point cloud information is mapped, is obtained by concatenating the multiple attention score matrices. The adaptive instance normalization module normalizes the intermediate feature embedding Z of the source point cloud so that the pose of the source point cloud is kept close to the target point cloud without changing the original identity information, yielding the normalized feature embedding Z′ of the source point cloud. The improved Transformer network is also used to obtain the output optimized feature embedding after its encoder and decoder cycle a preset number of times, and to compute the coordinates of the result point cloud after convolving and activating the optimized feature embedding with a multi-layer perceptron structure, completing the point cloud registration.
Compared with the prior art, the technical scheme disclosed by the invention has the following beneficial effects:
By registering on the point cloud structure directly, the invention reflects the true size and shape of the original object more accurately than the traditional voxel-based representation, and learning set features directly from the point cloud improves registration performance to a certain extent. Meanwhile, the multi-head cross-attention mechanism encodes the latent relationship between the input source point cloud and target point cloud, assigns higher weights to the parts considered more relevant, captures the dependencies between the point clouds, and maintains the local structure of the original point cloud, improving the registration quality under large-scale non-rigid deformation. In addition, the invention introduces an adaptive instance normalization module from the image domain, so that the deformed point cloud stays as consistent as possible with the source point cloud in its identity features and has a smoother surface, further improving the registration result.
Drawings
Fig. 1 is a flowchart of a non-rigid three-dimensional point cloud registration method based on an attention mechanism in embodiment 1 of the present invention;
fig. 2 is an architecture diagram for registering an input human three-dimensional point cloud in embodiment 1 of the present invention;
FIG. 3 is a comparison diagram of the source point cloud and the target point cloud in the front view direction in embodiment 1 of the present invention;
FIG. 4 is a comparison graph of the result point cloud and the target point cloud in the front view direction in example 1 of the present invention;
FIG. 5 is a comparison graph of the source point cloud and the target point cloud in the side view direction in example 1 of the present invention;
fig. 6 is a comparison graph of the result point cloud and the target point cloud in the side view direction in embodiment 1 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "or/and" includes any and all combinations of one or more of the associated listed items.
Example 1
This embodiment provides a non-rigid three-dimensional point cloud registration method based on an attention mechanism. By incorporating adaptive instance normalization from the image domain, it handles the registration of a target three-dimensional point cloud under large-scale non-rigid deformation, such as human-body three-dimensional point cloud registration, with registration speed and quality both improved to a certain extent.
Referring to fig. 1, the registration method may include the following steps:
S1, using at least one group of adaptive instance normalization modules as the normalization layer in a standard Transformer network to obtain an improved Transformer network for executing the non-rigid three-dimensional point cloud registration task. The improved Transformer network further comprises a multi-head cross-attention module. In this embodiment, the number of adaptive instance normalization modules may be set to three groups.
S2, acquiring original point cloud data of a non-rigid target under two different actions, and respectively using the original point cloud data as a source point cloud S and a target point cloud T.
In this embodiment, a LiDAR scanner or Microsoft Kinect camera may be used to capture different motions of the same human body at different time points, obtaining two raw point clouds of about 20,000 points each. The raw point clouds are sampled with an equal-amount down-sampling method according to a fixed sampling rule, reducing the point density to 6,000 points while keeping the overall geometric characteristics unchanged, which reduces the data volume and the complexity of subsequent processing. The two sampled point clouds are recorded as the source point cloud S and the target point cloud T; the two clouds exhibit large non-rigid deformation and a low degree of overlap.
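The down-sampling step can be sketched as follows (the patent's exact sampling rule is not specified; a uniform-stride index selection is assumed here for illustration):

```python
import numpy as np

def equal_downsample(points, n_out):
    """Equal-amount down-sampling sketch: pick n_out indices at a uniform
    stride through the cloud, roughly preserving the overall geometry."""
    idx = np.linspace(0, len(points) - 1, n_out).round().astype(int)
    return points[idx]

# Toy stand-in for a ~20,000-point scan reduced to 6,000 points.
cloud = np.random.default_rng(2).standard_normal((20000, 3))
sampled = equal_downsample(cloud, 6000)
```

Uniform-stride selection keeps the source and target clouds at the same point count, which the point-wise loss in S7 relies on.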
S3, extracting high-dimensional point-by-point features of the source point cloud and the target point cloud, and superposing the output features with the position codes in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud.
In this embodiment, a Dynamic Graph Convolutional Neural Network (DGCNN) may be used to extract the features of the source and target point clouds; its stacked EdgeConv layers effectively extract local-shape features of the point cloud while maintaining permutation invariance. Superposing the output features with the Transformer position codes yields the feature embeddings X_S and X_T, establishes a topological relationship between points, and further strengthens the feature representation.
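As a rough sketch of the EdgeConv input used in DGCNN (a simplified numpy version, not the network itself): for each point x_i and its k nearest neighbours x_j, the edge feature (x_i, x_j − x_i) is formed before the shared MLP is applied.

```python
import numpy as np

def knn_edge_features(points, k):
    """For each point, find its k nearest neighbours and build the EdgeConv
    input features (x_i, x_j - x_i), shape (n, k, 6) for 3-D points."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # a point is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]           # indices of k nearest neighbours
    centers = np.repeat(points[:, None, :], k, axis=1)
    return np.concatenate([centers, points[nn] - centers], axis=-1)

pts = np.random.default_rng(3).standard_normal((100, 3))
edges = knn_edge_features(pts, k=20)
```

The relative term x_j − x_i is what makes the extracted features sensitive to local shape rather than absolute position; in DGCNN the graph is additionally recomputed in feature space at every layer, which this sketch omits.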
S4, using the linear transformation matrices W_Q, W_K, W_V in the multi-head cross-attention module to perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud, obtaining the linearly transformed result matrices Q, K, V and computing the attention scores:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where d_k is the number of columns of the matrix Q.
S5, obtaining the intermediate feature embedding Z of the source point cloud, onto which the target point cloud information is mapped, by concatenating the multiple attention score matrices. The multiple heads help the network capture richer information.
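A toy sketch of this concatenation step (head count, dimensions, and weights are hypothetical): each head runs its own cross attention, and the per-head outputs are spliced along the feature axis to form Z.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(x_s, x_t, heads):
    """One cross attention per head; outputs concatenated along features.
    heads: list of (w_q, w_k, w_v) tuples, one per head."""
    outs = []
    for w_q, w_k, w_v in heads:
        q, k, v = x_s @ w_q, x_t @ w_k, x_t @ w_v
        scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        outs.append(scores @ v)
    return np.concatenate(outs, axis=-1)

rng = np.random.default_rng(4)
d, d_h, n_heads = 16, 4, 4        # 4 heads of width 4 concatenate back to 16
x_s = rng.standard_normal((5, d))
x_t = rng.standard_normal((6, d))
heads = [tuple(rng.standard_normal((d, d_h)) for _ in range(3))
         for _ in range(n_heads)]
z = multi_head_cross_attention(x_s, x_t, heads)
```

With the head width set to d / n_heads, the concatenated embedding Z keeps the same feature dimension as the input, so it can feed the normalization layer directly.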
S6, using an adaptive instance normalization module to normalize the intermediate feature embedding Z of the source point cloud, keeping the pose of the source point cloud close to the target point cloud without changing the original identity information, and introducing residual connections, to obtain the normalized feature embedding Z′ of the source point cloud.
In this embodiment, Adaptive Instance Normalization (AdaIN) from image style transfer is introduced into the normalization layer, so that the pose of the normalized instance approaches the target point cloud without changing the original identity feature information, which further accelerates network convergence. The residual layer uses residual connections to let the network concentrate on the current difference and to alleviate vanishing gradients and network degradation when training a multi-layer network. The normalized feature embedding Z′ of the source point cloud is expressed as:
Z′=[Conv(AdaIN(Conv(AdaIN(Z))))+Conv(AdaIN(Z))]
In the formula, Conv(·) denotes a convolution of its argument, and AdaIN(·) denotes adaptive instance normalization of its argument.
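The AdaIN operation itself can be sketched as follows (a standalone illustration of the normalization only, without the Conv stacks and residual connection of the formula above; the style statistics are assumed to come from the target-side embedding):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """AdaIN sketch: normalize the content features per channel, then rescale
    them with the style features' per-channel mean and standard deviation."""
    c_mu, c_sigma = content.mean(0), content.std(0)
    s_mu, s_sigma = style.mean(0), style.std(0)
    return s_sigma * (content - c_mu) / (c_sigma + eps) + s_mu

rng = np.random.default_rng(5)
z = rng.standard_normal((100, 8))                    # intermediate embedding Z
style = 2.0 * rng.standard_normal((100, 8)) + 3.0    # hypothetical target-side features
out = adain(z, style)
```

After AdaIN the per-channel statistics of the output match the style input, which is what moves the pose toward the target while the content (identity) structure of Z is preserved.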
S7, setting a loss function, taking S4-S6 as the decoder part of the improved Transformer network, and using the initial feature embedding X_S of the source point cloud and the normalized feature embedding Z′ as the encoder input; after a preset number of cycles, the output optimized feature embedding Z″ is obtained.
S8, convolving and activating the optimized feature embedding Z″ with a multi-layer perceptron (MLP) structure to obtain the deformed human-body point cloud. The Point-wise Mesh Euclidean Distance (PMD) is used as the loss function during training:
PMD = (1/n) · Σ_{i=1}^{n} ‖M_i − M′_i‖₂

In the formula, M_i and M′_i respectively denote the three-dimensional coordinates of the i-th point of the target point cloud and of the result point cloud, i = 1, 2, 3, …, n, and n is the total number of points of the target point cloud.
The coordinates of the result point cloud are then calculated with the following formula, completing the point cloud registration. The calculation formula is as follows:
R=2*tanh[Conv(Z″)]
In the formula, R represents the result point cloud, and tanh is the hyperbolic tangent activation function.
In order to verify the registration performance of the attention-based non-rigid three-dimensional point cloud registration method, this embodiment also provides a set of qualitative experiments. The experimental environment is: Intel(R) Xeon(R) CPU E5-2609 v4 @ 1.70 GHz, Ubuntu 18.04 operating system, 32 GB memory, and an Nvidia TITAN X graphics card; the programming environment is PyCharm, and the deep learning framework is PyTorch 1.5.0. The data sets used include MPI FAUST and MPI Dynamic FAUST.
Please refer to fig. 3 to 6. In the front view direction, fig. 3 compares the source point cloud with the target point cloud, and fig. 4 compares the result point cloud with the target point cloud. In the side view direction, fig. 5 compares the source point cloud with the target point cloud, and fig. 6 compares the result point cloud with the target point cloud. The method registers large-scale deformation point clouds well: the result point cloud fits the target point cloud closely at the elbow and knee joints, and the fit at the extremities such as hands and feet is also satisfactory.
By registering on the point cloud structure directly, the invention reflects the true size and shape of the original object more accurately than the traditional voxel-based representation, and learning set features directly from the point cloud improves registration performance to a certain extent. Meanwhile, the multi-head cross-attention mechanism encodes the latent relationship between the input source point cloud and target point cloud, assigns higher weights to the parts considered more relevant, captures the dependencies between the point clouds, and maintains the local structure of the original point cloud, improving the registration quality under large-scale non-rigid deformation. In addition, the invention introduces an adaptive instance normalization module from the image domain, so that the deformed point cloud stays as consistent as possible with the source point cloud in its identity features and has a smoother surface, further improving the registration result.
Example 2
The embodiment provides a non-rigid three-dimensional point cloud registration system based on an attention mechanism, which adopts the non-rigid three-dimensional point cloud registration method based on the attention mechanism in embodiment 1. The registration system includes: the system comprises a network construction module, a point cloud acquisition module and a point cloud feature extraction module.
And the network construction module is used for using at least one group of adaptive instance normalization modules as a normalization layer in the standard Transformer network to obtain an improved Transformer network for executing a non-rigid three-dimensional point cloud registration task. The improved transform network further comprises a multi-head cross attention module.
The point cloud obtaining module is used for obtaining original point cloud data of a non-rigid target under two different actions, and the original point cloud data are respectively used as a source point cloud S and a target point cloud T.
The point cloud feature extraction module is used for extracting high-dimensional point-by-point features of the source point cloud and the target point cloud, and superposing the output features with the position codes in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud.
The linear transformation matrices W_Q, W_K, W_V in the multi-head cross-attention module perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud to obtain the linearly transformed result matrices Q, K, V; the attention scores are calculated, and the intermediate feature embedding Z of the source point cloud, onto which the target point cloud information is mapped, is obtained by concatenating the multiple attention score matrices. The adaptive instance normalization module normalizes the intermediate feature embedding Z of the source point cloud so that the pose of the source point cloud is kept close to the target point cloud without changing the original identity information, yielding the normalized feature embedding Z′ of the source point cloud. The improved Transformer network is also used to obtain the output optimized feature embedding after its encoder and decoder cycle a preset number of times, and to compute the coordinates of the result point cloud after convolving and activating the optimized feature embedding with a multi-layer perceptron structure, completing the point cloud registration.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples are merely illustrative of several embodiments of the present invention, and the description thereof is more specific and detailed, but not to be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the appended claims.

Claims (10)

1. A non-rigid three-dimensional point cloud registration method based on an attention mechanism is characterized by comprising the following steps:
s1, using at least one group of self-adaptive instance normalization modules as a normalization layer in a standard Transformer network to obtain an improved Transformer network for executing a non-rigid three-dimensional point cloud registration task; the improved Transformer network further comprises a multi-head cross attention module;
s2, acquiring original point cloud data of a non-rigid target under two different actions as a source point cloud S and a target point cloud T respectively;
s3, extracting high-dimensional point-by-point features of the source point cloud and the target point cloud, and superposing the output features with the position codes in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud;
s4, using the linear transformation matrices W_Q, W_K, W_V in the multi-head cross-attention module to perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud, obtaining the linearly transformed result matrices Q, K, V and computing the attention scores;
s5, obtaining intermediate characteristic embedding Z of the source point cloud mapped with the target point cloud information by splicing the plurality of attention score matrixes;
s6, performing normalization processing on the intermediate feature embedding Z of the source point cloud by using the self-adaptive instance normalization module, keeping the posture of the source point cloud close to the target point cloud, and meanwhile, not changing original identity information to obtain a normalized feature embedding Z' of the source point cloud;
S7, setting a loss function, taking S4-S6 as the decoder part of the improved Transformer network, using the initial feature embedding X_S of the source point cloud and the normalized feature embedding Z' as the inputs of the encoder and decoder, and obtaining the output optimized feature embedding Z'' after a preset number of loops;
S8, convolving and activating the optimized feature embedding Z'' with a multi-layer perceptron structure, and computing the coordinates of the result point cloud with the following formula to complete the point cloud registration; the calculation formula is:

R = 2 * tanh[Conv(Z'')]

wherein R denotes the result point cloud, and tanh is the hyperbolic tangent activation function.
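As an illustration of the coordinate computation in step S8, the following NumPy sketch reduces the multi-layer perceptron to a single shared per-point linear map (the function and weight names are the editor's, not the patent's):

```python
import numpy as np

def decode_coordinates(z_opt, w):
    """Map the optimized feature embedding Z'' to 3-D coordinates:
    R = 2 * tanh(Conv(Z'')), with the 1x1 convolution modeled as a
    shared per-point linear map w. Every output coordinate of the
    result point cloud R is therefore bounded to (-2, 2)."""
    return 2.0 * np.tanh(z_opt @ w)
```

The 2 * tanh activation bounds the result cloud to the cube (-2, 2)^3, which presumes the point clouds are normalized to a comparable range beforehand.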
2. The attention mechanism-based non-rigid three-dimensional point cloud registration method of claim 1, wherein in S2, the original point cloud data is sampled using an equal-amount down-sampling method so that the number of points of the source point cloud and the target point cloud is the same.
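The equal-amount down-sampling of claim 2 can be sketched as follows; random sampling is only one possible choice, since the claim does not fix the sampling strategy (all names are illustrative):

```python
import numpy as np

def equal_downsample(source, target, n_points, seed=0):
    """Randomly down-sample both clouds to the same number of points
    so that they can be registered point-for-point (claim 2)."""
    rng = np.random.default_rng(seed)
    idx_s = rng.choice(len(source), size=n_points, replace=False)
    idx_t = rng.choice(len(target), size=n_points, replace=False)
    return source[idx_s], target[idx_t]
```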
3. The attention-based non-rigid three-dimensional point cloud registration method according to claim 1, wherein in S3, a dynamic graph convolution neural network is used to extract high-dimensional point-by-point features of the source point cloud and the target point cloud.
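For the dynamic graph convolutional neural network of claim 3, a minimal single-layer EdgeConv-style sketch is shown below (the patent does not specify the layer structure; this is one common DGCNN formulation, with editor-chosen names):

```python
import numpy as np

def edge_conv_features(points, k, w):
    """One EdgeConv-style layer: for each point, find its k nearest
    neighbours, build edge features [x_i, x_j - x_i], apply a shared
    linear map w of shape (2*dim, out_dim), and max-pool over the
    neighbourhood to get a per-point feature vector."""
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]   # skip the point itself
    feats = []
    for i in range(n):
        edges = np.concatenate(
            [np.repeat(points[i][None, :], k, axis=0),   # x_i
             points[knn[i]] - points[i]], axis=1)         # x_j - x_i
        feats.append((edges @ w).max(axis=0))             # max over neighbours
    return np.stack(feats)
```

In a full DGCNN the neighbourhood graph is recomputed in feature space after each layer, which is what makes the graph "dynamic".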
4. The attention mechanism-based non-rigid three-dimensional point cloud registration method according to claim 1, wherein in S4 the attention score is calculated by the following formula:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

wherein d_k is the number of columns of the matrix Q.
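A minimal NumPy sketch of the scaled dot-product cross-attention of claim 4, with queries drawn from the source embedding and keys/values from the target embedding (multi-head splitting is omitted; all names are illustrative):

```python
import numpy as np

def cross_attention(x_s, x_t, w_q, w_k, w_v):
    """Single-head cross-attention: softmax(Q K^T / sqrt(d_k)) V,
    where Q comes from the source embedding and K, V from the target."""
    q = x_s @ w_q                              # (n, d_k)
    k = x_t @ w_k                              # (m, d_k)
    v = x_t @ w_v                              # (m, d_v)
    d_k = q.shape[1]
    scores = q @ k.T / np.sqrt(d_k)            # (n, m)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ v                         # (n, d_v)
```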
5. The attention-based non-rigid three-dimensional point cloud registration method of claim 1, wherein the number of adaptive instance normalization modules is set to 3 groups.
6. The attention mechanism-based non-rigid three-dimensional point cloud registration method according to claim 5, wherein in S6 a residual connection is introduced into the normalization of the source point cloud feature embedding.
7. The attention mechanism-based non-rigid three-dimensional point cloud registration method according to claim 6, wherein in S6 the normalized feature embedding Z' of the source point cloud is expressed as:

Z' = Conv(AdaIN(Conv(AdaIN(Z)))) + Conv(AdaIN(Z))

wherein Conv(·) denotes a convolution applied to its argument, and AdaIN(·) denotes adaptive instance normalization applied to its argument.
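The AdaIN residual block of claim 7 can be sketched as below, with each Conv modeled as a per-point linear map (1x1 convolution) and the target-side features supplying the "style" statistics; this is an editor's interpretation, not the patent's exact layer:

```python
import numpy as np

def adain(z, style, eps=1e-5):
    """Adaptive instance normalization: normalize z per channel over the
    point dimension, then re-scale/shift with the style's channel stats."""
    mu_z, sig_z = z.mean(axis=0), z.std(axis=0)
    mu_s, sig_s = style.mean(axis=0), style.std(axis=0)
    return sig_s * (z - mu_z) / (sig_z + eps) + mu_s

def adain_residual_block(z, style, conv1, conv2):
    """Claim 7 structure: Z' = Conv(AdaIN(Conv(AdaIN(Z)))) + Conv(AdaIN(Z)),
    with the residual branch reusing the inner Conv(AdaIN(Z)) term."""
    h = adain(z, style) @ conv1           # Conv(AdaIN(Z))
    main = adain(h, style) @ conv2        # Conv(AdaIN(Conv(AdaIN(Z))))
    return main + h                       # residual connection
```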
8. The attention mechanism-based non-rigid three-dimensional point cloud registration method of claim 1, wherein in S7 a point-by-point grid Euclidean distance PMD is used as the loss function:

PMD = (1/n) · Σ_{i=1}^{n} ‖M_i − M'_i‖

wherein M_i and M'_i respectively denote the three-dimensional coordinates of the i-th point of the target point cloud and of the result point cloud, i = 1, 2, 3, …, n, and n is the total number of points of the target point cloud.
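A sketch of the PMD loss of claim 8, assuming plain (non-squared) per-point Euclidean distances averaged over the cloud, which relies on the equal point counts established in claim 2 (names are illustrative):

```python
import numpy as np

def pmd_loss(target, result):
    """Point-by-point grid Euclidean distance: the mean L2 distance
    between corresponding points of the target and result clouds."""
    assert target.shape == result.shape   # equal point counts (claim 2)
    return np.mean(np.linalg.norm(target - result, axis=1))
```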
9. The attention mechanism-based non-rigid three-dimensional point cloud registration method according to claim 1, wherein in S2, a lidar scanner or a Kinect camera is used to capture different motions of the same non-rigid target at different time points.
10. A non-rigid three-dimensional point cloud registration system based on attention mechanism, which is characterized in that the non-rigid three-dimensional point cloud registration method based on attention mechanism of any one of claims 1 to 9 is adopted; the registration system includes:
the network construction module, configured to use at least one group of adaptive instance normalization modules as a normalization layer in a standard Transformer network to obtain an improved Transformer network for performing a non-rigid three-dimensional point cloud registration task; the improved Transformer network further comprises a multi-head cross-attention module;
the system comprises a point cloud acquisition module, a point cloud acquisition module and a control module, wherein the point cloud acquisition module is used for acquiring original point cloud data of a non-rigid target under two different actions as a source point cloud S and a target point cloud T respectively; and
the point cloud feature extraction module, configured to extract high-dimensional point-wise features of the source point cloud and the target point cloud, and to superpose the output features with the position encoding in the improved Transformer network to obtain the initial feature embeddings X_S and X_T of the source point cloud and the target point cloud;
wherein the linear transformation matrices W_Q, W_K and W_V in the multi-head cross-attention module perform matrix operations on the initial feature embeddings of the source point cloud and the target point cloud to obtain the linearly transformed result matrices Q, K and V and to compute the attention scores, and the multiple attention score matrices are concatenated to obtain the intermediate feature embedding Z of the source point cloud onto which the target point cloud information is mapped; the adaptive instance normalization module normalizes the intermediate feature embedding Z of the source point cloud so that the pose of the source point cloud is kept close to that of the target point cloud without changing its original identity information, yielding the normalized feature embedding Z' of the source point cloud; the improved Transformer network further obtains the output optimized feature embedding Z'' from its encoder and decoder after a preset number of loops, convolves and activates the optimized feature embedding with a multi-layer perceptron structure, and computes the coordinates of the result point cloud, thereby completing the point cloud registration.
CN202211652660.9A 2022-12-21 2022-12-21 Non-rigid three-dimensional point cloud registration method and system based on attention mechanism Pending CN115731275A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211652660.9A CN115731275A (en) 2022-12-21 2022-12-21 Non-rigid three-dimensional point cloud registration method and system based on attention mechanism

Publications (1)

Publication Number Publication Date
CN115731275A true CN115731275A (en) 2023-03-03

Family

ID=85301643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211652660.9A Pending CN115731275A (en) 2022-12-21 2022-12-21 Non-rigid three-dimensional point cloud registration method and system based on attention mechanism

Country Status (1)

Country Link
CN (1) CN115731275A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291845A (en) * 2023-11-27 2023-12-26 成都理工大学 Point cloud ground filtering method, system, electronic equipment and storage medium
CN117291845B (en) * 2023-11-27 2024-03-19 成都理工大学 Point cloud ground filtering method, system, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination