CN117853748A - Multi-axis Transformer point cloud registration method, device and medium based on local importance prior - Google Patents


Info

Publication number
CN117853748A
CN117853748A CN202410075232.7A CN202410075232A CN117853748A CN 117853748 A CN117853748 A CN 117853748A CN 202410075232 A CN202410075232 A CN 202410075232A CN 117853748 A CN117853748 A CN 117853748A
Authority
CN
China
Prior art keywords
point
point cloud
local
super
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410075232.7A
Other languages
Chinese (zh)
Inventor
田彦
王昊
许嘉辉
薛鹏程
沈晨辉
蔡文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN202410075232.7A
Publication of CN117853748A
Legal status: Pending


Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a multi-axis Transformer point cloud registration method, device and medium based on a local importance prior. The method improves the training efficiency of the model and strengthens its perception of both fine point cloud detail and overall geometric information, which improves the geometric discriminability of the extracted point cloud features and ultimately the accuracy of point cloud registration. A window weight prior module highlights the regions that are effective for registration, and a multi-axis Transformer perceives local and global geometric information in a divide-and-conquer manner. Compared with other advanced methods, the method extracts discriminative point cloud features, can handle point cloud registration in low-overlap scenes, and is highly competitive.

Description

Multi-axis Transformer point cloud registration method, device and medium based on local importance prior
Technical Field
The invention relates to point cloud registration technology, and in particular to a multi-axis Transformer point cloud registration method, device and medium based on a local importance prior.
Background
Point cloud registration is a fundamental problem in three-dimensional computer vision and photogrammetry. Given several point clouds in different coordinate systems, the task is to find a transformation that aligns all of them into a common coordinate system. Such a transformation helps in understanding, comparing and analyzing different point cloud data. Point cloud registration plays an important role in many vision applications, such as three-dimensional reconstruction, object tracking, pose estimation and orthodontic monitoring. In deep registration schemes that follow a coarse-to-fine blueprint, Transformers are often used for context information interaction at the superpoint level because of their long-range modeling capability. However, conventional Transformers suffer from two problems. First, when a Transformer extracts transformation-invariant features at the superpoint level produced by point cloud sparsification, there is no multi-scale sampling process and global feature encoding is performed only on the full-scale input sequence; local detail is therefore ignored to some extent, which makes registration harder in low-overlap scenes. Second, the attention mechanism must implicitly perceive overlapping regions and ambiguous overlap boundaries while encoding point cloud features, but such implicit perception of locally geometrically consistent regions cannot model the registration-effective regions in a targeted way and leads to somewhat ambiguous features. Both problems reduce point cloud registration accuracy.
Disclosure of Invention
The invention aims to address the shortcomings of the prior art by providing a multi-axis Transformer point cloud registration method, device and medium based on a local importance prior.
The aim of the invention is realized by the following technical scheme:
According to a first aspect of the present specification, there is provided a multi-axis Transformer point cloud registration method based on a local importance prior, the method comprising the following steps:
(1) Using a KPConv-FPN-based network as the backbone network, input the source point cloud p of an indoor scene into the backbone network and output a superpoint sequence and the corresponding coarse features, where N is the number of superpoints and C is the superpoint feature dimension. The target point cloud q is processed in the same way.
(2) Input the superpoint features obtained in step (1) into the window weight prior module. The superpoints are divided, according to their physical positions, into non-overlapping regions with a fixed window size w, giving a set of local patches. Any patch that does not fill a window of size w is padded, giving an updated, padded patch set. The padded patch set is then fed, as the input feature set, into a local weight encoder that aggregates local geometric information to produce a multi-window weight set ω. The weight set ω is multiplied with the input feature set along the window dimension, and the product is fused residually with the input feature set as additional embedded information, yielding the local importance prior knowledge. The target point cloud q is processed in the same way.
(3) Apply a masking operation to the patch set padded in step (2) to remove the interference of the artificially added information with the features. The target point cloud q is processed in the same way.
(4) Pad the superpoint features of the source point cloud p, fused with the local importance prior knowledge in step (3), once more to obtain a padded sequence of N' superpoints. In the block attention branch of the multi-axis Transformer, the N' superpoints are divided by window size w into a series of consecutive, non-overlapping local sequence groups with vector dimensions (N'/w, w, C/2), and geometric self-attention is applied to each local sequence group for local feature encoding. In the parallel grid attention branch, the N' input superpoints are gridded: the superpoint features at corresponding positions in the consecutive windows are extracted in parallel to form sparse global sequence groups with vector dimensions (w, N'/w, C/2), and geometric self-attention is applied to these groups for global feature encoding. The artificially added superpoint features are then removed. The target point cloud q is processed in the same way.
(5) Concatenate the local and global geometric features of the source point cloud computed in step (4) along the channel dimension and input them into a feature fusion module, which fuses the geometric information of the two scales and produces discriminative geometric features. The target point cloud q is processed in the same way to obtain its encoded features.
(6) Feed the fused multi-scale geometric features of the source and target point clouds jointly into a cross-attention module, which models the geometric continuity of the two point clouds and outputs the interacted features. Steps (4)-(6) are executed iteratively three times to obtain geometrically salient superpoint features for both point clouds.
(7) Input the superpoint features output by step (6) into the superpoint matching module. A Gaussian kernel function is used to compute the superpoint covariance matrix S between the source and target point clouds, where γ is a constant and the kernel is evaluated on the squared Euclidean distance between features. Applying bidirectional normalization to the covariance matrix S yields the superpoint similarity matrix.
(8) Using the dense point-pair construction module, obtain the dense point-pair relationship from the superpoint similarity matrix of step (7) through a point-to-node aggregation strategy.
(9) From the dense point-pair relationship obtained in step (8), compute the rigid transformation between the point cloud pair using the RANSAC algorithm.
Further, in step (2) the window weight prior module provides local saliency prior knowledge and perceives overlap boundaries. Specifically, the i-th group of features in the padded patch set is fed into the local weight encoder to obtain encoded features. All local features are then aggregated in a near-additive manner, giving an aggregated feature set,
where X_p is the set of local features, whose elements are the features of the j-th point within the i-th window;
SoftMax is then used to normalize the local features to obtain the multi-window weight set ω. The weight set ω is multiplied with the input feature set along the window dimension, and the product is fused residually with the input feature set as additional embedded information, yielding the local importance prior knowledge. The specific formula is as follows:
where ReLU (Rectified Linear Unit) is the nonlinear activation layer and LN denotes a linear mapping.
Further, in step (4) a two-axis parallel Transformer module is designed. By decomposition, the module distributes the full-scale attention computation onto a block spatial axis and a grid spatial axis; it can represent both dense local and sparse global features and has linear complexity. Specifically, the input feature Z_p is split along the channel dimension into two branches. The block branch divides the features by window size w into consecutive local sequence groups; the grid branch extracts, in parallel, the superpoint features at corresponding positions in the consecutive feature windows to obtain sparse global sequence groups. Geometric self-attention then encodes the dense local and sparse global features on the two branches respectively, and the encoded features are concatenated again along the channel dimension, as follows:
where CAT denotes feature concatenation along the channel dimension, BlockAtten denotes the geometric self-attention computation in the block branch, GridAtten denotes the geometric self-attention computation in the grid branch, MLP (Multi-Layer Perceptron) denotes a multi-layer perceptron, Normal denotes the feature normalization operation, and Move denotes removal of the artificially added superpoint features.
Further, in step (4) the geometric representation used by the geometric self-attention computation consists of the Euclidean distance D_{a,b} between any two superpoints a and b, the angle between their normal vectors, and a bidirectional spatial angular distance BSD_{a,b}. The formulas are as follows:
D_{a,b} = ‖a - b‖
where the two vectors in the angle term are the normal vectors at the respective points; the target point cloud q is processed in the same way;
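The first two geometric quantities can be illustrated with a minimal sketch. It assumes the normal-vector angle is the usual arccosine of the normalized dot product; the function names are hypothetical, and the bidirectional spatial angular distance BSD is not covered because its definition is not given in the text.

```python
import math

def euclidean_distance(a, b):
    """D_{a,b} = ||a - b|| for two 3D superpoint coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normal_angle(n_a, n_b):
    """Angle in radians between two non-zero normal vectors (assumed definition)."""
    dot = sum(x * y for x, y in zip(n_a, n_b))
    norm_a = math.sqrt(sum(x * x for x in n_a))
    norm_b = math.sqrt(sum(x * x for x in n_b))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)

a, b = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
n_a, n_b = (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)
print(euclidean_distance(a, b))  # 5.0
print(normal_angle(n_a, n_b))    # pi / 2 for orthogonal normals
```

Both quantities are unchanged when the same rigid transformation is applied to the points and their normals, which is what makes them suitable inputs for a transformation-invariant embedding.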
The geometric embedding used by the geometric self-attention computation is expressed as
Further, in step (5) the local and global features of the source point cloud p, which have different scales, are fused by a feature fusion module. The feature fusion module is a global geometric self-attention, computed as follows:
where the inputs are the concatenated geometric features of the i-th and j-th superpoints and the output is the geometric feature of the i-th superpoint after feature fusion; W_Q, W_K, W_V are the mapping matrices of Z_p'; PE_{i,j} is the geometric embedding of the i-th and j-th superpoints; C_t denotes the feature dimension of Z_p'. The target point cloud q is processed in the same way.
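One attention form consistent with the symbols listed above is sketched below. It follows the common pattern of geometric self-attention in which a pairwise embedding is added on the key side. The placement of PE_{i,j} and the symbols x_i (concatenated feature of superpoint i), a_{i,j} (attention weight) and z_i (fused feature) are assumptions made for illustration, so this is a sketch and not necessarily the patent's exact formula.

```latex
a_{i,j} = \operatorname{softmax}_{j}\!\left( \frac{(x_i W_Q)\,(x_j W_K + PE_{i,j})^{\top}}{\sqrt{C_t}} \right),
\qquad
z_i = \sum_{j} a_{i,j}\,(x_j W_V)
```

Dividing by the square root of C_t keeps the scores at a comparable scale as the feature dimension grows, as in standard scaled dot-product attention.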
Further, in step (8) the dense point-pair relationship is constructed through a point-to-node aggregation strategy. Specifically, a K-nearest-neighbor algorithm assigns to each superpoint 128 dense points output by the backbone network decoder. For the i-th group of local dense point pairs of the source point cloud p and the target point cloud q, their feature matrices are multiplied to obtain a correlation matrix. Because not every point has a correspondence, a learnable vector B is appended as the last row and last column of the correlation matrix to give an expanded matrix. The Sinkhorn algorithm is then run for 100 iterations on the expanded matrix to obtain local point-pair matching scores, the appended vector B is removed, and the 128 bidirectionally highest-scoring point pairs are selected as the local point-pair relationship. The resulting local point-pair sets are merged to obtain the global point-pair relationship.
Further, the point cloud registration model, composed of the backbone network, the superpoint information interaction module, the superpoint matching module and the dense point-pair construction module, is trained; the superpoint information interaction module consists of the window weight prior module and the multi-axis Transformer. In the training stage, the model is trained on the 3DMatch point cloud database with the Adam optimization algorithm, with an initial learning rate of 0.0001, a learning-rate decay of 0.95, a weight decay of 0.000001 and a learning-rate decay step of 1. When a point cloud is input into the registration model, the initial voxel size is 2.5 cm and the batch size in the experiments is 1; data augmentation includes Gaussian noise, random rotation and translation perturbation, and point cloud scaling. Training on 3DMatch runs for 40 epochs; the window size is set to 4 in the window weight prior module and to 4 in the two-axis parallel Transformer module, which is executed 3 times in succession.
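The learning-rate settings above can be made concrete with a small sketch. It assumes that a "decay step of 1" means the rate is multiplied by 0.95 once per epoch; the function name is hypothetical and no particular training framework is implied.

```python
def learning_rate(epoch, initial_lr=1e-4, decay=0.95, step=1):
    """Learning rate after `epoch` completed epochs under step-wise exponential decay."""
    return initial_lr * decay ** (epoch // step)

# Schedule for the 40 training epochs stated for 3DMatch.
schedule = [learning_rate(e) for e in range(40)]
print(schedule[0])    # initial learning rate
print(schedule[-1])   # learning rate during the last epoch
```

Under this reading the rate in the final epoch is the initial rate multiplied by 0.95 to the power 39, roughly 14% of its starting value.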
According to a second aspect of the present specification, there is provided a multi-axis Transformer point cloud registration device based on a local importance prior, comprising: a memory, a processor, and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method according to the first aspect.
According to a third aspect of the present description, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
1. A window weight prior module is designed; it reduces the model's attention to registration-irrelevant and ambiguous regions and improves training efficiency.
2. A multi-axis parallel Transformer module is designed; it extracts geometrically discriminative local and global features in a divide-and-conquer manner, which strengthens the representation of point cloud geometry and improves registration accuracy.
3. A geometric encoding based on surface normal vectors is designed, which strengthens the model's geometric perception of point clouds.
4. Experimental results on the public 3DMatch and 3DLoMatch databases show that, compared with other advanced methods, the method extracts sufficiently discriminative point cloud features and achieves higher registration accuracy.
Drawings
Fig. 1 is a schematic diagram of the overall framework of the point cloud registration method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the structure of the window weight prior module according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the datasets used by the point cloud registration method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the output of an embodiment of the present invention on the 3DMatch and 3DLoMatch datasets.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to the appended drawings.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
The embodiment of the invention provides a multi-axis Transformer point cloud registration method based on a local importance prior, which, as shown in Fig. 1, comprises the following steps:
1. Extracting initial coarse point cloud features with the backbone network
The input consists of an indoor-scene source point cloud p and a target point cloud q. The source point cloud p is used below to illustrate how the network processes a point cloud; the target point cloud q is processed identically. The source point cloud p is input into the weight-sharing KPConv-FPN backbone network, which outputs a superpoint sequence and the corresponding coarse features, where N is the number of superpoints and C is the superpoint feature dimension.
2. Highlighting the registration-effective regions
The coarse features obtained in step 1 are input into the window weight prior module shown in Fig. 2. The superpoints are divided, according to their physical positions, into non-overlapping regions with a fixed window size w, giving a set of local patches. Any patch that does not fill a window of size w is padded, giving an updated, padded patch set. The padded patch set is then fed into a local weight encoder that aggregates local geometric information to produce a multi-window weight set ω. The weight set ω is multiplied with the input feature set along the window dimension, and the product is fused residually with the input feature set as additional embedded information, yielding the local importance prior knowledge. The target point cloud q is processed in the same way. A masking operation is then applied to the padded patch set, that is, the artificially padded information is removed.
Specifically, the i-th group of features in the padded patch set is fed into the local weight encoder to obtain encoded local features. All local features are then aggregated in a near-additive manner, giving an aggregated feature set.
SoftMax is then used to normalize the local features to obtain the multi-window weight set ω. The weight set ω is multiplied with the input feature set along the window dimension, and the product is fused residually with the input feature set as additional embedded information, yielding the local importance prior knowledge. The specific formula is as follows:
where LN denotes a linear mapping and ReLU is the activation function.
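The windowing, padding and masking steps of this module can be sketched in a few lines. This is a simplified illustration under stated assumptions: the superpoints are taken to be already ordered by physical position, the local weight encoder is replaced by a toy per-window sum followed by a softmax across windows, and the residual fusion is taken to be f + ω·f. All function names are hypothetical, and the actual encoder in the invention is a learned network.

```python
import math

def partition_windows(features, w):
    """Split feature vectors into windows of size w, zero-padding the last window.

    Returns (windows, mask); mask[i][j] is True for real superpoints, False for padding.
    """
    dim = len(features[0])
    pad = (-len(features)) % w
    padded = features + [[0.0] * dim for _ in range(pad)]
    flags = [True] * len(features) + [False] * pad
    windows = [padded[i:i + w] for i in range(0, len(padded), w)]
    mask = [flags[i:i + w] for i in range(0, len(flags), w)]
    return windows, mask

def window_weights(windows, mask):
    """Toy stand-in for the local weight encoder: softmax over per-window sums."""
    scores = [sum(sum(f) for f, keep in zip(win, m) if keep)
              for win, m in zip(windows, mask)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_prior(windows, mask, weights):
    """Residual fusion f + weight * f (assumed form), then drop padded entries."""
    fused = []
    for win, m, wt in zip(windows, mask, weights):
        for f, keep in zip(win, m):
            if keep:
                fused.append([x + wt * x for x in f])
    return fused

feats = [[float(i), 1.0] for i in range(6)]  # N = 6 superpoints, C = 2
windows, mask = partition_windows(feats, w=4)
omega = window_weights(windows, mask)
fused = apply_prior(windows, mask, omega)
print(len(windows), mask[-1], len(fused))  # 2 [True, True, False, False] 6
```

Carrying the mask alongside the padded windows is what allows the padded entries to be excluded from the weights and removed again afterwards, so that they do not contaminate the real superpoint features.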
3. Extracting local and global geometric features
The superpoint features of the source point cloud p, fused with the local importance prior knowledge in step 2, are padded once more to obtain a padded sequence of N' superpoints. In the block attention branch of the multi-axis Transformer, the N' superpoints are divided by window size w into a series of consecutive, non-overlapping local sequence groups with vector dimensions (N'/w, w, C/2), and geometric self-attention is applied to each local sequence group for local feature encoding. In the parallel grid attention branch, the N' input superpoints are gridded: the superpoint features at corresponding positions in the consecutive windows are extracted in parallel to form sparse global sequence groups with vector dimensions (w, N'/w, C/2), and geometric self-attention is applied to these groups for global feature encoding. The artificially added superpoint features are then removed. The target point cloud q is processed in the same way.
Specifically, a two-axis parallel Transformer module is designed. By decomposing the full-scale attention computation onto a block spatial axis and a grid spatial axis, the module can encode both dense local and sparse global features and has linear complexity. The input feature Z_p is split along the channel dimension into two branches. The block branch divides the features by window size w into consecutive, non-overlapping local sequence groups; the grid branch extracts, in parallel, the superpoint features at corresponding positions in the consecutive feature windows to obtain sparse global sequence groups. Geometric self-attention then encodes the dense local and sparse global features on the two branches respectively, and the encoded features are concatenated again along the channel dimension, as follows:
where CAT denotes feature concatenation along the channel dimension, BlockAtten denotes the geometric self-attention computation in the block branch, GridAtten denotes the geometric self-attention computation in the grid branch, MLP denotes a multi-layer perceptron, Normal denotes the feature normalization operation, and Move denotes removal of the artificially added superpoint features.
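The two partitions can be illustrated on a sequence of superpoint indices. The sketch below only shows how the groups are formed, giving shape (N'/w, w) for the block branch and (w, N'/w) for the grid branch; the attention itself and the channel split are omitted, and the function names are hypothetical.

```python
def block_partition(seq, w):
    """Group consecutive superpoints into local windows: shape (N'/w, w)."""
    assert len(seq) % w == 0, "sequence must be padded to a multiple of w"
    return [seq[i:i + w] for i in range(0, len(seq), w)]

def grid_partition(seq, w):
    """Gather the k-th element of every window: shape (w, N'/w)."""
    assert len(seq) % w == 0, "sequence must be padded to a multiple of w"
    return [seq[k::w] for k in range(w)]

def grid_merge(groups):
    """Inverse of grid_partition: restore the original superpoint order."""
    return [x for column in zip(*groups) for x in column]

seq = list(range(8))            # indices of N' = 8 padded superpoints
print(block_partition(seq, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(grid_partition(seq, 4))   # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

Each block group contains neighbouring superpoints, while each grid group contains superpoints at the same offset in every window and therefore spans the whole sequence. That difference is what lets the two branches capture dense local and sparse global context respectively.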
4. Fusing local and global geometric features
The local and global geometric features of the source point cloud computed in step 3 are concatenated along the channel dimension to obtain a new feature Z_p'. This feature is fed into a feature fusion module, yielding discriminative geometric features. The target point cloud q is processed in the same way to obtain its geometric features. The feature fusion module is a global geometric self-attention, computed as follows:
where the inputs are the concatenated geometric features of the i-th and j-th superpoints and the output is the geometric feature of the i-th superpoint after feature fusion; W_Q, W_K, W_V are the mapping matrices of Z_p'; PE_{i,j} is the geometric embedding of the i-th and j-th superpoints; C_t denotes the feature dimension of Z_p'. The target point cloud q is processed in the same way.
5. Modeling geometric continuity of point cloud pairs
The fused multi-scale geometric features of the source point cloud and the target point cloud are fed jointly into a cross-attention module, which models the geometric continuity of the two point clouds and outputs the interacted features. Steps 3 to 5 are executed iteratively three times to obtain geometrically salient superpoint features for both point clouds.
6. Super point matching
Based on the features obtained from the interaction of the two point clouds in step 5, a Gaussian kernel function is used to compute the superpoint covariance matrix S between the source point cloud and the target point cloud, where γ is a constant and the kernel is evaluated on the squared Euclidean distance between features. Applying bidirectional normalization to the covariance matrix S yields the superpoint similarity matrix.
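A minimal sketch of this matching step follows. It assumes the kernel has the form exp(-d²/γ) and that bidirectional normalization multiplies the row-normalized and column-normalized entries; both forms, and the function names, are assumptions made for illustration rather than the exact formulas of the invention.

```python
import math

def gaussian_correlation(src, tgt, gamma=1.0):
    """S[i][j] = exp(-||src_i - tgt_j||^2 / gamma); the placement of gamma is assumed."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(f, g)) / gamma)
             for g in tgt] for f in src]

def dual_normalize(S):
    """S_bar[i][j] = (S[i][j] / row_sum_i) * (S[i][j] / col_sum_j)."""
    row_sums = [sum(row) for row in S]
    col_sums = [sum(col) for col in zip(*S)]
    return [[(s / row_sums[i]) * (s / col_sums[j]) for j, s in enumerate(row)]
            for i, row in enumerate(S)]

src = [[1.0, 0.0], [0.0, 1.0]]               # two source superpoint features
tgt = [[0.0, 1.0], [1.0, 0.0], [0.7, 0.7]]   # three target superpoint features
S_bar = dual_normalize(gaussian_correlation(src, tgt))
best = [max(range(len(tgt)), key=lambda j: row[j]) for row in S_bar]
print(best)  # [1, 0]
```

Normalizing in both directions lowers the score of a target superpoint that is moderately similar to many source superpoints, such as the third target feature here, so that unambiguous pairs stand out.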
7. Constructing the dense point-pair relationship
The dense point-pair relationship is obtained from the superpoint similarity matrix of step 6 through a point-to-node aggregation strategy.
Specifically, a K-nearest-neighbor algorithm assigns to each superpoint K dense points output by the backbone network decoder. For the i-th group of local dense point pairs of point clouds p and q, their feature matrices are multiplied to obtain a correlation matrix. Because not every point has a correspondence, a learnable vector B is appended as the last row and last column of the correlation matrix to give an expanded matrix. The Sinkhorn algorithm is then run for 100 iterations on the expanded matrix to obtain local point-pair matching scores, the appended vector B is removed, and the M bidirectionally highest-scoring point pairs are selected as the local point-pair relationship. The resulting local point-pair sets are merged to obtain the global point-pair relationship.
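The Sinkhorn step with the extra row and column can be sketched as follows. This is a simplified log-domain variant: the learnable vector B is replaced by a fixed scalar `dustbin`, the marginals follow a common convention in which the extra row and column can absorb all unmatched points, and mutual best matches stand in for selecting the M highest-scoring pairs. These choices and the function names are assumptions made for illustration.

```python
import math

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def sinkhorn_with_dustbin(scores, dustbin=0.0, iters=100):
    """Log-domain Sinkhorn on a score matrix extended by one extra row and column.

    Returns log assignment scores of shape (m + 1, n + 1).
    """
    m, n = len(scores), len(scores[0])
    ext = [row + [dustbin] for row in scores] + [[dustbin] * (n + 1)]
    # Each real point carries mass 1; the extra row/column can absorb the other side.
    log_mu = [0.0] * m + [math.log(n)]
    log_nu = [0.0] * n + [math.log(m)]
    u = [0.0] * (m + 1)
    v = [0.0] * (n + 1)
    for _ in range(iters):
        u = [log_mu[i] - logsumexp([ext[i][j] + v[j] for j in range(n + 1)])
             for i in range(m + 1)]
        v = [log_nu[j] - logsumexp([ext[i][j] + u[i] for i in range(m + 1)])
             for j in range(n + 1)]
    return [[ext[i][j] + u[i] + v[j] for j in range(n + 1)] for i in range(m + 1)]

def mutual_matches(log_p):
    """Drop the extra row/column and keep pairs that are each other's best match."""
    core = [row[:-1] for row in log_p[:-1]]
    row_best = [max(range(len(r)), key=lambda j: r[j]) for r in core]
    col_best = [max(range(len(c)), key=lambda i: c[i]) for c in zip(*core)]
    return [(i, j) for i, j in enumerate(row_best) if col_best[j] == i]

scores = [[5.0, 0.0, 0.0],
          [0.0, 4.0, 0.0]]   # 2 source points, 3 target points
log_p = sinkhorn_with_dustbin(scores)
print(mutual_matches(log_p))  # [(0, 0), (1, 1)]
```

The third target point has no strong partner, so most of its mass ends up in the extra row and it produces no match, which is the purpose of appending that row and column.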
8. Predicting the transformation between the point cloud pair
The rigid transformation between the point cloud pair is computed with the RANSAC algorithm from the global point-pair relationship obtained in step 7.
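The RANSAC step can be summarized in formula form. This is the standard formulation of RANSAC for rigid registration, given as a sketch. The correspondence set C, the hypothesis (R_h, t_h) fitted to a random minimal sample, the inlier threshold τ and the final least-squares refit are symbols and choices introduced here for illustration; the text does not specify them.

```latex
\mathcal{I}_h = \{ (x_i, y_i) \in \mathcal{C} : \lVert R_h x_i + t_h - y_i \rVert_2 < \tau \},
\qquad
h^{*} = \arg\max_{h} \lvert \mathcal{I}_h \rvert

(R^{*}, t^{*}) = \arg\min_{R \in SO(3),\; t \in \mathbb{R}^{3}}
\sum_{(x_i, y_i) \in \mathcal{I}_{h^{*}}} \lVert R\, x_i + t - y_i \rVert_2^{2}
```

Scoring hypotheses by inlier count makes the estimate robust to the wrong correspondences that inevitably remain after matching, since outliers do not contribute to the final fit.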
Fig. 3 shows two sets of point cloud data from each of the indoor scene datasets 3DMatch and 3DLoMatch, with the source point cloud in yellow and the target point cloud in blue. The left image shows two sets of indoor scene point cloud data from 3DMatch, where the overlap ratio is above 30%; the right image shows two sets from 3DLoMatch, where the overlap ratio is between 10% and 30%.
Fig. 4 shows three sets of registration results on 3DLoMatch indoor scene point clouds. The first column is the input indoor scene source and target point clouds, shown in blue and yellow respectively. The second column is the registration result of the existing GeoTransformer, the third column is the registration result of the present invention, and the last column is the registration ground truth. Compared with other advanced methods, the method extracts sufficiently discriminative point cloud features and achieves higher registration accuracy.
In one embodiment, a multi-axis Transformer point cloud registration device based on a local importance prior is provided, comprising a memory, a processor, and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the above-described multi-axis Transformer point cloud registration method based on a local importance prior.
In one embodiment, a storage medium storing computer-readable instructions is provided; when executed by one or more processors, the instructions cause the one or more processors to perform the steps of the multi-axis Transformer point cloud registration method based on a local importance prior in the above embodiments. The storage medium may be a non-volatile storage medium.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention to the particular embodiments described.

Claims (9)

1. A multi-axis Transformer point cloud registration method based on a local importance prior, characterized in that the method is used to register indoor scene point clouds to obtain a complete indoor scene point cloud, and comprises the following steps:
(1) Using a KPConv-FPN network as the backbone network, input the source point cloud p of an indoor scene into the backbone network and output a superpoint sequence and the corresponding coarse features, where N is the number of superpoints and C is the superpoint feature dimension; the target point cloud q of the indoor scene is processed in the same way;
(2) Input the coarse features obtained in step (1) into the window weight prior module; the superpoints are divided, according to their physical positions, into non-overlapping regions with a fixed window size w, giving a set of local patches; any patch that does not fill a window of size w is padded, giving an updated, padded patch set; the padded patch set is then fed, as the input feature set, into a local weight encoder that aggregates local geometric information to produce a multi-window weight set ω; the weight set ω is multiplied with the input feature set along the window dimension, and the product is fused residually with the input feature set as additional embedded information, yielding the local importance prior knowledge; the target point cloud q is processed in the same way;
(3) Apply a masking operation to the patch set padded in step (2) to remove the interference of the artificially added information with the features; the target point cloud q is processed in the same way;
(4) Pad the superpoint features of the source point cloud p, fused with the local importance prior knowledge in step (3), once more to obtain a padded sequence of N' superpoints; in the block attention branch of the multi-axis Transformer, the N' superpoints are divided by window size w into a series of consecutive, non-overlapping local sequence groups, and geometric self-attention is applied to each local sequence group for local feature encoding; in the parallel grid attention branch, the N' input superpoints are gridded, the superpoint features at corresponding positions in the consecutive windows are extracted in parallel to form sparse global sequence groups, and geometric self-attention is applied to these groups for global feature encoding; the target point cloud q is processed in the same way; the artificially added superpoint features are then removed;
(5) After the local and global geometric features of the source point cloud calculated in the step (4) are spliced in the channel dimension, inputting the spliced local and global geometric features into a feature fusion module to fuse geometric information of the two different scales, and obtaining geometric features with identificationThe calculation modes of the target point cloud q are the same, and the geometric feature +.>
(6) Geometrical characteristics of the source point cloud and the target point cloud obtained through fusionAnd->The geometric continuity of two point clouds is modeled by being jointly fed into a cross attention module, and the characteristic +.>And->Iteratively performing steps (4) - (6) several times to obtain more pronounced super-point features ++>And->
(7) The super point feature output in the step (6) is processedAnd->Inputting a super-point matching module, calculating a super-point covariance matrix S of a source point cloud and a target point cloud by using a Gaussian kernel function, and calculating a super-point similarity matrix S by using bidirectional standardization on the covariance matrix S>
(8) Adopting a dense point pair construction module, and obtaining a dense point pair relation by utilizing a point-to-node aggregation strategy according to the super point similarity matrix obtained in the step (7);
(9) The rigid-body transformation between the point cloud pair is computed with the RANSAC algorithm from the dense point correspondences obtained in step (8).
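Steps (7) and (9) can be illustrated with a minimal NumPy sketch. It is not the claimed implementation: the function names, the L2 normalisation of the features, the unit kernel bandwidth, the three-point minimal sample, the inlier threshold and the iteration count are all assumptions made for illustration, and the SVD-based (Kabsch) fit is a standard choice that the claims do not name.

```python
import numpy as np

def superpoint_similarity(feat_src, feat_tgt):
    """Gaussian-kernel correlation followed by bidirectional normalisation."""
    # Assumption: features are L2-normalised before the kernel is applied.
    f_s = feat_src / np.linalg.norm(feat_src, axis=1, keepdims=True)
    f_t = feat_tgt / np.linalg.norm(feat_tgt, axis=1, keepdims=True)
    sq_dist = ((f_s[:, None, :] - f_t[None, :, :]) ** 2).sum(-1)
    s = np.exp(-sq_dist)  # Gaussian kernel, values in (0, 1]
    # Normalise along rows and along columns, then combine both directions.
    return (s / s.sum(axis=1, keepdims=True)) * (s / s.sum(axis=0, keepdims=True))

def kabsch(src, tgt):
    """Least-squares rotation r and translation t such that tgt ~ src @ r.T + t."""
    c_s, c_t = src.mean(axis=0), tgt.mean(axis=0)
    h = (src - c_s).T @ (tgt - c_t)          # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, c_t - r @ c_s

def ransac_rigid(src, tgt, iters=200, thresh=0.05, seed=0):
    """RANSAC over point correspondences src[i] <-> tgt[i]."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)  # minimal sample
        r, t = kabsch(src[idx], tgt[idx])
        err = np.linalg.norm(src @ r.T + t - tgt, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the largest consensus set.
    return kabsch(src[best_inliers], tgt[best_inliers])
```

Here `ransac_rigid` expects the two arrays to be row-aligned correspondences, such as those produced by step (8).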
2. The method according to claim 1, wherein the local importance prior knowledge and the perceived overlap boundary are provided in step (2) by a window weight prior module, specifically: the i-th group of features in the input feature set is fed into a local weight encoder to obtain the encoded local features; subsequently, all local features are aggregated by neighbour-wise addition, resulting in an aggregated feature set:
where X_p is the set of local features, whose elements are the features of the j-th point within the i-th window;
SoftMax is then used to normalize the local features to obtain a multi-window weight set ω. The weight set ω is multiplied with the input feature set along the window dimension, and the result is fused with the input feature set through a residual connection as additional embedded information, thereby obtaining the local importance prior knowledge. The specific formula is as follows:
where LN denotes a linear mapping and ReLU is the activation function.
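The window weight prior of claim 2 can be sketched as follows. Because the formula itself is not reproduced in this text, the tensor shapes, the single linear layer with ReLU used as the local weight encoder, the summation inside each window, and the softmax taken across windows are all assumptions; the names `window_weight_prior`, `w_enc` and `b_enc` are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_weight_prior(x, w_enc, b_enc):
    """x: (G, w, C) super-point features grouped into G windows of size w.

    w_enc: (C, 1) and b_enc: (1,) are the parameters of the assumed encoder.
    Returns the re-weighted features and the per-window weights omega.
    """
    encoded = np.maximum(x @ w_enc + b_enc, 0.0)  # linear map + ReLU -> (G, w, 1)
    agg = encoded.sum(axis=1)                     # additive aggregation -> (G, 1)
    omega = softmax(agg, axis=0)                  # one weight per window
    out = x + omega[:, None, :] * x               # weighting plus residual fusion
    return out, omega
```

With an untrained (all-zero) encoder every window receives the same weight 1/G, so the module reduces to a uniform rescaling of the input.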
3. The method of claim 1, wherein in step (4) a two-axis parallel Transformer module is designed which, by decomposing full attention onto a block spatial axis and a grid spatial axis, is able to encode dense local features and sparse global features with linear complexity, specifically: the input feature Z_p is split into two branches along the channel dimension; the block branch divides the features into consecutive non-overlapping local sequence groups according to the window size w; the grid branch extracts the super-point features at corresponding positions in the consecutive feature windows in parallel to obtain a sparse global sequence group; on these two branches, the dense local and sparse global features are encoded with geometric self-attention, respectively, and the encoded features are then concatenated again along the channel dimension, as follows:
where CAT denotes feature concatenation along the channel dimension, BlockAtten denotes the geometric self-attention computation in the block branch, GridAtten denotes the geometric self-attention computation in the grid branch, MLP denotes a multi-layer perceptron, Normal denotes the feature normalization operation, and Move denotes removal of the artificially added super-point feature sequence.
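The block and grid partitioning described in claim 3 amounts to two different reshapes of the padded super-point sequence. The sketch below shows only the grouping and its inverse, not the attention itself; zero padding and all function names are assumptions made for illustration.

```python
import numpy as np

def pad_to_multiple(x, w):
    """Append zero rows so that the sequence length is a multiple of w."""
    pad = (-len(x)) % w
    filler = np.zeros((pad,) + x.shape[1:], dtype=x.dtype)
    return np.concatenate([x, filler]), pad

def block_partition(x, w):
    """(N, C) -> (N // w, w, C): consecutive, non-overlapping local groups."""
    return x.reshape(-1, w, x.shape[-1])

def grid_partition(x, w):
    """(N, C) -> (w, N // w, C): group k holds position k of every window."""
    return block_partition(x, w).transpose(1, 0, 2)

def block_merge(groups):
    return groups.reshape(-1, groups.shape[-1])

def grid_merge(groups):
    return groups.transpose(1, 0, 2).reshape(-1, groups.shape[-1])

def remove_padding(x, pad):
    """Drop the artificially added rows (the Move step)."""
    return x[: len(x) - pad]
```

For a sequence of 8 super-points and w = 4, the block groups are {0,1,2,3} and {4,5,6,7}, while the grid groups are {0,4}, {1,5}, {2,6} and {3,7}. Attention inside each group therefore costs a fixed amount per group, which is where the linear complexity comes from.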
4. A method according to claim 3, wherein in step (4), the geometric representation used in the geometric self-attention calculation consists of the Euclidean distance D_{a,b} between any two super-points a and b, the angle between their normal vectors, and a bidirectional spatial angular distance BSD_{a,b}, with the formulas as follows:
D_{a,b} = ‖a − b‖
where the two normal-vector symbols denote the normal vectors at points a and b, respectively; the target point cloud q is processed in the same way;
the geometric embedding used in the geometric self-attention calculation is expressed as
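Two of the three geometric quantities in claim 4 can be computed directly. The sketch below covers the pairwise Euclidean distance and the angle between normals, taken here as the arccosine of the normalised dot product, which is an assumption since the angle formula is not reproduced in this text. The bidirectional spatial angular distance is omitted because its definition is not recoverable here.

```python
import numpy as np

def pairwise_distance(points):
    """D[a, b] = ||points[a] - points[b]|| for an (N, 3) array."""
    return np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

def normal_angle(normals):
    """Angle in radians between the normals of every pair of super-points."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos = np.clip(n @ n.T, -1.0, 1.0)  # clip guards against rounding error
    return np.arccos(cos)
```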
5. A method according to claim 3, wherein in step (5) the local and global features of different scales of the source point cloud p are fused by a feature fusion module; the feature fusion module is a global geometric self-attention, computed as follows:
where the inputs are the concatenated geometric features of the i-th and j-th super-points, and the output is the geometric feature of the i-th super-point after feature fusion; W_Q, W_K, W_V are the projection matrices applied to Z_p'; PE_{i,j} is the geometric embedding between the i-th and j-th super-points; C_t is the feature dimension of Z_p'; the target point cloud q is processed in the same way.
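A single-head geometric self-attention using the symbols of claim 5 might look like the sketch below. The exact formula is not reproduced in this text, so the way PE_{i,j} enters the score (added to the key before the dot product with the query) and the scaling by the square root of C_t are assumptions; the function name is hypothetical.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def geometric_self_attention(z, pe, w_q, w_k, w_v):
    """z: (N, C) super-point features, pe: (N, N, C) pairwise embeddings."""
    c_t = z.shape[-1]
    q, k, v = z @ w_q, z @ w_k, z @ w_v
    # score[i, j] = q_i . (k_j + pe[i, j]) / sqrt(C_t)   (assumed form)
    scores = (q @ k.T + np.einsum("ic,ijc->ij", q, pe)) / np.sqrt(c_t)
    attn = softmax(scores, axis=1)  # normalise over j for each i
    return attn @ v
```

Because every super-point attends to every other one, this fusion step is quadratic in the number of super-points, in contrast to the windowed branches that precede it.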
6. The method according to claim 1, wherein in step (8), the dense point correspondences are constructed using a point-to-node aggregation strategy, specifically:
a K-nearest-neighbour algorithm assigns K dense points output by the backbone network decoder to each super-point, and the feature matrices of the i-th pair of local dense point groups of the source point cloud p and the target point cloud q are multiplied to obtain a correlation matrix. Considering that not all points have correspondences, a learnable vector B is appended as the last row and the last column of the correlation matrix to obtain an extended matrix. The Sinkhorn algorithm is then iterated several times on the extended matrix to obtain the local point matching scores. After the added vector B is removed, the M mutually highest-scoring point pairs are selected to obtain the local point correspondences, and the local point correspondences of all groups are collected and merged to obtain the global point correspondences.
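The local matching of claim 6 can be sketched with a log-domain Sinkhorn normalisation. This is a simplified illustration: the learnable vector B is replaced by a single fixed scalar `b`, the iteration count is arbitrary, plain alternating row and column normalisation is used, and the function names are hypothetical.

```python
import numpy as np

def logsumexp(x, axis):
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def sinkhorn_with_dustbin(corr, b=0.0, iters=50):
    """corr: (K, L) correlation matrix of one local patch pair."""
    k, l = corr.shape
    ext = np.full((k + 1, l + 1), b, dtype=float)  # extra row/column for unmatched points
    ext[:k, :l] = corr
    for _ in range(iters):
        ext = ext - logsumexp(ext, axis=1)  # normalise rows
        ext = ext - logsumexp(ext, axis=0)  # normalise columns
    return np.exp(ext[:k, :l])              # drop the added row and column

def mutual_top_matches(scores, m):
    """Keep pairs that are each other's best match, then the m best of those."""
    row_best = scores.argmax(axis=1)
    col_best = scores.argmax(axis=0)
    pairs = [(i, int(j)) for i, j in enumerate(row_best) if col_best[j] == i]
    pairs.sort(key=lambda ij: -scores[ij])
    return pairs[:m]
```

Running `mutual_top_matches` on every patch pair and concatenating the results gives the global correspondence set that is passed to the transformation estimator.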
7. The method of claim 1, wherein a point cloud registration model consisting of a backbone network, a super-point information interaction module, a super-point matching module and a dense point pair construction module is trained, the super-point information interaction module consisting of a window weight prior module and a multi-axis Transformer; in the training stage, the Adam optimization algorithm is used with an initial learning rate of 0.0001, a learning rate decay of 0.95, a learning rate decay step size of 1 and a weight decay of 0.000001, and training is performed on the 3DMatch indoor scene point cloud database; when a point cloud is input into the point cloud registration model, the voxel size is initialized to 2.5 cm, the batch size in the experiment is 1, and data augmentation is applied, including Gaussian noise, random rotation and translation perturbation, and point cloud scaling; the model is trained on 3DMatch for 40 epochs, the window size is set to 4 in the window weight prior module, and the window size is set to 4 in the two-axis parallel Transformer module, which is executed 3 times in succession.
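The hyperparameters of claim 7 can be collected in a small configuration, together with the learning rate they imply for each epoch. The dictionary keys and the function name are hypothetical, and reading the decay step size of 1 as "once per epoch" is an assumption.

```python
# Values taken from claim 7; key names are illustrative only.
TRAIN_CONFIG = {
    "optimizer": "Adam",
    "initial_lr": 1e-4,
    "lr_decay": 0.95,
    "lr_decay_step": 1,       # assumed to be measured in epochs
    "weight_decay": 1e-6,
    "epochs": 40,
    "batch_size": 1,
    "voxel_size_m": 0.025,    # 2.5 cm
    "window_size": 4,
    "transformer_repeats": 3,
}

def lr_at_epoch(epoch, cfg=TRAIN_CONFIG):
    """Step-decayed learning rate: lr0 * decay ** (epoch // step)."""
    return cfg["initial_lr"] * cfg["lr_decay"] ** (epoch // cfg["lr_decay_step"])

schedule = [lr_at_epoch(e) for e in range(TRAIN_CONFIG["epochs"])]
```

Under this reading the learning rate falls from 1e-4 in the first epoch to roughly 1.35e-5 in the last.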
8. A multi-axis Transformer point cloud registration device based on a local importance prior, comprising: a memory, a processor, and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of claims 1-7.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon which, when executed by a processor, implements the method according to any of claims 1-7.
CN202410075232.7A 2024-01-18 2024-01-18 Multi-axis transducer point cloud registration method, device and medium based on local importance priori Pending CN117853748A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410075232.7A CN117853748A (en) 2024-01-18 2024-01-18 Multi-axis transducer point cloud registration method, device and medium based on local importance priori

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410075232.7A CN117853748A (en) 2024-01-18 2024-01-18 Multi-axis transducer point cloud registration method, device and medium based on local importance priori

Publications (1)

Publication Number Publication Date
CN117853748A true CN117853748A (en) 2024-04-09

Family

ID=90530395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410075232.7A Pending CN117853748A (en) 2024-01-18 2024-01-18 Multi-axis transducer point cloud registration method, device and medium based on local importance priori

Country Status (1)

Country Link
CN (1) CN117853748A (en)

Similar Documents

Publication Publication Date Title
US11200424B2 (en) Space-time memory network for locating target object in video content
Bloesch et al. Codeslam—learning a compact, optimisable representation for dense visual slam
US10679075B2 (en) Dense correspondence estimation with multi-level metric learning and hierarchical matching
CN112750140A (en) Disguised target image segmentation method based on information mining
Lee et al. Real-time depth estimation using recurrent CNN with sparse depth cues for SLAM system
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
CN112990112B (en) Edge-guided cyclic convolution neural network building change detection method and system
EP4246458A1 (en) System for three-dimensional geometric guided student-teacher feature matching (3dg-stfm)
CN111860823A (en) Neural network training method, neural network training device, neural network image processing method, neural network image processing device, neural network image processing equipment and storage medium
CN115035093A (en) Brain tumor self-supervision pre-training method and device based on attention symmetric self-coding
Hwang et al. Self-supervised monocular depth estimation using hybrid transformer encoder
Hwang et al. Lidar depth completion using color-embedded information via knowledge distillation
CN116597336A (en) Video processing method, electronic device, storage medium, and computer program product
CN114577196B (en) Lidar positioning using optical flow
Zeng et al. Deep stereo matching with hysteresis attention and supervised cost volume construction
CN114283265A (en) Unsupervised face correcting method based on 3D rotation modeling
Patel et al. Deep learning-enabled road segmentation and edge-centerline extraction from high-resolution remote sensing images
CN114792401A (en) Training method, device and equipment of behavior recognition model and storage medium
Cheng et al. Two-branch deconvolutional network with application in stereo matching
Yang et al. Learning both matching cost and smoothness constraint for stereo matching
CN116758212A (en) 3D reconstruction method, device, equipment and medium based on self-adaptive denoising algorithm
Cheng et al. Two-branch convolutional sparse representation for stereo matching
CN117853748A (en) Multi-axis transducer point cloud registration method, device and medium based on local importance priori
CN115359508A (en) Performing complex optimization tasks with increased efficiency by expert neuron optimization
Cheng et al. An unsupervised stereo matching cost based on sparse representation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination