CN118172398B - Point cloud registration method and system based on double-layer focusing-attention characteristic interaction - Google Patents
- Publication number
- CN118172398B (Application CN202410591494.9A)
- Authority
- CN
- China
- Prior art keywords
- point
- super
- points
- fine
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention provides a point cloud registration method and system based on double-layer focus-attention feature interaction, in the technical field of computer vision, comprising the following steps: sample the two point clouds to be registered at a coarse scale and a fine scale and extract local features, obtaining super points with corresponding coarse-scale features and fine points with corresponding fine-scale features; enhance the coarse-scale features of the super points through dense feature interaction and focus-attention feature interaction, and perform super-point matching under the dual constraints of feature space and geometric space to obtain super-point correspondences; apply focus-attention feature interaction and feature-similarity computation to the fine-scale features of the fine points to obtain fine-point correspondences; compute candidate transformations from the fine-point correspondences and select the final transformation between the two point clouds. The invention eliminates unreasonable interactions while preserving context awareness at both the super-point and point levels, so as to establish accurate correspondences.
Description
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a point cloud registration method and system based on double-layer focusing-attention characteristic interaction.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Point cloud registration plays an important role in graphics, computer vision, and robotics; it aims to estimate the relative transformation between two unaligned point cloud fragments and is a fundamental task in these fields. Benefiting from the vigorous development of deep learning, a series of learning-based methods have appeared that outperform traditional methods in both accuracy and speed. How to reduce the negative impact of repeated structures and low overlap on the accuracy and robustness of point cloud registration, however, remains a problem of widespread community interest.
To address this problem, early methods introduced an attention module over the coarse points (super points) to encode the contextual features of the point cloud, which significantly enhances the representational capability of the features. However, a plain attention mechanism ignores the impact of position encoding on information exchange and on the geometric discriminability of the features. Later, GeoTransformer introduced a geometric transformer over the super points, encoding relative position information comprising pairwise distances and triplet angles into the super-point features and enhancing their rotation invariance, thereby improving the matching precision between super points. Nevertheless, this method uses dense feature interactions, under whose influence the correct interactions are diluted, so that the features approach the average of the features of repeated regions; the interactions also inevitably introduce erroneous and redundant information exchange that interferes with the discriminability of the learned features. In addition, GeoTransformer ignores feature interactions between the fine points, propagating some accurate super-point correspondences into incorrect point correspondences.
Therefore, existing methods do not examine in depth the influence of coarse-scale and fine-scale feature interaction on registration performance, and cannot achieve effective feature aggregation at both scales during feature interaction, so geometric consistency and discriminability are damaged and registration performance suffers.
Disclosure of Invention
To overcome the above deficiencies of the prior art, the invention provides a point cloud registration method and system based on double-layer focus-attention feature interaction, which eliminates unreasonable interactions while preserving context awareness at the super-point and fine-point levels, so as to establish accurate correspondences.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
the first aspect of the invention provides a point cloud registration method based on double-layer focusing-attention characteristic interaction.
The point cloud registration method based on double-layer focusing-attention characteristic interaction comprises the following steps:
Sampling and extracting local features of two point clouds to be registered on a coarse scale and a fine scale respectively to obtain super points and corresponding coarse scale features, fine points and corresponding fine scale features;
Performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points;
focusing attention feature interaction and feature similarity calculation are carried out on fine scale features of the fine points based on the corresponding relation of the super points, so that the corresponding relation of the fine points is obtained;
calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between two point clouds from the candidate transformation;
Wherein the dual-layer focus-attention feature interactions include a focus attention feature interaction of coarse-scale features and a focus attention feature interaction of fine-scale features.
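The final step above, computing candidate transformations from fine-point correspondences and selecting the best, is classically realized with an SVD-based rigid fit (Kabsch) plus inlier counting. A minimal sketch under that assumption; the patent does not specify its exact estimator, and `kabsch`, `select_transform`, and the threshold `tau` are illustrative names and values:

```python
import numpy as np

def kabsch(src, tgt):
    """Estimate the rigid transform (R, t) aligning src -> tgt via SVD (Kabsch)."""
    src_c = src - src.mean(axis=0)
    tgt_c = tgt - tgt.mean(axis=0)
    H = src_c.T @ tgt_c                      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

def select_transform(candidates, src, tgt, tau=0.1):
    """Pick the candidate (R, t) with the most inlier correspondences."""
    best, best_inliers = None, -1
    for R, t in candidates:
        residual = np.linalg.norm((src @ R.T + t) - tgt, axis=1)
        inliers = int((residual < tau).sum())
        if inliers > best_inliers:
            best, best_inliers = (R, t), inliers
    return best
```

Each candidate would be fit on the fine points of one hypothesized super-point correspondence, and the final transform is the candidate that explains the most correspondences.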
Further, the super-point and the corresponding coarse scale feature are specifically:
Downsampling and local feature extraction are performed on the input point clouds {P, Q} using a backbone network to obtain the super points and their corresponding coarse-scale features.
Further, the fine points and the corresponding fine scale features are specifically:
The backbone network gradually upsamples the downsampled points to obtain the fine points, and extracts local features for the fine points to obtain the fine-scale features.
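Coarse and fine sampling of this kind is commonly implemented with voxel-grid downsampling at two resolutions, as in KPConv-style pipelines. A minimal numpy sketch of that assumed scheme; the actual backbone here is a learned network, and `voxel_downsample` is illustrative only:

```python
import numpy as np

def voxel_downsample(points, voxel):
    """Average the points falling into the same voxel; a coarser voxel
    size yields the sparser 'super point' level."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.reshape(-1)
    n = inv.max() + 1
    sums = np.zeros((n, points.shape[1]))
    counts = np.zeros(n)
    np.add.at(sums, inv, points)   # accumulate points per occupied voxel
    np.add.at(counts, inv, 1.0)
    return sums / counts[:, None]  # voxel centroids
```

Calling it twice, with a large and a small voxel size, gives the two sampling levels on which coarse-scale and fine-scale features are then extracted.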
Further, the feature enhancement is performed on the coarse-scale features of the super-point through dense feature interaction and focusing attention feature interaction, specifically:
A super-point focus-attention transformer is adopted: first, a geometric transformer encodes global information within and between the super points through dense feature interaction; then a super-point focus-attention module performs focus-attention feature interaction, selecting key points from the neighborhood of each super point and interacting with them to enhance the discriminability and consistency of the super-point features.
Further, the key points are selected from the neighborhood of a super point using two indexes:
computing the feature similarity between the super point and each of its neighboring points;
computing the saliency of the similarity between any two points in feature space.
Further, the performing the super-point matching under the dual constraint of the feature space and the geometric space to obtain the super-point corresponding relationship specifically includes:
obtaining matching scores by a dual normalization operation and establishing hypothesized correspondences;
constructing a set of nearest virtual super-point pairs for the hypothesized correspondences in geometric space;
and performing matching detection on the virtual super-point pairs in feature space, screening the hypothesized correspondences based on the matching scores to obtain the final super-point correspondences.
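The dual normalization step can be read as a dual-softmax over the feature-similarity matrix, with hypothesized correspondences taken from the highest scores. A small sketch under that assumption; function names are illustrative, and the geometric-space virtual-pair check that follows is omitted:

```python
import numpy as np

def dual_normalized_scores(feat_p, feat_q):
    """Dual-normalized matching scores: similarity normalized along both
    rows and columns, so a match must dominate in both directions."""
    sim = np.exp(feat_p @ feat_q.T)
    row = sim / sim.sum(axis=1, keepdims=True)
    col = sim / sim.sum(axis=0, keepdims=True)
    return row * col

def topk_correspondences(scores, k):
    """Hypothesized correspondences: the k highest-scoring (i, j) pairs."""
    idx = np.argsort(scores, axis=None)[::-1][:k]
    return np.stack(np.unravel_index(idx, scores.shape), axis=1)
```

The resulting pairs would then be screened by the geometric-space consistency check before fine matching.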
Further, the specific calculation steps of the correspondence relation of the fine points are as follows:
Performing point focusing attention feature interaction on the fine points and the corresponding fine scale features by using a point focusing attention module, enhancing the discrimination capability of the features under the fine scale and obtaining enhanced fine scale features;
and computing the similarity between the fine-scale features, and obtaining the fine-point correspondences by matching fine points based on this similarity.
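Similarity-based fine-point matching is often realized as mutual nearest neighbours in feature space; a sketch under that assumption, since the patent's exact matching rule is not reproduced here:

```python
import numpy as np

def mutual_matches(feat_p, feat_q):
    """Fine-point correspondences as mutual nearest neighbours:
    keep (i, j) only if j is i's best match and i is j's best match."""
    sim = feat_p @ feat_q.T
    nn_pq = sim.argmax(axis=1)   # best match in Q for each point of P
    nn_qp = sim.argmax(axis=0)   # best match in P for each point of Q
    return [(i, j) for i, j in enumerate(nn_pq) if nn_qp[j] == i]
```

In the coarse-to-fine setting this would be applied separately within each matched super-point pair, so only fine points belonging to corresponding super points can be matched.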
A second aspect of the invention provides a point cloud registration system based on dual-layer focus-attention feature interactions.
The point cloud registration system based on double-layer focusing-attention characteristic interaction comprises a characteristic extraction module, a super-point matching module, a fine point matching module and a transformation calculation module:
A feature extraction module configured to: sampling and extracting local features of two point clouds to be registered on a coarse scale and a fine scale respectively to obtain super points and corresponding coarse scale features, fine points and corresponding fine scale features;
The super point matching module is configured to: performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points;
A fine point matching module configured to: focusing attention feature interaction and feature similarity calculation are carried out on fine scale features of the fine points based on the corresponding relation of the super points, so that the corresponding relation of the fine points is obtained;
a transform computation module configured to: calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between two point clouds from the candidate transformation;
Wherein the dual-layer focus-attention feature interactions include a focus attention feature interaction of coarse-scale features and a focus attention feature interaction of fine-scale features.
A third aspect of the invention provides a computer readable storage medium having stored thereon a program which when executed by a processor performs the steps in a point cloud registration method based on dual layer focus-attention feature interaction according to the first aspect of the invention.
A fourth aspect of the invention provides an electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in a point cloud registration method based on dual layer focus-attention feature interaction according to the first aspect of the invention when the program is executed.
The one or more of the above technical solutions have the following beneficial effects:
To solve the problem in the prior art that dense feature interaction damages geometric consistency and discriminability and degrades registration performance, the invention proposes a framework named DFAT that enhances features at both a coarse and a fine scale using double-layer focus-attention feature interaction: a super-point focus-attention module performs focus-attention feature interaction at the coarse scale, and a point focus-attention module performs it at the fine scale, finally achieving more reliable coarse matching and fine correspondences.
The invention provides a double-space consistency module, which fully utilizes geometric consistency to improve the quality of super point matching.
The invention introduces a linear attention-based module to optimize the fine-point features for better fine correspondence.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a comparison of the prior-art method GeoTransformer and the method of this embodiment.
Fig. 2 is an explanatory diagram of lack of discrimination of features.
Fig. 3 is a flow chart of a method of the first embodiment.
Fig. 4 is a block diagram of the geometric transformer of the first embodiment.
Fig. 5 is a schematic diagram of a dual spatial consistency matching of the first embodiment.
Detailed Description
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Comparing the prior-art method GeoTransformer with the method of this embodiment, fig. 1 visualizes the final registration results of GeoTransformer and of the invention: (a) the two point clouds to be registered; (b) the registration result of GeoTransformer; (c) the registration result of the invention; (d) red lines mark correspondences with registration error in GeoTransformer and green lines those of the invention; (e) the potentially wrong correspondences between the points of a single super-point pair in GeoTransformer. From (b), (c) and (d) it can be seen that the super points aggregate the information of all similar regions, so the generated correspondences lack distinctiveness and geometric consistency. This phenomenon is illustrated in fig. 2, where the crosses represent similar regions and the background color represents the features of the current region: dense interaction causes excessive information exchange between the query super point (cyan) and the super points (red) in repeated regions, so the generated features approach the average of the super-point features; that is, under dense interaction the correct interactions are diluted. In addition, some noise is also taken into the interaction, further limiting the quality of the generated features. It can also be observed that GeoTransformer ignores feature interactions between the fine points, propagating some accurate super-point correspondences into incorrect point correspondences; as shown in fig. 1 (e), relying solely on super-point correspondences and local features to construct point correspondences produces a large number of errors.
To solve the problem that the prior art cannot achieve effective feature aggregation at the coarse and fine scales during feature interaction, the invention proposes a brand-new registration framework, the double focus-attention feature interaction framework, named Dual Focus-Attention Transformer (DFAT), which performs feature interaction only with the points related to the current point and avoids interaction with irrelevant points. At the coarse scale, a super-point focus-attention transformer guided by sparse key points selected from the neighborhood of each super point is introduced; since it avoids redundant interactions from repeatable regions and noise, more discriminative features are obtained and the coarse correspondences become more reliable. At the fine scale, feature interaction is performed within the point set belonging to the same super point, enhancing the discriminability of the features at the finer scale. Through these reasonable feature interactions at the coarse and fine scales, more robust registration is achieved; extensive experiments on 3DMatch, KITTI and the augmented ICL-NUIM show that the method of this embodiment achieves new state-of-the-art performance on all of them.
Example 1
In one embodiment of the present disclosure, a point cloud registration method based on dual-focus-attention feature interaction is provided, based on a newly proposed dual-focus-attention feature interaction framework, named DFAT, to re-aggregate basic context features (i.e., local features) from two levels of super-points (i.e., coarse scale) and fine points (i.e., fine scale), enhancing the discriminativity and local consistency of the features, as shown in fig. 3, comprising the steps of:
Step S1: and respectively sampling and locally extracting the two point clouds to be registered on the coarse scale and the fine scale to obtain the super point and the corresponding coarse scale features, the fine point and the corresponding fine scale features.
This step mainly downsamples the input point clouds to obtain the super points, upsamples them to obtain the fine points, and extracts the d-dimensional local context features corresponding to both.
Specifically, the backbone network first performs sampling and feature extraction at different scales on the input point clouds {P, Q} to be registered; the output downsampled points are called super points, with their corresponding d-dimensional features. In this embodiment, KPConv is used as the backbone network.
After feature extraction on the downsampled points, the backbone network gradually upsamples them while extracting features; the denser point clouds thus output are called fine points, with a resolution 1/2 that of P and Q, and their corresponding d-dimensional features are extracted as well.
Step S2: and performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points.
Specifically, these functions are realized by a super-point focus-attention transformer and a dual-space consistency module. Context information is encoded by performing dense feature interaction and focus-attention feature interaction in sequence, making the features more discriminative. After the super-point focus-attention transformer is repeated Ls times, the dual-space consistency matching module achieves accurate super-point matching under the dual constraints of feature space and geometric space, yielding reliable super-point-level correspondences. The two components are described separately below.
1. Super-point focus-attention transformer
Since global context information plays a critical role in point cloud registration, previous works such as Predator and GeoTransformer use graph networks or transformers to encode the dependencies within and between point clouds; however, their dense interactions often lead to imperfect performance, as analyzed in fig. 2. To solve the problem that dense point interactions interfere with feature discrimination and introduce erroneous and redundant information exchange, this embodiment proposes a super-point focus-attention transformer: a super-point focus-attention module is cascaded after a dense feature interaction module to enhance local consistency and discriminability, introducing a new inductive bias, namely sparse key-point interaction.
The process of the super-point focus-attention transformer can be summarized as follows: first, a geometric transformer encodes global information within and between the super points through dense feature interaction; then a super-point focus-attention module performs focus-attention feature interaction, selecting key points from the neighborhood of each super point and interacting with them to enhance the discriminability and consistency of the super-point features. The specific steps are as follows:
(1) Dense feature interaction module
Given the input super points and their features, the adopted geometric transformer first encodes global information within and between the super points through dense feature interaction.
The geometry transformer is shown in fig. 4 and includes geometry embedding, geometry self-attention modules and cross-attention modules.
Geometric structure embedding mainly uses distances and angles computed from the super points to encode their position information; these distances and angles are consistent across different point clouds of the same scene. The geometric structure embedding contains two parts: an embedding based on pairwise distances and an embedding based on triplet angles.
Embedding based on pairwise distances: first the distance $\rho_{i,j} = \| \hat{p}_i - \hat{p}_j \|$ between two super points $\hat{p}_i$ and $\hat{p}_j$ is computed, then transformed with sinusoidal functions, where a hyperparameter $\sigma_d$ is introduced to adjust the sensitivity to distance changes. The specific formula is:

$r^{D}_{i,j,2x} = \sin\!\big(\rho_{i,j} / (\sigma_d \cdot 10000^{2x/d_t})\big), \quad r^{D}_{i,j,2x+1} = \cos\!\big(\rho_{i,j} / (\sigma_d \cdot 10000^{2x/d_t})\big)$

where $d_t$ is the feature dimension and $\sigma_d$ is the hyperparameter controlling the sensitivity to distance variations; the resulting embeddings encode the pairwise position information of the super points in the two point clouds.
Embedding based on triplet angles: angle embeddings are computed with triplets of super points. First the $k$ nearest neighbors $K_i$ of $\hat{p}_i$ are selected; for each $\hat{p}_x \in K_i$ the angle $\alpha^{x}_{i,j} = \angle(\hat{p}_x - \hat{p}_i,\; \hat{p}_j - \hat{p}_i)$ is computed, and a sinusoidal function on $\alpha^{x}_{i,j} / \sigma_a$ gives the triplet angle embedding $r^{A}_{i,j,x}$, where $\sigma_a$ controls the sensitivity to angle changes. Finally, the geometric structure embedding $r_{i,j}$ is computed by aggregating the pairwise distance embedding and the triplet angle embeddings:

$r_{i,j} = r^{D}_{i,j} W^{D} + \max_{x} \big\{ r^{A}_{i,j,x} W^{A} \big\}$

where $W^{D}, W^{A} \in \mathbb{R}^{d_t \times d_t}$ are the projection matrices of the two embedding types; max pooling is used here to improve robustness against the varying nearest neighbors of a super point caused by self-occlusion.
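The sinusoidal pairwise distance embedding described above can be sketched as follows, with `d_t` the embedding dimension and `sigma_d` the distance-sensitivity hyperparameter; the values and the exact interleaving of sine and cosine channels are illustrative assumptions:

```python
import numpy as np

def pairwise_distance_embedding(points, d_t=8, sigma_d=0.2):
    """Sinusoidal embedding of all pairwise super-point distances."""
    rho = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    x = np.arange(d_t // 2)
    div = 10000.0 ** (2 * x / d_t)           # frequency schedule per channel pair
    arg = rho[..., None] / sigma_d / div     # (n, n, d_t/2)
    emb = np.empty(rho.shape + (d_t,))
    emb[..., 0::2] = np.sin(arg)             # even channels: sine
    emb[..., 1::2] = np.cos(arg)             # odd channels: cosine
    return emb
```

The output is an (n, n, d_t) tensor that is symmetric in its first two axes, matching the fact that the distance between a super-point pair does not depend on order.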
The geometric self-attention module and the cross-attention module encode transformation-invariant relative position information within each single point cloud and between the two point clouds, respectively, finally producing features carrying global context information, i.e., the global features of P and Q.
Specifically, the geometric self-attention module learns the global correlation of the super-point of each point cloud in the feature space and the geometric space.
The cross-attention module exchanges features between the two input point clouds (here, the super points of each); based on it, the geometric consistency of the two point clouds can be modeled, so that the generated mixed features are insensitive to the unknown transformation and the obtained correspondences are robust.
(2) Super-point focusing attention module
Because the global features still contain redundant interaction information, sparse key-point interaction is used to improve the discriminability of the features and thereby solve the above problem of redundant information exchange.
Therefore, on the basis of super-point features with a global receptive field, the super-point focus-attention module performs focus-attention feature interaction, narrowing the range over which super-point features are aggregated so that local consistency receives more attention while global discriminability is maintained. The specific method is as follows:
First, the attention range of each super point is limited to its neighboring points, computed in the same way as in KPConv; because the neighborhoods of a super-point pair in the two point clouds do not completely overlap, top-k key points are selected from the neighboring points to keep the aggregated super-point features consistent.
Two indices are used as criteria for selecting key points:
One of the indexes is to calculate the super point With a plurality of adjacent points thereofFeature similarity among the two is named FSSN, and the expression form is as follows:
where the two terms denote the global feature of the i-th super point in point cloud P and the global features of its neighboring points, respectively.
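As an illustration, the FSSN index above can be sketched as a similarity between a super-point feature and each of its neighbor features; the use of cosine similarity and the function names are assumptions for illustration, the patent's exact formula being the expression above:

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def fssn_scores(super_feat, neighbor_feats):
    # FSSN sketch: similarity of one super point's global feature
    # to the global feature of each of its neighboring points.
    return [cosine_sim(super_feat, f) for f in neighbor_feats]

# An identical neighbor scores 1, an orthogonal one scores 0.
scores = fssn_scores([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```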
The second index computes the saliency of the similarity between any two points in the feature space; specifically, the features of the two point clouds are combined into a unified whole, and the similarity is calculated with the following formula:
where the two terms represent the global features of P and Q, respectively.
Finally, the similarity matrix within the neighborhood range is multiplied element-wise by the saliency matrix to obtain the importance score of each neighboring point.
The top-k neighboring points with the highest scores are selected as the sparse key points.
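A minimal sketch of this key-point selection step, assuming the importance score is the element-wise product of the per-neighbor similarity and saliency values (function name hypothetical):

```python
def top_k_key_points(similarity, saliency, k):
    # Importance score of each neighboring point: similarity * saliency.
    scores = [s * g for s, g in zip(similarity, saliency)]
    # Indices of the k highest-scoring neighbors become the sparse key points.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Scores are 0.45, 0.20, 0.40, so neighbors 0 and 2 are kept.
kept = top_k_key_points([0.9, 0.2, 0.5], [0.5, 1.0, 0.8], 2)
```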
Standard attention is then used to realize the interaction between the super points and the sparse key points, outputting the enhanced coarse-scale features.
Specifically, the query Q is extracted from the super point, and the corresponding key K and value V are extracted from the sparse key points in its neighborhood; a standard attention mechanism aggregates the neighborhood information of the current super point, thereby acquiring global information and realizing the interaction between the super point and the sparse key points. The super points of the other point cloud are processed in the same way.
2. Dual-space consistency matching Module (Dual space consistency Module)
The general rough-fine registration method establishes the corresponding relation of the super points through dual normalization operation, but due to partial overlapping and structural repetition, error matching is inevitably generated, and the difficulty of subsequent fine registration is increased; in order to filter the wrong corresponding relation in the early stage, the dual-space consistency module of the embodiment fully utilizes the geometric consistency of the super-point matching stage, filters the unreliable super-point corresponding relation according to the local geometric consistency, and provides more reliable initialization for the follow-up fine matching; the geometric consistency of the super-point matching stage here is that if the assumed super-point correspondence is correct, their neighbors in the geometric space will also exhibit high feature similarity, specifically, as shown in fig. 4, the neighboring points corresponding to the super-points following the local consistency tend to be more reliable.
Given enhanced coarse-scale featuresThe processing steps of the double-space consistency module are as follows:
First, matching scores are obtained by a dual normalization operation, and a series of hypothesized correspondences is established, each pairing the i-th super point of one point cloud with the j-th super point of the other.
The purpose of the dual normalization operation is to suppress ambiguous or fuzzy matches and thereby eliminate erroneous matches in advance; the specific formula is as follows:
where the normalized quantity is the Gaussian correlation matrix obtained from the two point clouds.
From the matching scores obtained by the dual normalization operation, the Nc entries with the highest scores are selected as super-point correspondences, where Nc is a specified number of entries.
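The dual normalization step can be sketched as a dual-softmax, the row-wise softmax multiplied by the column-wise softmax of the score matrix, which suppresses entries that are ambiguous in either direction. This is a sketch under that assumption; the patent's exact Gaussian-correlation formulation is the formula above.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    t = sum(e)
    return [v / t for v in e]

def dual_normalize(S):
    # Row softmax times column softmax: a match must dominate
    # both its row and its column to keep a high score.
    rows = [softmax(r) for r in S]
    cols = [softmax(c) for c in zip(*S)]
    return [[rows[i][j] * cols[j][i] for j in range(len(S[0]))]
            for i in range(len(S))]

# A sharply diagonal score matrix keeps its diagonal matches.
scores = dual_normalize([[10.0, 0.0], [0.0, 10.0]])
```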
Subsequently, a set of nearest virtual super-point pairs is built in geometric space for the above super-point correspondences; that is, based on the correspondences obtained from the matching scores, the nearest virtual super-point pairs are constructed for further screening of these corresponding point pairs.
These virtual correspondences are then examined in the feature space (shown as broken lines in FIG. 5); specifically, for each super point in a correspondence, the nearest super point is searched for in each of the two point clouds.
A virtual correspondence is thereby established; if its matching score is greater than a threshold, the original correspondence is retained, otherwise it is discarded. The retained correspondences serve as the final super-point correspondences.
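The dual-space consistency check can be sketched as follows: for each hypothesized pair, take the geometrically nearest super point on each side to form a virtual pair, and keep the original pair only if the virtual pair's feature similarity clears a threshold. Function names and the 2-D coordinates are illustrative assumptions.

```python
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_other(idx, pts):
    # Index of the point geometrically closest to pts[idx], excluding itself.
    others = [k for k in range(len(pts)) if k != idx]
    return min(others, key=lambda k: dist2(pts[idx], pts[k]))

def dual_space_filter(pairs, P_xyz, Q_xyz, sim, tau):
    # Keep (i, j) only if the nearest virtual pair (i2, j2) is also
    # similar in feature space: the local-consistency assumption.
    kept = []
    for i, j in pairs:
        i2, j2 = nearest_other(i, P_xyz), nearest_other(j, Q_xyz)
        if sim[i2][j2] > tau:
            kept.append((i, j))
    return kept

# The pair (2, 1) is rejected: its virtual neighbors do not match.
sim = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
Q = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
kept = dual_space_filter([(0, 0), (2, 1)], P, Q, sim, 0.5)
```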
Step S3: and carrying out focusing attention feature interaction and feature similarity calculation on the fine scale features of the fine points based on the corresponding relation of the super points to obtain the corresponding relation of the fine points.
In general, fine matching propagates the super-point correspondences to generate dense correspondences and multiple candidate transformations, and this step follows a similar procedure. Slightly differing from the point matching module in GeoTransformer, this step does not directly input the discriminative features learned by the backbone network (KPConv) into the Sinkhorn & Selection layer to obtain fine correspondences. This is because, as shown in FIG. 1(d), even if a super-point correspondence is correct, the point features may produce erroneous correspondences due to a lack of local context information. Instead, this embodiment performs an attention operation between the points near each super-point correspondence to encode local context information.
The points near a super point are defined as follows: each super point has a spherical neighborhood of a specified radius, and the fine points within this neighborhood (capped at a specified number) participate in the attention that encodes local context information, as follows:
where the three terms define the queries, keys, and values of the series of points belonging to the same super point, and φ(·) denotes the ReLU activation function.
Therefore, at the point level, a module based on linear attention is introduced as the point focusing attention module; point-focusing attention feature interaction is performed on the fine points and their corresponding fine-scale features, enhancing the discriminative ability of the features at the fine scale and yielding the enhanced fine-scale features, which further improves the accuracy of fine matching. Based on accurate correspondences at both the super-point and point levels, DFAT can realize stable point cloud registration.
Specifically, the features of the fine points are input into the point focusing attention module to enhance the discriminative ability of the features at finer scales; the point focusing attention module is repeated a specified number of times. Linear attention is used as the point focusing attention module to avoid excessive computational complexity, formulated as:
where the three terms define the queries, keys, and values of a series of points belonging to the same super point, and φ(·) denotes the ReLU activation function; the series of points of a super point are the fine points in its neighborhood, the neighborhood of each super point being a spherical region of a specified radius.
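The linear attention described above, with a ReLU feature map φ, can be sketched in pure Python. The key point is that φ(K)ᵀV (a d×d matrix) is formed before multiplying by φ(Q), so the cost grows linearly with the number of points; this is a generic linear-attention sketch, not the patent's exact parameterization.

```python
def relu(v):
    return [max(x, 0.0) for x in v]

def matmul(A, B):
    # Naive matrix product of two lists-of-lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def linear_attention(Q, K, V, eps=1e-8):
    # phi = ReLU feature map; phi(K)^T V is d x d, so the sequence
    # length only enters linearly.
    phiQ = [relu(r) for r in Q]
    phiK = [relu(r) for r in K]
    KtV = matmul([list(c) for c in zip(*phiK)], V)   # d x d
    num = matmul(phiQ, KtV)                          # n x d
    Ksum = [sum(c) for c in zip(*phiK)]              # column sums of phi(K)
    den = [sum(q * k for q, k in zip(row, Ksum)) + eps for row in phiQ]
    return [[x / d for x in row] for row, d in zip(num, den)]

# One-hot query attends only to the first key's value.
out = linear_attention([[1.0, 0.0]],
                       [[1.0, 0.0], [0.0, 1.0]],
                       [[2.0, 0.0], [0.0, 4.0]])
```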
Based on the enhanced features of the point focus attention module, the method comprises the following steps:
(1) Calculating the similarity between the features to obtain a similarity matrix;
The idea is to construct a similarity matrix from the fine-point features enhanced by the point focusing attention module, using the following formula:
where the first two terms represent the enhanced point features in point clouds P and Q, and the scaling term denotes the feature length; the similarity matrix must satisfy two constraints (its number of rows equals the number of super points of point cloud P, and its number of columns equals the number of super points of point cloud Q), which are the conditions required by the Sinkhorn algorithm.
(2) Optimizing the similarity matrix using Sinkhorn algorithm and outputting the assignment matrix;
The similarity matrix obtained in the previous step is passed through the Sinkhorn algorithm to obtain the assignment matrix (also referred to as the allocation matrix).
(3) Selecting m corresponding points from the assignment matrix to form the point correspondence set; the selection criterion is the magnitude of the elements in the assignment matrix.
Using the assignment matrix as the confidence matrix of candidate matches, point correspondences are extracted through mutual top-m selection: a point match is selected if it lies among the m largest entries of both its row and its column. The specific selection formula is as follows:
where the first two terms represent the local features of corresponding points in point cloud P and point cloud Q, followed by the correspondence between the two dense point clouds, the global set of dense correspondences, and the assignment (confidence) matrix.
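Steps (2) and (3) can be sketched together: a few Sinkhorn iterations push the similarity matrix toward a doubly stochastic assignment matrix, after which mutual selection keeps a match only if it dominates both its row and its column. Shown here as mutual top-1 for brevity; the patent uses mutual top-m.

```python
import math

def sinkhorn(S, iters=20):
    # Alternate row and column normalization of exp(S); converges
    # toward a doubly stochastic assignment matrix.
    A = [[math.exp(x) for x in row] for row in S]
    for _ in range(iters):
        A = [[x / sum(row) for x in row] for row in A]       # rows
        cols = [[x / sum(c) for x in c] for c in zip(*A)]    # columns
        A = [list(r) for r in zip(*cols)]
    return A

def mutual_top1(A):
    # Keep (i, j) only if A[i][j] is the maximum of row i and column j.
    cols = list(zip(*A))
    pairs = []
    for i, row in enumerate(A):
        j = max(range(len(row)), key=row.__getitem__)
        if max(range(len(cols[j])), key=cols[j].__getitem__) == i:
            pairs.append((i, j))
    return pairs

pairs = mutual_top1(sinkhorn([[5.0, 0.0], [0.0, 5.0]]))
```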
(4) Combining all fine correspondences into the final dense correspondence set, where the count runs over the number of super-point correspondences.
The point correspondences computed from each super-point match are aggregated to form the final global dense point correspondence set, whose index runs over the number of corresponding super points.
After this process, the registration point pair with the highest similarity between the two point clouds is obtained.
Step S4: and calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between the two point clouds from the candidate transformation.
To avoid slow convergence and instability of RANSAC iterations, a local-to-global registration scheme is used to estimate the final output transform, in particular:
First, the correspondence C of the fine points is given as a collection of fine correspondence sets; that is, the fine-point correspondence is formed by gathering the individual fine correspondence sets together.
Next, in the local phase, a candidate transformation is computed for each input correspondence set by weighted SVD, specifically:
This formula solves, for each correspondence set, the rotation matrix Ri and translation vector ti that best fit the set; the obtained Ri and ti are then substituted into the following formula for hypothesis testing.
Finally, in the global stage, the inlier count of each candidate transformation is computed, and the transformation with the largest inlier count is selected as the final transformation.
The following formula computes the inlier count and selects the transformation with the largest count as the final transformation:
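A hedged sketch of the local-to-global scheme: each correspondence set yields a candidate rigid transform via a weighted closed-form (Kabsch-style) fit, and the candidate with the most inliers wins. The patent uses weighted SVD in 3-D; the 2-D closed form below is an illustrative simplification.

```python
import math

def weighted_rigid_2d(P, Q, w):
    # 2-D weighted Kabsch: rotation angle from weighted cross/dot sums
    # of the centered points, translation from the weighted centroids.
    sw = sum(w)
    pc = [sum(wi * p[k] for wi, p in zip(w, P)) / sw for k in range(2)]
    qc = [sum(wi * q[k] for wi, q in zip(w, Q)) / sw for k in range(2)]
    a = b = 0.0
    for wi, p, q in zip(w, P, Q):
        px, py = p[0] - pc[0], p[1] - pc[1]
        qx, qy = q[0] - qc[0], q[1] - qc[1]
        a += wi * (px * qy - py * qx)   # cross term
        b += wi * (px * qx + py * qy)   # dot term
    th = math.atan2(a, b)
    R = [[math.cos(th), -math.sin(th)], [math.sin(th), math.cos(th)]]
    t = [qc[0] - R[0][0] * pc[0] - R[0][1] * pc[1],
         qc[1] - R[1][0] * pc[0] - R[1][1] * pc[1]]
    return R, t

def inlier_count(P, Q, R, t, tau):
    # Number of correspondences whose residual under (R, t) is below tau.
    n = 0
    for p, q in zip(P, Q):
        x = R[0][0] * p[0] + R[0][1] * p[1] + t[0]
        y = R[1][0] * p[0] + R[1][1] * p[1] + t[1]
        if math.hypot(x - q[0], y - q[1]) < tau:
            n += 1
    return n

# Q is P rotated by 90 degrees and shifted by (1, 2); all pairs are inliers.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Q = [(1.0, 2.0), (1.0, 3.0), (0.0, 2.0)]
R, t = weighted_rigid_2d(P, Q, [1.0, 1.0, 1.0])
```

In the global stage one would compute `inlier_count` for every candidate `(R, t)` and keep the maximizer, mirroring the selection formula above.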
A deep analysis of the feature interaction at coarse and fine scales in coarse-to-fine point cloud registration methods shows that the previously used dense coarse-scale feature interaction introduces redundant information, preventing these methods from achieving stable performance. To solve this problem, this embodiment concatenates a super-point focusing attention module with sparse interactions after the dense interactions; through this hybrid feature interaction, more reliable coarse matching can be realized. For fine-scale feature interaction, a point focusing attention module is added to model local context information, thereby avoiding false fine correspondences.
Two experiments were performed for the method of this embodiment. The first experiment used the 3DMatch and 3DLoMatch datasets; the difference between them is that the overlap rate between 3DMatch point cloud pairs exceeds 30%, while the overlap rate of 3DLoMatch ranges from 10% to 30%. The evaluation results on 3DMatch and 3DLoMatch are shown in Table 1:
Table 1 evaluation results of 3DMatch and 3DLoMatch
As shown in Table 1, the method of this embodiment achieves the best performance on 3DMatch and 3DLoMatch, outperforming the previously best competitor by 3.4 pp on 3DMatch and 2.8 pp on 3DLoMatch, verifying its efficacy under both high and low overlap rates. More importantly, by incorporating a new inductive bias, namely sparse feature interaction, the method successfully improves the performance of GeoTransformer, with or without PEAL, demonstrating the importance of avoiding unreasonable feature interaction. Furthermore, in the combined setup with PEAL, the method achieves better registration recall with fewer iterations.
The second experimental dataset is KITTI, one of the best-known outdoor autonomous-driving datasets, commonly used to evaluate point cloud registration performance. Following the official split, scenes 0-5 are used for training, scenes 6-7 for validation, and scenes 8-10 for testing. Because the ground-truth transformations captured by GPS contain a certain error, they are refined with the ICP algorithm, as in most prior works. The KITTI evaluation results are shown in Table 2:
table 2 KITTI evaluation results
As shown in Table 2, the method achieves the best performance on all metrics; by combining the new inductive bias (namely sparse feature interaction) with local feature interaction, the method successfully helps GeoTransformer achieve more accurate registration. The experimental results verify that reasonable feature interaction plays an important role in point cloud registration.
Example two
In one embodiment of the disclosure, a point cloud registration system based on dual-layer focus-attention feature interaction is provided, which comprises a feature extraction module, a super-point matching module, a fine-point matching module and a transformation calculation module:
A feature extraction module configured to: sampling and extracting local features of two point clouds to be registered on a coarse scale and a fine scale respectively to obtain super points and corresponding coarse scale features, fine points and corresponding fine scale features;
The super point matching module is configured to: performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points;
A fine point matching module configured to: focusing attention feature interaction and feature similarity calculation are carried out on fine scale features of the fine points based on the corresponding relation of the super points, so that the corresponding relation of the fine points is obtained;
a transform computation module configured to: calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between two point clouds from the candidate transformation;
wherein the dual-layer focus-attention feature interactions include a focused attention feature interaction of coarse-scale features and a focused attention feature interaction of fine-scale features.
Example III
An object of the present embodiment is to provide a computer-readable storage medium.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in a point cloud registration method based on dual layer focus-attention feature interaction as described in embodiment one of the present disclosure.
Example IV
An object of the present embodiment is to provide an electronic apparatus.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in a point cloud registration method based on dual focal point-attention feature interaction as described in embodiment one of the present disclosure when the program is executed.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. The point cloud registration method based on double-layer focusing-attention characteristic interaction is characterized by comprising the following steps of:
Sampling and extracting local features of two point clouds to be registered on a coarse scale and a fine scale respectively to obtain super points and corresponding coarse scale features, fine points and corresponding fine scale features;
Performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points;
focusing attention feature interaction and feature similarity calculation are carried out on fine scale features of the fine points based on the corresponding relation of the super points, so that the corresponding relation of the fine points is obtained;
calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between two point clouds from the candidate transformation;
Wherein the dual-layer focus-attention feature interactions include a focus attention feature interaction of coarse-scale features and a focus attention feature interaction of fine-scale features.
2. The point cloud registration method based on dual-layer focusing-attention feature interaction according to claim 1, wherein the super-point and the corresponding coarse-scale feature are specifically:
And (3) carrying out downsampling and local feature extraction on the input point cloud { P, Q } by using a backbone network to obtain super points and corresponding coarse scale features.
3. The point cloud registration method based on dual-layer focusing-attention feature interaction according to claim 1, wherein the fine points and the corresponding fine scale features are specifically:
The main network gradually upsamples the downsampling points to obtain fine points, and extracts local features for the fine points to obtain fine scale features.
4. The point cloud registration method based on double-layer focusing-attention feature interaction according to claim 1, wherein the feature enhancement is performed on the coarse-scale features of the super-points through dense feature interaction and focusing attention feature interaction, specifically:
The method comprises the steps of adopting a super-point focusing attention converter, firstly utilizing a geometric converter to encode global information between the interior of a super-point and the super-point through dense feature interaction, then utilizing a super-point focusing attention module to conduct focusing attention feature interaction, selecting key points from the neighborhood of the super-point, and then enhancing feature discernability and feature consistency of the super-point through interaction with the key points.
5. The method for point cloud registration based on dual-layer focus-attention feature interaction of claim 4, wherein the selecting of key points from the neighborhood of the super points uses two criteria:
computing the feature similarity between a super point and its neighboring points;
and calculating the significance of the similarity between any two points in the feature space.
6. The point cloud registration method based on double-layer focusing-attention feature interaction according to claim 1, wherein the performing the super-point matching under the double constraint of the feature space and the geometric space to obtain the super-point corresponding relation is specifically as follows:
obtaining matching scores by utilizing dual normalization operation and establishing an assumed corresponding relation;
constructing a group of nearest virtual super-point pairs for the assumed corresponding relation in the geometric space;
and carrying out matching detection on the virtual super-point pairs in the feature space, and screening the assumed corresponding relation based on the matching score to obtain the final super-point corresponding relation.
7. The point cloud registration method based on double-layer focusing-attention feature interaction as claimed in claim 1, wherein the fine point correspondence is calculated by the following steps:
Performing point focusing attention feature interaction on the fine points and the corresponding fine scale features by using a point focusing attention module, enhancing the discrimination capability of the features under the fine scale and obtaining enhanced fine scale features;
and calculating the similarity between the fine-scale features, and obtaining the correspondence of the fine points based on similarity matching of the fine points.
8. The point cloud registration system based on double-layer focusing-attention characteristic interaction is characterized by comprising a characteristic extraction module, a super-point matching module, a fine point matching module and a transformation calculation module:
A feature extraction module configured to: sampling and extracting local features of two point clouds to be registered on a coarse scale and a fine scale respectively to obtain super points and corresponding coarse scale features, fine points and corresponding fine scale features;
The super point matching module is configured to: performing feature enhancement on the coarse-scale features of the super points through dense feature interaction and focusing attention feature interaction, and performing super-point matching under double constraint of feature space and geometric space based on the enhanced coarse-scale features to obtain the corresponding relation of the super points;
A fine point matching module configured to: focusing attention feature interaction and feature similarity calculation are carried out on fine scale features of the fine points based on the corresponding relation of the super points, so that the corresponding relation of the fine points is obtained;
a transform computation module configured to: calculating candidate transformation according to the correspondence of the fine points, and selecting final transformation between two point clouds from the candidate transformation;
Wherein the dual-layer focus-attention feature interactions include a focus attention feature interaction of coarse-scale features and a focus attention feature interaction of fine-scale features.
9. An electronic device, comprising:
a memory for non-transitory storage of computer readable instructions;
A processor for executing the computer readable instructions;
wherein the computer readable instructions, when executed by the processor, perform the method of any of the preceding claims 1-7.
10. A storage medium, characterized by non-transitory storing computer readable instructions, wherein the computer readable instructions, when executed by a computer, perform the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410591494.9A CN118172398B (en) | 2024-05-14 | 2024-05-14 | Point cloud registration method and system based on double-layer focusing-attention characteristic interaction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118172398A CN118172398A (en) | 2024-06-11 |
CN118172398B true CN118172398B (en) | 2024-07-26 |
Family
ID=91358734
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410591494.9A Active CN118172398B (en) | 2024-05-14 | 2024-05-14 | Point cloud registration method and system based on double-layer focusing-attention characteristic interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118172398B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439694A (en) * | 2022-09-19 | 2022-12-06 | 南京邮电大学 | High-precision point cloud completion method and device based on deep learning |
CN117392424A (en) * | 2022-06-29 | 2024-01-12 | 上海理工大学 | Three-dimensional point cloud classification method based on multi-geometric double-edge attention network |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112966696B (en) * | 2021-02-05 | 2023-10-27 | 中国科学院深圳先进技术研究院 | Method, device, equipment and storage medium for processing three-dimensional point cloud |
CN117670955A (en) * | 2023-12-07 | 2024-03-08 | 北京工业大学 | Cross-modal point cloud fine registration method applied to biped humanoid robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||