CN113486988B - Point cloud completion device and method based on adaptive self-attention transformation network

Info

Publication number: CN113486988B (application CN202110890669.2A)
Authority: CN (China)
Prior art keywords: point cloud, convolution, point, sampling, vector
Priority date: 2021-08-04
Legal status: Active (granted)
Application number: CN202110890669.2A
Other languages: Chinese (zh)
Other versions: CN113486988A
Inventors: 高子淇 (Gao Ziqi), 刘文印 (Liu Wenyin), 陈俊洪 (Chen Junhong), 梁达勇 (Liang Dayong)
Current assignee: Guangdong University of Technology
Original assignee: Guangdong University of Technology
Filing date: 2021-08-04
Publication date of CN113486988A: 2021-10-08
Publication date of CN113486988B (grant): 2022-02-15
Application filed by Guangdong University of Technology, with priority to CN202110890669.2A

Classifications

    • G: Physics
    • G06: Computing; calculating or counting
    • G06F: Electric digital data processing
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G06F18/23: Clustering techniques
    • G06N: Computing arrangements based on specific computational models
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The invention discloses a point cloud completion device and method based on an adaptive self-attention transformation network. The device comprises: a point cloud sampling module for performing point cloud sampling twice to obtain two layers of fused point cloud spatial features; an adaptive self-attention transformation module for performing adaptive feature fusion on the point cloud spatial features; and a completion module for performing point cloud completion from the output of the adaptive self-attention transformation module. The technical scheme of the invention keeps the computation lightweight while preserving the effectiveness of multi-feature fusion.

Description

Point cloud completion device and method based on adaptive self-attention transformation network
Technical Field
The invention belongs to the technical field of deep learning, and in particular relates to a point cloud completion device and method based on feature adaptivity and a self-attention transformation network.
Background
With the wide application of three-dimensional vision in robotics, autonomous driving and augmented reality, point clouds have become a widely used data representation that is both compact and fine-grained. Point cloud data are typically acquired with a lidar, a binocular stereo camera or a low-resolution RGB-D sensor. However, owing to environmental factors, the acquired point clouds are usually incomplete, so completing partial point clouds has become an important task. Under limited hardware conditions, deep-learning-based point cloud repair and completion is therefore the key to, and the basis of, subsequent point-cloud-related tasks. A point cloud is a massive set of points sampled from a target's surface, describing the surface through the spatial coordinates of each point along the x, y and z axes. During processing, the heavy cost of information computation and feature fusion becomes the central obstacle to real-time point cloud completion. For example, in existing three-dimensional vision a robot's depth camera can only extract the surface features of the part of an object visible from one viewpoint; in autonomous driving, missing points in the visually reconstructed point clouds of surrounding objects prevent accurate estimation of their sizes, volumes and relative distances. In short, existing point cloud completion networks cannot complete, in real time, the missing surface features of objects captured by a robot's camera.
Disclosure of Invention
The invention provides a point cloud completion device and method based on an adaptive self-attention transformation network. A simple and efficient self-attention transformation network is used for point cloud completion, and, to keep the network flexible, adaptive networks with different operators are designed for different features; combining the two keeps the computation lightweight while preserving the effectiveness of multi-feature fusion.
In order to achieve the purpose, the invention adopts the following technical scheme:
a point cloud completion device based on an adaptive attention transformation network comprises:
the point cloud sampling module is used for carrying out point cloud sampling twice to obtain two layers of fused point cloud spatial information characteristics;
the adaptive self-attention transformation module is used for carrying out adaptive feature fusion according to the point cloud spatial information features;
the complementing module is used for complementing cloud points according to the output result of the adaptive self-attention transformation module;
wherein the point cloud refers to a collection of target surface characteristics that represent an apparent surface of an object; the target surface characteristics are object surface characteristics obtained by scanning through a depth camera in the process of robot vision grabbing, or vehicle surface characteristics obtained in the process of scanning surrounding vehicles through the depth camera in an unmanned mode.
Preferably, the point cloud sampling module includes:
the acquisition unit, used for acquiring the relative point cloud coordinates of the object about a central origin;
the mapping unit, used for mapping the N point cloud coordinates to a higher dimension by convolution: the 3-dimensional xyz coordinates of the object are first lifted by a convolution to 64-dimensional vectors, so that each point is represented by one 64-dimensional vector, and these are then lifted by a further convolution to 128-dimensional vectors, yielding global point cloud spatial position information;
the clustering unit, used for obtaining through farthest point sampling the distance from each point to all other points, sorting these distances, selecting with the k-nearest-neighbour algorithm the features of the k points closest to each point, and packing them into a k × 128 matrix to form a group; 1024 groups are obtained the first time the clustering unit is applied and 512 groups the second time;
and the extraction unit, used for fusing features with two identical convolutions to obtain the key information, extracting with a max pooling layer the maximum point cloud spatial feature of each k × 128 matrix over its k rows, and finally forming a 1 × 128 vector.
Preferably, the adaptive self-attention transformation module includes:
the first calculation unit, used for taking the point cloud spatial features extracted by the point cloud sampling module and computing a K vector from the feature information through one convolution layer that reduces the vector dimension to one quarter, computing a Q vector through an identical convolution layer, and computing a V vector through a convolution layer that leaves the dimension unchanged;
the second calculation unit, used for multiplying the Q vector by the K vector to obtain a score vector for each point group, then normalizing the score vectors with a softmax layer into an attention map whose entries sum to 1; the attention map is divided by its column sums for a second normalization and used to weight and sum the V vectors, giving an attention map of fused features; finally, the initial un-fused features are subtracted from the fused features to obtain relative spatial feature information;
the third calculation unit, used for pre-designing n different convolution kernels placed in a convolution pool: the relative spatial feature information obtained by the previous calculation unit is first fed into an independent scoring network, where one convolution layer maps the feature map to n dimensions, generating a weight for each of the n kernels, and a softmax layer normalizes the weights so that they sum to 1; the weight of each kernel obtained from the scoring network is multiplied by that kernel and the results are summed into one complete kernel, with which a final convolution over the relative spatial feature information yields feature-fused spatial information; the point cloud spatial features originally input from the point cloud sampling module are then added to the feature-fused spatial information, giving the complete spatial feature information used for point cloud generation.
Preferably, the completion module uses the fused point cloud features of each layer to generate three segments of completed point cloud through an MLP structure. The sampled features of the first and second point cloud sampling layers serve as the inputs of two layers of adaptive self-attention transformation modules, with final output dimensions of N/4 × 512 and N/2 × 256 respectively. The output derived from the first sampling layer's features through the two layers of adaptive self-attention transformation modules is used for point cloud generation: a convolution reduces the 256 dimensions to 3-dimensional xyz point coordinates, giving the completion of the whole object. The second sampling layer's features yield two outputs through the two layers of adaptive self-attention transformation modules, both used for point cloud generation: a convolution reduces the 512-dimensional spatial coordinates to 3-dimensional xyz point coordinates, giving the completion of local object detail. The three MLP branches generate N/4 × 3, N/4 × 3 and N/2 × 3 points respectively, which are finally fused into the N × 3 output point cloud.
The invention also provides a point cloud completion method based on the self-attention transformation network, which comprises the following steps:
Step S1: perform point cloud sampling twice to obtain two layers of fused point cloud spatial features;
Step S2: perform adaptive feature fusion on the point cloud spatial features;
Step S3: complete the point cloud according to the result of the adaptive feature fusion;
wherein the point cloud refers to a set of target surface characteristics that represents the visible surface of an object; the target surface characteristics are object surface features obtained by depth-camera scanning during robotic visual grasping, or vehicle surface features obtained while scanning surrounding vehicles with a depth camera in unmanned driving.
Preferably, in step S1, the N point cloud coordinates are mapped to a higher dimension by convolution to obtain global spatial position information; the distance from each point to all other points is then obtained by farthest point sampling, and the k points nearest each point are clustered into a group by the k-nearest-neighbour algorithm; the cloud is divided into 1024 groups, then sampled and grouped again into 512 groups; finally, a max pooling layer extracts the key point cloud spatial features.
Preferably, in step S2, the point cloud spatial features are embedded and three convolution layers compute the K, Q and V vectors of each group's spatial features; the Q and K vectors are multiplied to obtain a score vector for each point group, the scores are multiplied with the V vectors to obtain fused features, and the final point cloud convolution operates on the feature difference, with the adaptive network applied to the final convolution layer, specifically as follows: n different convolution kernels are designed in advance and placed in a convolution pool; the grouped spatial features are first fed to a scoring network, where one convolution layer maps the features to n dimensions, producing a weight for each of the n kernels; the weight of each kernel obtained from the scoring network is multiplied by that kernel, and the results are finally summed into one complete kernel used for the convolution.
Preferably, in step S3, three segments of completed point cloud are generated through an MLP structure from the fused point cloud features of each layer; the sampled features of the first and second point cloud sampling layers serve as the inputs of two layers of adaptive self-attention transformation modules, with final output dimensions of N/4 × 512 and N/2 × 256 respectively; the output derived from the first sampling layer's features is used for point cloud generation, a convolution reducing the 256 dimensions to 3-dimensional xyz point coordinates to complete the whole object; and the second sampling layer's features yield two outputs through the two layers of adaptive self-attention transformation modules, both used for point cloud generation, a convolution reducing the 512 dimensions to 3-dimensional xyz point coordinates to complete the local details of the object.
Preferably, in step S3, the three MLP branches generate N/4 × 3, N/4 × 3 and N/2 × 3 points respectively, and the N × 3 point cloud is finally fused and output.
The point cloud completion method and device based on the adaptive self-attention transformation network replace much of the convolution with simple linear matrix operations, reducing network parameters and speeding up computation, while the fusion of adaptive features maintains high completion accuracy.
Drawings
FIG. 1 is a schematic structural diagram of the point cloud completion device based on an adaptive self-attention transformation network;
FIG. 2 is a schematic diagram of the point cloud sampling module;
FIG. 3 is a schematic diagram of the adaptive self-attention transformation module;
FIG. 4 is a schematic diagram of the third calculation unit in the adaptive self-attention transformation module;
FIG. 5 is a schematic structural diagram of the completion module;
FIG. 6 is a flow chart of the point cloud completion method based on feature adaptivity and a self-attention transformation network.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings.
As shown in FIG. 1, the invention provides a point cloud completion device based on feature adaptivity and a self-attention transformation network; its aim is to complete a full point cloud from a partial one. The point cloud refers to a set of target surface characteristics representing the visible surface of an object; the target surface characteristics are object surface features obtained by depth-camera scanning during robotic visual grasping, or vehicle surface features obtained while scanning surrounding vehicles with a depth camera in unmanned driving. The spatial information of the point cloud is represented as [x, y, z], the coordinates of each point in a three-dimensional coordinate system. The device comprises a point cloud sampling module, an adaptive self-attention transformation module and a completion module. The point cloud sampling module samples the point cloud twice to obtain two layers of fused point cloud spatial features; the adaptive self-attention transformation module performs adaptive feature fusion on the point cloud spatial features; and the completion module completes the point cloud from the output of the adaptive self-attention transformation module.
As shown in fig. 2, the point cloud sampling module maps the N point cloud coordinates to a higher dimension by convolution to obtain global spatial position information, then obtains through farthest point sampling the distance from each point to all other points, and clusters the k points nearest each point into a group with the k-nearest-neighbour algorithm. After the cloud has been divided into 1024 groups, sampling and grouping are performed again, giving 512 groups. After each grouping, two layers of convolution are used to fuse the information of the points within each group, and a max pooling layer then extracts the key point cloud feature information, so that each group encodes the spatial features of a neighbourhood of points.
Further, the point cloud sampling module comprises:
the acquisition unit, used for acquiring the relative point cloud coordinates of the object about a central origin, the point cloud being a massive set of points carrying the target's surface characteristics and representing the visible surface of the object;
the mapping unit, used for mapping the N point cloud coordinates to a higher dimension by convolution: the 3-dimensional xyz coordinates of the object are first lifted by a convolution to 64-dimensional vectors, so that each point is represented by one 64-dimensional vector, and these are then lifted by a further convolution to 128-dimensional vectors, at which point global point cloud spatial position information has been obtained;
the clustering unit, which obtains through farthest point sampling the distance from each point to all other points, sorts these distances, selects with the k-nearest-neighbour algorithm the features of the k points closest to each point, and packs them into a k × 128 matrix to form a group; in this way 1024 groups are obtained the first time and 512 groups the second time;
and the extraction unit, which, in order to turn each grouped k × 128 matrix into a 128-dimensional vector after each grouping, fuses features with two identical convolutions to obtain the key information, extracts its maximum over the k rows with a max pooling layer, and finally forms a 1 × 128 vector.
The point cloud sampling module thus samples the point cloud twice to obtain features at different resolutions, and the two fused layers of spatial features are used for the subsequent feature fusion and point cloud generation, as the sketch below illustrates.
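To make the pipeline concrete, here is a minimal PyTorch sketch of the sampling stage under the dimensions given above: a 3 -> 64 -> 128 lift, farthest point sampling, k-nearest-neighbour grouping, two identical fusion convolutions, and max pooling over k. The class and function names, the batch-of-one layout and the choice k = 16 are illustrative assumptions, not part of the patent.

```python
# Hedged sketch of the point cloud sampling module; names and k are assumptions.
import torch
import torch.nn as nn


def farthest_point_sample(xyz: torch.Tensor, m: int) -> torch.Tensor:
    """Greedy farthest point sampling. xyz: (N, 3) -> indices of m centroids."""
    n = xyz.shape[0]
    idx = torch.zeros(m, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = torch.randint(0, n, (1,)).item()
    for i in range(m):
        idx[i] = farthest
        # shrink each point's distance to its nearest chosen centroid
        d = ((xyz - xyz[farthest]) ** 2).sum(dim=1)
        dist = torch.minimum(dist, d)
        farthest = torch.argmax(dist).item()
    return idx


class PointCloudSampler(nn.Module):
    """Lift xyz to 128-D, group the k nearest neighbours of each farthest-point
    centroid, fuse with two identical convolutions, then max-pool over k."""

    def __init__(self, k: int = 16):
        super().__init__()
        self.k = k
        # point-wise convolutions: 3 -> 64 -> 128
        self.lift = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        # two identical convolutions for in-group feature fusion
        self.fuse = nn.Sequential(
            nn.Conv2d(128, 128, 1), nn.ReLU(),
            nn.Conv2d(128, 128, 1), nn.ReLU(),
        )

    def forward(self, xyz: torch.Tensor, n_groups: int) -> torch.Tensor:
        # xyz: (N, 3) coordinates relative to the object's central origin
        feat = self.lift(xyz.t().unsqueeze(0)).squeeze(0).t()    # (N, 128)
        centers = farthest_point_sample(xyz, n_groups)
        d = torch.cdist(xyz[centers], xyz)                       # (m, N)
        knn = d.topk(self.k, largest=False).indices              # (m, k)
        g = feat[knn].permute(2, 0, 1).unsqueeze(0)              # (1, 128, m, k)
        g = self.fuse(g)
        # max over the k rows of each group -> one 128-D vector per group
        return g.max(dim=-1).values.squeeze(0).t()               # (m, 128)
```

Applied twice, e.g. once with 1024 groups and once more with 512 groups, this yields the two resolutions of grouped features that the later modules consume.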
As shown in fig. 3, the adaptive self-attention transformation module applies the self-attention transformation network from natural language processing to point cloud completion, and the feature-adaptive module makes up for the transformation network's weakness in feature fusion. By embedding the point cloud spatial features, three convolution layers compute the K, Q and V vectors of each group's spatial features. The Q and K vectors are multiplied to obtain a score vector for each point group, and the scores are multiplied with the V vectors to obtain fused features, so that each group's features absorb the information of the other groups' spatial features; a final point cloud convolution then operates on the feature difference. Because the small computational footprint of pure matrix operations inevitably limits how thoroughly the data are processed, and in order to give different kinds of point cloud features more flexible and better-adapted convolution processing, the adaptive network is innovatively applied to the final convolution layer. As shown in fig. 4, n different convolution kernels are designed in advance and placed in a convolution pool; the grouped spatial features are first fed to a scoring network, where one convolution layer maps the features to n dimensions, producing a weight for each of the n kernels. Since the weights are obtained directly from the features, they focus solely on how best to process the current group's spatial information. The weight of each kernel obtained from the scoring network is multiplied by that kernel, and the results are finally summed into one complete kernel used for the convolution. Only part of each kernel is therefore used, and the weights can be adjusted to suit groups of features with different distributions, yielding a better-fitting kernel; one convolution layer thus achieves the effect of multi-layer convolution and processes the feature information better. The procedure adds no extra convolution computation, makes the network more flexible and adaptive to the data, and improves the effectiveness of the feature information.
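The attention step just described can be sketched in PyTorch as follows. The shapes assume the 128-dimensional group features; the class name, the batch-of-one layout and the small epsilon guarding the second normalisation are illustrative assumptions.

```python
# Hedged sketch of the offset-style self-attention (first and second units).
import torch
import torch.nn as nn


class OffsetSelfAttention(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        # K and Q reduce the channel dimension to one quarter; V keeps it
        self.to_k = nn.Conv1d(dim, dim // 4, 1)
        self.to_q = nn.Conv1d(dim, dim // 4, 1)
        self.to_v = nn.Conv1d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (1, dim, m) grouped spatial features from the sampling module
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        scores = torch.bmm(q.transpose(1, 2), k)        # (1, m, m) group scores
        attn = torch.softmax(scores, dim=-1)            # rows sum to 1
        # second normalisation: divide the map by its column sums
        attn = attn / (attn.sum(dim=1, keepdim=True) + 1e-9)
        fused = torch.bmm(v, attn)                      # weighted sum with V
        # offset: subtract the un-fused input, keeping relative information
        return fused - x
```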
Further, the adaptive self-attention transformation module comprises:
the first calculation unit, used for taking the point cloud feature information extracted by the point cloud sampling module and computing a K vector through one convolution layer that reduces the vector dimension to one quarter, then computing a Q vector from the feature information through an identical convolution layer, and computing a V vector through a convolution layer that leaves the dimension unchanged;
the second calculation unit, used for multiplying the Q vector by the K vector to obtain a score vector for each point group, then normalizing the score vectors with a softmax layer into an attention map whose entries sum to 1; the attention map is divided by its column sums for a second normalization and used to weight and sum the V vectors, giving an attention map of fused features in which each group's features have absorbed the information of the other groups' spatial features; finally, the initial un-fused features are subtracted from the fused features to obtain relative spatial feature information;
the third calculation unit, shown in fig. 4, which improves the last convolution layer of the self-attention mechanism by creating a feature-adaptive convolution: n different convolution kernels are pre-designed and placed in a convolution pool; the relative spatial feature information obtained by the previous calculation unit is fed into an independent scoring network, where one convolution layer maps the feature map to n dimensions, generating a weight for each of the n kernels, and a softmax layer normalizes the weights so that they sum to 1. Each weight determines what share of its kernel's parameters is used, i.e. each kernel's proportion in the finally summed kernel. Since the weights are obtained directly from the previous unit's relative feature map, they focus solely on how best to process the current group's spatial information. The weight of each kernel obtained from the scoring network is multiplied by that kernel, and the results are summed into one complete kernel, with which a final convolution over the relative spatial feature information yields feature-fused spatial information. Finally, the spatial features originally input from the point cloud sampling module are added to the feature-fused information, giving the complete spatial feature information used to generate the point cloud. Because only part of each kernel is used, the weights can be adjusted to suit groups of features with different distributions, so a single convolution layer achieves the effect of multi-layer convolution without extra convolution computation, making the network more flexible and adaptive to the data and the feature information more effective.
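In code, this unit behaves like a dynamic convolution over a kernel pool. Below is a minimal sketch, assuming n point-wise kernels and a scoring convolution whose per-point outputs are averaged into one weight per kernel; the averaging step and all names are our assumptions, since the patent only states that one convolution layer produces the n weights.

```python
# Hedged sketch of the feature-adaptive convolution (kernel pool + scoring).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveConv(nn.Module):
    def __init__(self, dim: int = 128, n_kernels: int = 4):
        super().__init__()
        # convolution pool: n point-wise kernels of shape (dim, dim, 1)
        self.kernels = nn.Parameter(torch.randn(n_kernels, dim, dim, 1) * 0.02)
        # independent scoring network: one convolution to n channels
        self.score = nn.Conv1d(dim, n_kernels, 1)

    def forward(self, rel: torch.Tensor) -> torch.Tensor:
        # rel: (1, dim, m) relative features from the attention step
        w = self.score(rel).mean(dim=-1)                 # (1, n) kernel scores
        w = torch.softmax(w, dim=-1)                     # weights sum to 1
        # mix the pool into one complete kernel (batch of one assumed)
        kernel = (w.view(-1, 1, 1, 1) * self.kernels).sum(dim=0)
        return F.conv1d(rel, kernel)                     # final convolution
```

The full transformation module would then compute x + AdaptiveConv()(OffsetSelfAttention()(x)), matching the residual addition of the sampling features described above.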
The completion module completes the point cloud from the output of the adaptive self-attention transformation module: the point cloud spatial features of the two-layer structure are extracted and fused, and the fused features of each layer generate three segments of completed point cloud through an MLP structure. As shown in fig. 5, the sampled features of the first and second point cloud sampling layers serve as the inputs of the two adaptive self-attention transformation modules, with final output dimensions of N/4 × 512 and N/2 × 256 respectively. The output derived from the first sampling layer's features through the two layers of adaptive self-attention transformation modules is used for point cloud generation: a convolution reduces the 256 dimensions to 3-dimensional xyz point coordinates, completing the whole object. The second sampling layer's features yield two outputs through the two layers of adaptive self-attention transformation modules, both used for point cloud generation: a convolution reduces the 512 dimensions to 3-dimensional xyz point coordinates, completing the local details of the object. The three MLP branches generate N/4 × 3, N/4 × 3 and N/2 × 3 points respectively, and the N × 3 point cloud is finally fused and output.
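A hedged sketch of the three-branch generation head follows, under the reading that the N/2 × 256 branch produces the whole-object segment and the N/4 × 512 branch produces the two local-detail segments; this assignment, the module name and the single-convolution heads are illustrative assumptions.

```python
# Hedged sketch of the three-branch completion head (N/2 + N/4 + N/4 = N).
import torch
import torch.nn as nn


class CompletionHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.global_head = nn.Conv1d(256, 3, 1)    # 256-D -> xyz, whole object
        self.detail_head_a = nn.Conv1d(512, 3, 1)  # 512-D -> xyz, local detail
        self.detail_head_b = nn.Conv1d(512, 3, 1)  # 512-D -> xyz, local detail

    def forward(self, f_global: torch.Tensor, f_det_a: torch.Tensor,
                f_det_b: torch.Tensor) -> torch.Tensor:
        # f_global: (1, 256, N/2); f_det_a, f_det_b: (1, 512, N/4)
        p1 = self.global_head(f_global)
        p2 = self.detail_head_a(f_det_a)
        p3 = self.detail_head_b(f_det_b)
        # concatenate the three segments into the final (1, N, 3) cloud
        return torch.cat([p1, p2, p3], dim=-1).transpose(1, 2)
```

The three segments of N/2, N/4 and N/4 points concatenate to exactly N points, matching the fused N × 3 output described above.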
As shown in fig. 6, the invention also provides a point cloud completion method based on feature adaptivity and a self-attention transformation network, whose object is to complete a full point cloud from a partial one. The point cloud refers to a set of target surface characteristics representing the visible surface of an object; the target surface characteristics are object surface features obtained by depth-camera scanning during robotic visual grasping, or vehicle surface features obtained while scanning surrounding vehicles with a depth camera in unmanned driving. The completion method comprises the following steps:
Step S1: perform point cloud sampling twice to obtain two layers of fused point cloud spatial features;
Step S2: perform adaptive feature fusion on the point cloud spatial features;
Step S3: complete the point cloud according to the result of the adaptive feature fusion.
Building on existing point cloud completion models, the invention applies a self-attention transformation network to point cloud completion, which greatly reduces the model's parameters and computation and increases processing speed; it adds an adaptive network to improve the transformation network's subsequent feature processing, increasing the flexibility, adaptability and effectiveness with which data are handled; and it maintains high accuracy while reducing the model's computation.
Furthermore, during robotic visual grasping, the incomplete object surface obtained by depth-camera scanning can be completed quickly by the invention, yielding the complete object surface needed to finish the grasp; and in unmanned driving, the incomplete vehicle surface features obtained while scanning surrounding vehicles with the depth camera can likewise be completed quickly.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any equivalent substitution or modification of the technical solutions and inventive concept of the present invention that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present invention.

Claims (7)

1. A point cloud completion device based on an adaptive self-attention transformation network, characterized by comprising:
the point cloud sampling module, used for performing point cloud sampling twice to obtain two layers of fused point cloud spatial features;
the adaptive self-attention transformation module, used for performing adaptive feature fusion on the point cloud spatial features;
the completion module, used for completing the point cloud according to the output of the adaptive self-attention transformation module;
wherein the point cloud refers to a set of target surface characteristics that represents the visible surface of an object; the target surface characteristics are object surface features obtained by depth-camera scanning during robotic visual grasping, or vehicle surface features obtained while scanning surrounding vehicles with a depth camera in unmanned driving;
the adaptive self-attention transformation module comprises:
the first calculation unit, used for taking the point cloud spatial features extracted by the point cloud sampling module and computing a K vector from the feature information through one convolution layer that reduces the vector dimension to one quarter, computing a Q vector through an identical convolution layer, and computing a V vector through a convolution layer that leaves the dimension unchanged;
the second calculation unit, used for multiplying the Q vector by the K vector to obtain a score vector for each point group, then normalizing the score vectors with a softmax layer into an attention map whose entries sum to 1; the attention map is divided by its column sums for a second normalization and used to weight and sum the V vectors, giving an attention map of fused features; finally, the initial un-fused features are subtracted from the fused features to obtain relative spatial feature information;
the third calculation unit, used for pre-designing n different convolution kernels placed in a convolution pool: the relative spatial feature information obtained by the previous calculation unit is first fed into an independent scoring network, where one convolution layer maps the feature map to n dimensions, generating a weight for each of the n kernels, and a softmax layer normalizes the weights so that they sum to 1; the weight of each kernel obtained from the scoring network is multiplied by that kernel and the results are summed into one complete kernel, with which a final convolution over the relative spatial feature information yields feature-fused spatial information; the point cloud spatial features originally input from the point cloud sampling module are then added to the feature-fused spatial information, giving the complete spatial feature information used for point cloud generation.
2. The point cloud completion device based on an adaptive self-attention transformation network of claim 1, wherein the point cloud sampling module comprises:
the acquisition unit, used for acquiring the relative point cloud coordinates of the object about a central origin;
the mapping unit, used for mapping the N point cloud coordinates to a higher dimension by convolution: the 3-dimensional xyz coordinates of the object are first lifted by a convolution to 64-dimensional vectors, so that each point is represented by one 64-dimensional vector, and these are then lifted by a further convolution to 128-dimensional vectors, yielding global point cloud spatial position information;
the clustering unit, used for obtaining through farthest point sampling the distance from each point to all other points, sorting these distances, selecting with the k-nearest-neighbour algorithm the features of the k points closest to each point, and packing them into a k × 128 matrix to form a group, 1024 groups being obtained by the clustering unit the first time and 512 groups the second time;
and the extraction unit, used for turning each grouped k × 128 matrix into a 128-dimensional vector after each grouping by fusing features with two identical convolutions to obtain the key information, extracting with a max pooling layer the maximum point cloud spatial feature of each k × 128 matrix over its k rows, and finally forming a 1 × 128 vector.
3. The point cloud completion device based on an adaptive self-attention transformation network of claim 1, wherein the completion module uses the fused point cloud features of each layer to generate three segments of completed point cloud through an MLP structure; the sampled features of the first and second point cloud sampling layers serve as the inputs of two layers of adaptive self-attention transformation modules, with final output dimensions of N/4 × 512 and N/2 × 256 respectively; the output derived from the first sampling layer's features through the two layers of adaptive self-attention transformation modules is used for point cloud generation, a convolution reducing the 256 dimensions to 3-dimensional xyz point coordinates to complete the whole object; the second sampling layer's features yield two outputs through the two layers of adaptive self-attention transformation modules, both used for point cloud generation, a convolution reducing the 512-dimensional spatial coordinates to 3-dimensional xyz point coordinates to complete the local details of the object; and the three MLP branches generate N/4 × 3, N/4 × 3 and N/2 × 3 points respectively, the N × 3 point cloud being finally fused and output.
4. A point cloud completion method based on a self-attention transformation network, characterized by comprising the following steps:
Step S1: perform point cloud sampling twice to obtain two layers of fused point cloud spatial features;
Step S2: perform adaptive feature fusion on the point cloud spatial features, as follows: the point cloud spatial features are embedded and three convolution layers compute the K, Q and V vectors of each group's spatial features; the Q and K vectors are multiplied to obtain a score vector for each point group, the scores are multiplied with the V vectors to obtain fused features, and the final point cloud convolution operates on the feature difference, with the adaptive network applied to the final convolution layer, specifically: n different convolution kernels are designed in advance and placed in a convolution pool; the grouped spatial features are first fed to a scoring network, where one convolution layer maps the features to n dimensions, producing a weight for each of the n kernels; the weight of each kernel obtained from the scoring network is multiplied by that kernel, and the results are finally summed into one complete kernel used for the convolution;
Step S3: complete the point cloud according to the result of the adaptive feature fusion;
wherein the point cloud refers to a set of target surface characteristics that represents the visible surface of an object; the target surface characteristics are object surface features obtained by depth-camera scanning during robotic visual grasping, or vehicle surface features obtained while scanning surrounding vehicles with a depth camera in unmanned driving.
5. The point cloud completion method based on a self-attention transformation network of claim 4, wherein in step S1 the N point cloud coordinates are mapped to a higher dimension by convolution to obtain global spatial position information; the distance from each point to all other points is then obtained by farthest point sampling, and the k points nearest each point are clustered into a group by the k-nearest-neighbour algorithm; the cloud is divided into 1024 groups, then sampled and grouped again into 512 groups; and finally a max pooling layer extracts the key point cloud spatial features.
6. The point cloud completion method based on a self-attention transformation network of claim 5, wherein in step S3 three segments of completed point cloud are generated through an MLP structure from the fused point cloud features of each layer; the sampled features of the first and second point cloud sampling layers serve as the inputs of two layers of adaptive self-attention transformation modules, with final output dimensions of N/4 × 512 and N/2 × 256 respectively; the output derived from the first sampling layer's features is used for point cloud generation, a convolution reducing the 256 dimensions to 3-dimensional xyz point coordinates to complete the whole object; and the second sampling layer's features yield two outputs through the two layers of adaptive self-attention transformation modules, both used for point cloud generation, a convolution reducing the 512 dimensions to 3-dimensional xyz point coordinates to complete the local details of the object.
7. The method according to claim 6, wherein in step S3 the three MLP branches generate N/4 × 3, N/4 × 3 and N/2 × 3 points respectively, and the N × 3 point cloud is finally fused and output.
Application CN202110890669.2A (priority date 2021-08-04, filed 2021-08-04): Point cloud completion device and method based on adaptive self-attention transformation network. Granted as CN113486988B (Active).

Priority Applications (1)

CN202110890669.2A (priority and filing date 2021-08-04): Point cloud completion device and method based on adaptive self-attention transformation network

Applications Claiming Priority (1)

CN202110890669.2A (priority and filing date 2021-08-04): Point cloud completion device and method based on adaptive self-attention transformation network

Publications (2)

CN113486988A, published 2021-10-08
CN113486988B, published 2022-02-15 (grant)

Family

ID: 77945621

Family Applications (1)

CN202110890669.2A (Active, filed 2021-08-04): Point cloud completion device and method based on adaptive self-attention transformation network

Country Status (1)

CN: CN113486988B (en)


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083705B (en) * 2019-05-06 2021-11-02 电子科技大学 Multi-hop attention depth model, method, storage medium and terminal for target emotion classification
CN111489358B (en) * 2020-03-18 2022-06-14 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN113160068B (en) * 2021-02-23 2022-08-05 清华大学 Point cloud completion method and system based on image
CN113052955B (en) * 2021-03-19 2023-06-30 西安电子科技大学 Point cloud completion method, system and application
CN112927359B (en) * 2021-03-22 2024-01-30 南京大学 Three-dimensional point cloud completion method based on deep learning and voxels
CN113052835B (en) * 2021-04-20 2024-02-27 江苏迅捷装具科技有限公司 Medicine box detection method and system based on three-dimensional point cloud and image data fusion

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110200710A (en) * 2019-04-17 2019-09-06 广东工业大学 A kind of oral restoration method based on three-dimensional imaging and Real-time modeling set
CN111950467A (en) * 2020-08-14 2020-11-17 清华大学 Fusion network lane line detection method based on attention mechanism and terminal equipment
CN112614071A (en) * 2020-12-29 2021-04-06 清华大学 Self-attention-based diverse point cloud completion method and device
CN112785526A (en) * 2021-01-28 2021-05-11 南京大学 Three-dimensional point cloud repairing method for graphic processing
CN112966696A (en) * 2021-02-05 2021-06-15 中国科学院深圳先进技术研究院 Method, device and equipment for processing three-dimensional point cloud and storage medium
CN113205466A (en) * 2021-05-10 2021-08-03 南京航空航天大学 Incomplete point cloud completion method based on hidden space topological structure constraint

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CVPR 2021 Oral | VRCNet: Variational Relational Point Cloud Completion Network; Pan Liang; SenseTime Academic (商汤学术); 2021-06-08; pp. 1-10 *
Variational Relational Point Completion Network; Liang Pan et al.; arXiv:2104.10154v1; 2021-04-20; pp. 1-15 *
A deep learning inpainting algorithm for locally missing DSM regions; Guan Kai et al.; Journal of Geomatics Science and Technology (测绘科学技术学报); 2020-06-15 (No. 03); full text *


Similar Documents

Publication Publication Date Title
JP6745328B2 (en) Method and apparatus for recovering point cloud data
CN109685842B (en) Sparse depth densification method based on multi-scale network
CN112927357B (en) 3D object reconstruction method based on dynamic graph network
CN110674829B (en) Three-dimensional target detection method based on graph convolution attention network
CN111079685B (en) 3D target detection method
CN113223091B (en) Three-dimensional target detection method, three-dimensional target capture device and electronic equipment
WO2022017131A1 (en) Point cloud data processing method and device, and intelligent driving control method and device
CN113052109A (en) 3D target detection system and 3D target detection method thereof
Badías et al. An augmented reality platform for interactive aerodynamic design and analysis
CN111145253A (en) Efficient object 6D attitude estimation algorithm
JP2019159940A (en) Point group feature extraction device, point group feature extraction method, and program
CN107972027A (en) The localization method and device of robot, robot
CN114067075A (en) Point cloud completion method and device based on generation of countermeasure network
CN114757904A (en) Surface defect detection method based on AI deep learning algorithm
CN115641322A (en) Robot grabbing method and system based on 6D pose estimation
CN115147545A (en) Scene three-dimensional intelligent reconstruction system and method based on BIM and deep learning
CN113724387A (en) Laser and camera fused map construction method
CN114155414A (en) Novel unmanned-driving-oriented feature layer data fusion method and system and target detection method
CN114494594A (en) Astronaut operating equipment state identification method based on deep learning
CN113486988B (en) Point cloud completion device and method based on adaptive self-attention transformation network
TWI731604B (en) Three-dimensional point cloud data processing method
Yao et al. Research of camera calibration based on genetic algorithm BP neural network
CN116152579A (en) Point cloud 3D target detection method and model based on discrete Transformer
WO2022017129A1 (en) Target object detection method and apparatus, electronic device, and storage medium
CN115457539A (en) 3D target detection algorithm based on multiple sensors

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant