CN114358246A - Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene
- Publication number: CN114358246A (application CN202111618088.XA)
- Authority: CN (China)
- Prior art keywords: module; point cloud; attention; dimensional; neural network
- Legal status: Pending (the legal status is an assumption and not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a graph convolution neural network module with an attention mechanism for three-dimensional point cloud scenes, comprising: an attention graph encoding module (AGEM module) and an attention pooling module (AP module). The module overcomes the poor local-feature extraction capability of existing models and the poor feature aggregation capability of the DGCNN model.
Description
Technical Field
The invention relates to the field of point cloud data, and in particular to a graph convolution neural network module with an attention mechanism for three-dimensional point cloud scenes.
Background
A point cloud is a collection of discrete points in three-dimensional space; compared with an ordinary remote-sensing image, point cloud data carries more spatial information. It is therefore valuable for tasks such as surface monitoring, and research on three-dimensional point cloud data is widely applied across society, chiefly in road segmentation, 3D city modeling, autonomous driving, face recognition, forest monitoring, and the like. This research mainly focuses on mining the three-dimensional information and deep features carried by point cloud data. The field has a long scientific history, and its progression from hand-crafted geometry to deep learning has broadly advanced both basic science and its applications. Because three-dimensional point cloud data contains more spatial information than traditional images, it presents both greater challenges and greater opportunities. Meanwhile, the success of the convolutional neural network (CNN) in image classification, object detection, semantic segmentation, and related tasks spurred the development of deep-learning methods. Inspired by these results, point cloud research has likewise shifted from traditional machine learning to flexible neural network architectures, analyzing point cloud data from a deep-learning perspective for practical industrial and commercial applications.
Hang Su et al process three-dimensional point cloud data by projecting the point cloud into two dimensions. And the existing two-dimensional image processing method is utilized to carry out tasks such as classification and segmentation on the data. Charles et al, derived from mathematical theory, first propose to process point cloud data using symmetric functions to satisfy the invariance of point cloud data. Daniel Maturana et al apply a deep learning method such as CNN by voxelizing the point cloud. Generally, early three-dimensional point cloud data research mainly utilizes theoretical knowledge of various disciplines such as geometry to analyze shallow information. The most advanced methods always imply deep learning represented by Convolutional Neural Networks (CNN). In addition, the method achieves huge achievement and outstanding expression in the aspects of semantic segmentation, classification, target detection and the like.
Undeniably, CNN has become the de facto standard in deep learning. However, its parameter count grows rapidly with the number of convolutional layers, its model size grows with the computing power it demands, and projection and voxelization usually bring huge memory occupation and computational cost. Moreover, because of the sheer volume of multiply-add operations, computational cost remains a bottleneck for industrial applications and cannot meet the industry's real-time requirements.
The second prior art:
SEGCloud divides the overall point cloud into several small point clouds and applies trilinear interpolation and conditional random fields. Charles et al. improved the PointNet network with a method that gradually enlarges the receptive field, which also mitigates its relatively high computational cost. Recently, benefiting from the successful extension of deep learning to graphs and other non-Euclidean structures, graph neural networks have reached state-of-the-art performance in computer vision and attracted the attention of many researchers. Inspired by this, AdaptConv uses a dynamic convolution kernel to make the convolution operation more flexible, and 3D-GCN designs a learnable convolution kernel to acquire local features, showing good learning ability. DGCNN proposes a convolution method named EdgeConv, which dynamically computes the graph structure of each network layer (obtained by the K-NN method) and aggregates the features of the central node in the local graph with the corresponding edge features; this method extracts local features well. With the success of attention in natural language processing, more and more researchers apply it in computer vision: GAPNet, GACNet, and LAE-Conv all design attention modules to obtain point cloud features. However, these graph-based and attention-based point cloud processing methods all use max pooling to aggregate the features of local feature graphs.
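The EdgeConv aggregation described above can be sketched as follows. This is an illustrative NumPy sketch, not DGCNN's actual implementation; the function names are chosen for illustration only:

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point (self excluded)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (N, N)
    return np.argsort(d2, axis=1)[:, 1:k + 1]                      # (N, k)

def edgeconv_layer(feats, k):
    """EdgeConv-style layer: max over edge features concat(x_i, x_j - x_i)."""
    idx = knn_indices(feats, k)                                    # (N, k)
    center = np.repeat(feats[:, None, :], k, axis=1)               # (N, k, C)
    neighb = feats[idx]                                            # (N, k, C)
    edge = np.concatenate([center, neighb - center], axis=-1)      # (N, k, 2C)
    return edge.max(axis=1)                                        # (N, 2C)
```

In DGCNN the edge feature additionally passes through a shared MLP before the max; the sketch keeps only the graph construction and aggregation that the text describes.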
Defects of the second prior art:
The above methods all use a simple max pooling strategy to aggregate local feature information. This causes several disadvantages: important information is filtered out, and valid features cannot be clearly distinguished from invalid ones. Max pooling directly selects the largest value among all the features, so the remaining data contributes nothing to feature extraction during the calculation; the max pooling strategy therefore usually discards much useful information.
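A toy example (illustrative numbers) makes the loss concrete: max pooling keeps exactly one value per channel, and every other value is discarded.

```python
import numpy as np

# Three neighbour features with two channels each.
local_feats = np.array([[0.9, 0.1],
                        [0.8, 0.2],
                        [0.1, 0.7]])

# Max pooling keeps only the largest value per channel ...
pooled = local_feats.max(axis=0)   # [0.9, 0.7]
# ... so 4 of the 6 input values never influence the aggregated feature,
# no matter how informative they were.
```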
Aiming at the defect of poor local-feature extraction capability in existing models, the invention provides a graph convolution neural network module based on an attention mechanism. It uses the K-NN algorithm to obtain the K neighbouring points of each centre point and constructs a local feature graph structure in turn. This structure captures the local topology of the point cloud data and thus represents local features better, well remedying the defect of existing models.
Aiming at the defect of poor feature aggregation capability in the DGCNN model, the invention provides a graph convolution neural network based on an attention pooling strategy. Several neighbouring points of each centre point are obtained by the K-NN algorithm, different attention weights are calculated, and these weights are used to extract the locally most important features of the current input data.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a graph convolution neural network module with an attention mechanism for three-dimensional point cloud scenes, remedying the poor local-feature extraction capability of existing models and the poor feature aggregation capability of the DGCNN model.
The technical scheme provided by the invention is as follows:
a graph convolution neural network module of an attention mechanism of a three-dimensional point cloud scene comprises: an attention graph encoding module, namely an AGEM module, and an attention pooling module, namely an AP module.
Preferably, the AGEM module comprises the following steps:
s1: firstly, acquiring K nearest points of a central point by using a K-NN algorithm through a given K value, and forming a local point cloud set;
s2: then, the input point cloud features are expanded by a Repeat operation to the same size as the k-neighbour point set;
s3: coding the k adjacent point set and the original point set to obtain high-dimensional characteristics;
s4: and splicing the acquired high-dimensional features, and transmitting the high-dimensional features serving as input features into the AP module.
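Steps S1 to S4 can be sketched as follows. This is an illustrative NumPy sketch; the encoding in S3 is assumed here to concatenate centre coordinates, neighbour coordinates, relative coordinates, and relative distance — the text only exemplifies the high-dimensional features, it does not fix their exact form:

```python
import numpy as np

def agem_sketch(xyz, k):
    """Sketch of the AGEM steps for N points with coordinates xyz: (N, 3)."""
    # S1: K-NN - gather the k nearest points of each centre point
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]              # (N, k)
    neighb = xyz[idx]                                     # (N, k, 3)
    # S2: Repeat - lift the centre feature to the size of the k-neighbour set
    center = np.repeat(xyz[:, None, :], k, axis=1)        # (N, k, 3)
    # S3: encode centre/neighbour pairs into higher-dimensional features
    rel = neighb - center                                 # relative coordinates
    dist = np.linalg.norm(rel, axis=-1, keepdims=True)    # relative distance
    # S4: concatenate and hand the result to the AP module as input features
    return np.concatenate([center, neighb, rel, dist], axis=-1)  # (N, k, 10)
```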
Preferably, the AP module comprises: attention weight calculation, an attention weight mask, and an MLP module.
The graph convolution neural network module of the attention mechanism of the three-dimensional point cloud scene has the following beneficial effects:
1. The invention realizes an efficient point cloud classification and segmentation method that surpasses existing methods and therefore has commercial value.
2. For the semantic segmentation task, the model size is only 2.03 M, which suits industrial requirements well.
3. The method applies well to point cloud tasks such as land-change analysis, city modeling, and road segmentation.
4. It can monitor the influence of forest transition and terrain on forest dynamics, and can also classify forest tree species.
Drawings
FIG. 1 is a schematic diagram of the attention graph convolution module of the invention.
FIG. 2 illustrates an AP attention pooling module of the present invention.
FIG. 3 is a data flow diagram of the present invention.
FIG. 4 is a visualization of the results of the method of the invention.
Detailed Description
The following description of the embodiments is provided so that those skilled in the art can understand the invention. It should be understood, however, that the invention is not limited to the scope of the embodiments: to those skilled in the art, various changes are possible without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
The invention is implemented with the existing deep-learning framework PyTorch and corresponding programming libraries, mainly including NumPy, Pandas, tensor utilities, and the like. PyTorch supplies the deep-learning components, including the linear module, the convolution module, the parameter-regularization module, and so on.
The specific scheme is realized according to the following principle:
In the proposed method we convert segmentation into a classification task, performing point-wise classification instead of patch segmentation, as shown in fig. 1. The main component is the attention graph convolution module, which comprises an AGEM (attention graph encoding) module and an AP (attention pooling) module. The AGEM module meets the attention-calculation needs of the AP module by transforming the features of the data. Specifically, the K-NN algorithm first obtains, for a given K value, the K nearest points of each centre point to form a local point cloud set. The input point cloud features are then lifted by the Repeat operation to the same size as the k-neighbour set. Next, the k-neighbour set and the original point set are encoded to obtain high-dimensional features such as relative distance and relative coordinates. Finally, the acquired high-dimensional features are concatenated and passed into the AP module as input features.
The AP module is composed of an attention weight calculation, an attention weight mask, and an MLP module, as shown in fig. 2.
The data flow diagram 3 of this patent is as follows;
Finally, extensive experiments were performed on three widely adopted public datasets: ModelNet40 for object classification, ShapeNet Part for part segmentation, and S3DIS for semantic segmentation. AGNet is comprehensively superior to state-of-the-art methods: on ModelNet40, its accuracy improves on PointNet by 4.2%, on ECC by 6.0%, on VoxNet by 7.5%, and on 3DShapeNet by 8.7%.
The results of the patented method are visualized as shown in fig. 4:
the architecture of the network is implemented as an important protection point, as shown in fig. 1, the network is composed of a single-layer MLP transform, an AGEM (attention-seeking convolutional coding module) and an AP (attention-pooling module), and the specific protection technology is as follows:
single layer MLP transform
Because the dimensions of the input point cloud data differ and the channels carry great information redundancy, a single MLP layer adjusts part of the dimension information while matching the input of the downstream model.
Local feature aggregation module
After the transformed data is obtained, the K-NN algorithm finds the K nearest points for the point cloud features; the semantic information of the retrieved neighbouring points yields high-dimensional features such as relative distance, and local feature graphs are constructed in turn to complete the aggregation of local feature information.
Attention pooling module
The aggregated features are screened and pooled along the channel dimension; an attention mechanism calculates the weights of different features and extracts the important information among them.
Output of
The most common fully-connected linear layer outputs the final classification result. This part is public technology and is not among the key protected techniques of the method.
Claims (3)
1. A graph convolution neural network module of an attention mechanism of a three-dimensional point cloud scene, characterized by comprising: an attention graph encoding module, namely an AGEM module, and an attention pooling module, namely an AP module.
2. The graph convolution neural network module of the attention mechanism of the three-dimensional point cloud scene of claim 1, wherein the AGEM module comprises the following steps:
s1: firstly, acquiring K nearest points of a central point by using a K-NN algorithm through a given K value, and forming a local point cloud set;
s2: then, the input point cloud features are expanded by a Repeat operation to the same size as the k-neighbour point set;
s3: coding the k adjacent point set and the original point set to obtain high-dimensional characteristics;
s4: and splicing the acquired high-dimensional features, and transmitting the high-dimensional features serving as input features into the AP module.
3. The graph convolution neural network module of the attention mechanism of the three-dimensional point cloud scene of claim 1, wherein the AP module comprises: attention weight calculation, an attention weight mask, and an MLP module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111618088.XA CN114358246A (en) | 2021-12-27 | 2021-12-27 | Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114358246A (en) | 2022-04-15
Family
ID=81103709
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111618088.XA Pending CN114358246A (en) | 2021-12-27 | 2021-12-27 | Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114358246A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110969589A (en) * | 2019-12-03 | 2020-04-07 | 重庆大学 | Dynamic scene fuzzy image blind restoration method based on multi-stream attention countermeasure network |
CN112257597A (en) * | 2020-10-22 | 2021-01-22 | 中国人民解放军战略支援部队信息工程大学 | Semantic segmentation method of point cloud data |
CN113554654A (en) * | 2021-06-07 | 2021-10-26 | 之江实验室 | Point cloud feature extraction model based on graph neural network and classification and segmentation method |
Non-Patent Citations (3)
- WEIPENG JING et al.: "AGNet: An Attention-Based Graph Network for Point Cloud Classification and Segmentation", Remote Sensing, 21 February 2022
- XIN WEN et al.: "CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention", ACM, 12 October 2020
- ZHUYANG XIE et al.: "Point Clouds Learning with Attention-based Graph Convolution Networks", arXiv, 31 May 2019
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116129118A (en) * | 2023-01-17 | 2023-05-16 | 华北水利水电大学 | Urban scene laser LiDAR point cloud semantic segmentation method based on graph convolution |
CN116129118B (en) * | 2023-01-17 | 2023-10-20 | 华北水利水电大学 | Urban scene laser LiDAR point cloud semantic segmentation method based on graph convolution |
CN116403058A (en) * | 2023-06-09 | 2023-07-07 | 昆明理工大学 | Remote sensing cross-scene multispectral laser radar point cloud classification method |
CN116403058B (en) * | 2023-06-09 | 2023-09-12 | 昆明理工大学 | Remote sensing cross-scene multispectral laser radar point cloud classification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||