CN114358246A - Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene - Google Patents

Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene

Info

Publication number
CN114358246A
CN114358246A CN202111618088.XA
Authority
CN
China
Prior art keywords
module
point cloud
attention
dimensional
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111618088.XA
Other languages
Chinese (zh)
Inventor
景维鹏 (Jing Weipeng)
张文钧 (Zhang Wenjun)
李林辉 (Li Linhui)
陈广胜 (Chen Guangsheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Forestry University
Original Assignee
Northeast Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Forestry University filed Critical Northeast Forestry University
Priority to CN202111618088.XA priority Critical patent/CN114358246A/en
Publication of CN114358246A publication Critical patent/CN114358246A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a graph convolutional neural network module with an attention mechanism for three-dimensional point cloud scenes, comprising an attention graph encoding module (AGEM module) and an attention pooling module (AP module). The module overcomes the poor local feature extraction capability of existing models and the poor feature aggregation capability of the DGCNN model.

Description

Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene
Technical Field
The invention relates to the field of point cloud data, and in particular to a graph convolutional neural network module with an attention mechanism for three-dimensional point cloud scenes.
Background
A point cloud is a collection of discrete points in three-dimensional space, and compared with ordinary remote sensing images, point cloud data carries more spatial information. It is therefore of great value for tasks such as surface monitoring, and research on three-dimensional point cloud data is widely applied across society, mainly in road segmentation, 3D city modeling, autonomous driving, face recognition, forest monitoring, and the like. Such research focuses on mining the three-dimensional information and deep features carried by point cloud data. The field has a long history, progressing from hand-crafted geometric methods to deep learning and contributing broadly to science and social progress. Because three-dimensional point cloud data contains more spatial information than traditional images, it also brings more challenges and opportunities to the point cloud field. Meanwhile, the success of convolutional neural networks (CNNs) in image classification, object detection, semantic segmentation, and related tasks has driven the development of deep learning methods. Inspired by these results, point cloud research has shifted from traditional machine learning to flexible neural network architectures, analyzing point cloud data from a deep learning perspective for practical industrial and commercial applications.
Prior art 1
Hang Su et al. process three-dimensional point cloud data by projecting the point cloud into two dimensions and then applying existing two-dimensional image processing methods for classification, segmentation, and similar tasks. Charles et al., starting from mathematical theory, first proposed processing point cloud data with symmetric functions to satisfy the permutation invariance of point clouds. Daniel Maturana et al. voxelize the point cloud so that deep learning methods such as CNNs can be applied. In general, early three-dimensional point cloud research mainly analyzed shallow information using theoretical knowledge from geometry and other disciplines, whereas the most advanced methods almost always involve deep learning represented by convolutional neural networks (CNNs), which have achieved remarkable results in semantic segmentation, classification, object detection, and related tasks.
Undeniably, CNNs have become the de facto standard in deep learning. However, their parameter counts grow rapidly as convolutional layers are stacked, model size grows with the available computing power, and projection and voxelization usually bring huge memory occupation and computational cost. In addition, because of the sheer volume of multiply-add operations, computational cost is a bottleneck for industrial applications and cannot meet the real-time requirements of industry.
Prior art 2:
SEGCloud divides the overall point cloud into several small point clouds and processes them with trilinear interpolation and conditional random fields. Charles et al. proposed gradually enlarging the receptive field by improving the PointNet network, which also alleviates its relatively large computational cost. Recently, benefiting from the successful extension of deep learning to graphs and other non-Euclidean structures, graph neural networks have achieved state-of-the-art performance in computer vision and attracted the attention of many researchers. Inspired by this, AdaptConv uses a dynamic convolution kernel to make the convolution operation more flexible. 3D-GCN likewise designs a learnable convolution kernel to capture local features and shows good learning ability. DGCNN proposes a convolution method named EdgeConv, which dynamically computes a graph structure for each network layer (obtained with the K-NN method) and aggregates the features of the central node in the local graph with the corresponding edge features, thereby extracting local features well. With the success of attention in natural language processing, more and more work applies it to computer vision: GAPNet, GACNet, and LAE-Conv all design attention modules to obtain point cloud features. However, these graph-based and attention-based point cloud processing methods all use max pooling to aggregate the features of the local feature graph.
Deficiencies of prior art 2:
The above methods all use a simple max pooling strategy to aggregate local feature information. This causes several disadvantages: important information is filtered out, and valid and invalid features cannot be clearly distinguished. Max pooling directly selects the largest value among all the features, so the remaining values contribute nothing to feature extraction, and much useful information is usually discarded by this strategy.
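As a minimal illustration (not taken from the patent, and using assumed tensor sizes), the following PyTorch snippet shows how max pooling over a K-neighborhood keeps only one value per feature channel, so most neighbors contribute nothing to the aggregated feature:

```python
# Minimal illustration (not from the patent): max pooling over K neighbor features.
import torch

neighbor_feats = torch.randn(8, 64)          # K = 8 neighbors, 64-dimensional features
pooled, winners = neighbor_feats.max(dim=0)  # (64,) one surviving value per channel
# Each channel is taken from a single neighbor; the contribution of the others is discarded.
print(pooled.shape, winners.unique().numel(), "of 8 neighbors actually used")
```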
To address the poor local feature extraction capability of existing models, the invention provides a graph convolutional neural network module based on an attention mechanism, which uses the K-NN algorithm to obtain the K nearest neighbors of each central point and constructs a local feature graph structure in turn. The local feature graph structure captures the local topology of the point cloud data and thus represents local features better, overcoming this deficiency of existing models.
To address the poor feature aggregation capability of the DGCNN model, the invention provides a graph convolutional neural network based on an attention pooling strategy. It obtains several neighbors of each central point with the K-NN algorithm, computes different attention weights, and uses these weights to extract the locally most important features of the current input data.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a graph convolutional neural network module with an attention mechanism for three-dimensional point cloud scenes, overcoming the poor local feature extraction capability of existing models and the poor feature aggregation capability of the DGCNN model.
The technical scheme provided by the invention is as follows:
a graph convolutional neural network module with an attention mechanism for a three-dimensional point cloud scene comprises: an attention graph encoding module, namely the AGEM module, and an attention pooling module, namely the AP module.
Preferably, the AGEM module comprises the following steps:
S1: first, for a given K value, the K nearest points of each central point are obtained with the K-NN algorithm to form a local point cloud set;
S2: the input point cloud features are then expanded by a Repeat operation to match the size of the k-neighbor set;
S3: the k-neighbor set and the original point set are encoded to obtain high-dimensional features;
S4: the obtained high-dimensional features are concatenated and passed to the AP module as input features.
Preferably, the AP module includes: attention weight calculation, an attention weight mask, and an MLP module.
The graph convolutional neural network module with an attention mechanism for three-dimensional point cloud scenes has the following beneficial effects:
1. The invention realizes an efficient point cloud classification and segmentation method that surpasses existing methods and therefore has commercial value.
2. For the semantic segmentation task, the model size is only 2.03 M, which suits industrial requirements well.
3. The method applies well to point cloud applications such as land change analysis, city modeling, and road segmentation.
4. It can monitor the influence of forest transition and terrain on forest dynamic change, and can also classify forest tree species.
Drawings
FIG. 1 is a schematic diagram of the AGM attention convolution module.
FIG. 2 illustrates an AP attention pooling module of the present invention.
FIG. 3 is a data flow diagram of the present invention.
FIG. 4 is a visualization of the results of the method of the invention.
Detailed Description
The following description of embodiments of the present invention is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments; for those skilled in the art, various changes that do not depart from the spirit and scope of the invention as defined in the appended claims are apparent, and everything produced using the inventive concept is protected.
The invention is implemented mainly with the existing deep learning framework PyTorch and corresponding programming libraries such as NumPy, Pandas, tensor libraries, and the like. PyTorch is used mainly for the deep learning model, including linear modules, convolution modules, parameter penalty (regularization) modules, and the like.
The specific scheme is realized according to the following principle:
In the proposed method, segmentation is converted into a classification task, performing per-point (pixel-level) classification instead of patch segmentation, as shown in FIG. 1. The core is the AGM attention convolution module, which comprises an AGEM (attention graph encoding) module and an AP (attention pooling) module. The AGEM module applies feature transformations to the data that satisfy the attention-computation needs of the AP module. Specifically, for a given K value, the K-NN algorithm first obtains the K nearest points of each central point to form a local point cloud set. The input point cloud features are then expanded by a Repeat operation to match the size of the k-neighbor set. Next, the k-neighbor set and the original point set are encoded to obtain high-dimensional features such as relative distances and relative coordinates. Finally, the obtained high-dimensional features are concatenated and passed to the AP module as input features.
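A hedged PyTorch sketch of these AGEM steps is given below (PyTorch is the framework named in this disclosure); the function names, tensor shapes, and the choice of relative coordinates plus relative distance as the encoded high-dimensional features are illustrative assumptions rather than the exact patented implementation:

```python
# Illustrative sketch of AGEM-style encoding; names and shapes are assumptions.
import torch

def knn_indices(xyz, k):
    # S1: for each central point, indices of its k nearest points (local point cloud set)
    dist = torch.cdist(xyz, xyz)                         # (B, N, N) pairwise distances
    return dist.topk(k, dim=-1, largest=False).indices   # (B, N, k)

def agem_encode(xyz, feats, k=20):
    # xyz: (B, N, 3) coordinates, feats: (B, N, C) point features
    B, N, C = feats.shape
    idx = knn_indices(xyz, k)
    batch = torch.arange(B, device=xyz.device).view(B, 1, 1)
    neighbor_xyz = xyz[batch, idx]                        # (B, N, k, 3)
    neighbor_feats = feats[batch, idx]                    # (B, N, k, C)
    # S2: "Repeat" the central features so they match the k-neighbor set
    center_xyz = xyz.unsqueeze(2).expand(-1, -1, k, -1)
    center_feats = feats.unsqueeze(2).expand(-1, -1, k, -1)
    # S3: encode high-dimensional geometric features (relative coordinates, relative distance)
    rel_xyz = neighbor_xyz - center_xyz
    rel_dist = rel_xyz.norm(dim=-1, keepdim=True)
    # S4: concatenate and hand the result to the attention pooling (AP) module
    return torch.cat([center_feats, neighbor_feats, rel_xyz, rel_dist], dim=-1)
```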
The AP module consists of attention weight calculation, an attention weight mask, and an MLP module, as shown in FIG. 2.
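A hedged sketch of these three parts follows; the layer widths, the use of softmax over the k neighbors as the attention weight mask, and the class name are assumptions made for illustration, not the exact patented design:

```python
# Illustrative attention pooling (AP) sketch; layer sizes and names are assumptions.
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.score_fn = nn.Linear(in_dim, in_dim, bias=False)  # attention weight calculation
        self.mlp = nn.Sequential(                               # final MLP on pooled features
            nn.Linear(in_dim, out_dim),
            nn.BatchNorm1d(out_dim),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        # x: (B, N, k, C) local feature graph produced by the AGEM module
        weights = torch.softmax(self.score_fn(x), dim=2)  # attention weight mask over the k neighbors
        pooled = (weights * x).sum(dim=2)                 # (B, N, C) weighted aggregation
        B, N, C = pooled.shape
        return self.mlp(pooled.reshape(B * N, C)).reshape(B, N, -1)

# usage (dimensions assumed): ap = AttentionPooling(in_dim=132, out_dim=64); out = ap(agem_features)
```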
The data flow of this patent is shown in FIG. 3.
Finally, extensive experiments were performed on three widely adopted public datasets: the ModelNet40 dataset for object classification, the ShapeNet Part dataset for part segmentation, and the S3DIS dataset for semantic segmentation. AGNet comprehensively outperforms the most advanced methods; on the ModelNet40 dataset its accuracy is 4.2% higher than PointNet, 6.0% higher than ECC, 7.5% higher than VoxNet, and 8.7% higher than 3DShapeNet.
The results of the patented method are visualized in FIG. 4.
the architecture of the network is implemented as an important protection point, as shown in fig. 1, the network is composed of a single-layer MLP transform, an AGEM (attention-seeking convolutional coding module) and an AP (attention-pooling module), and the specific protection technology is as follows:
single layer MLP transform
Because the dimensionality of the cleaned input data varies and the channels carry considerable redundant information, a single MLP layer adjusts part of the dimension information while matching the input expected by the downstream model.
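For illustration only (the 64-channel width and LeakyReLU activation are assumptions, not claimed values), such a single-layer transform can be written as a shared per-point 1x1 convolution:

```python
# Assumed sketch of the single-layer MLP transform applied to every point.
import torch
import torch.nn as nn

point_lift = nn.Sequential(
    nn.Conv1d(in_channels=3, out_channels=64, kernel_size=1),  # shared per-point MLP
    nn.BatchNorm1d(64),
    nn.LeakyReLU(0.2),
)

points = torch.randn(2, 3, 1024)   # (batch, xyz channels, number of points)
features = point_lift(points)      # (2, 64, 1024) -> input to the downstream AGEM/AP modules
```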
Local feature aggregation module
After the data has been transformed to a certain degree, the K-NN algorithm is applied to the point cloud features to obtain the K nearest points; high-dimensional features such as relative distance are computed from the semantic information of the retrieved neighbors, and local feature graphs are constructed in turn to complete the aggregation of local feature information.
Attention pooling module
The aggregated features are screened and pooled along the channel dimension; an attention mechanism computes the weights of different features and extracts the important information they contain.
Output of
The most common fully connected (linear) layer outputs the final classification result; this part is public technology and is not among the key protected techniques of the method.

Claims (3)

1. A graph convolutional neural network module with an attention mechanism for a three-dimensional point cloud scene, characterized by comprising: an attention graph encoding module, namely an AGEM module, and an attention pooling module, namely an AP module.
2. The graph convolutional neural network module with an attention mechanism for a three-dimensional point cloud scene of claim 1, wherein the AGEM module comprises the following steps:
S1: first, for a given K value, the K nearest points of each central point are obtained with the K-NN algorithm to form a local point cloud set;
S2: the input point cloud features are then expanded by a Repeat operation to match the size of the k-neighbor set;
S3: the k-neighbor set and the original point set are encoded to obtain high-dimensional features;
S4: the obtained high-dimensional features are concatenated and passed to the AP module as input features.
3. The graph convolutional neural network module with an attention mechanism for a three-dimensional point cloud scene of claim 1, wherein the AP module comprises: attention weight calculation, an attention weight mask, and an MLP module.
CN202111618088.XA 2021-12-27 2021-12-27 Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene Pending CN114358246A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111618088.XA CN114358246A (en) 2021-12-27 2021-12-27 Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111618088.XA CN114358246A (en) 2021-12-27 2021-12-27 Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene

Publications (1)

Publication Number Publication Date
CN114358246A true CN114358246A (en) 2022-04-15

Family

ID=81103709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111618088.XA Pending CN114358246A (en) 2021-12-27 2021-12-27 Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene

Country Status (1)

Country Link
CN (1) CN114358246A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110969589A (en) * 2019-12-03 2020-04-07 重庆大学 Dynamic scene fuzzy image blind restoration method based on multi-stream attention countermeasure network
CN112257597A (en) * 2020-10-22 2021-01-22 中国人民解放军战略支援部队信息工程大学 Semantic segmentation method of point cloud data
CN113554654A (en) * 2021-06-07 2021-10-26 之江实验室 Point cloud feature extraction model based on graph neural network and classification and segmentation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
WEIPENG JING,ET AL.: "AGNet: An Attention-Based Graph Network for Point Cloud Classification and Segmentation", REMOTE SENSING, 21 February 2022 (2022-02-21) *
XIN WEN,ET AL.: "CF-SIS: Semantic-Instance Segmentation of 3D Point Clouds by Context Fusion with Self-Attention", ACM, 12 October 2020 (2020-10-12) *
ZHUYANG XIE,ET AL.: "Point Clouds Learning with Attention-based Graph Convolution Networks", ARXIV, 31 May 2019 (2019-05-31) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129118A (en) * 2023-01-17 2023-05-16 华北水利水电大学 Urban scene laser LiDAR point cloud semantic segmentation method based on graph convolution
CN116129118B (en) * 2023-01-17 2023-10-20 华北水利水电大学 Urban scene laser LiDAR point cloud semantic segmentation method based on graph convolution
CN116403058A (en) * 2023-06-09 2023-07-07 昆明理工大学 Remote sensing cross-scene multispectral laser radar point cloud classification method
CN116403058B (en) * 2023-06-09 2023-09-12 昆明理工大学 Remote sensing cross-scene multispectral laser radar point cloud classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination