CN112990336A - Deep three-dimensional point cloud classification network construction method based on competitive attention fusion
- Publication number: CN112990336A (application CN202110347537.5A)
- Authority: CN (China)
- Prior art keywords: point cloud, fusion, feature, dimensional, layer
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G06F18/253—Fusion techniques of extracted features
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention discloses a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion. First, the original point cloud is preprocessed to obtain the input point cloud; then high-dimensional features are extracted through two competitive attention fusion feature abstraction layers; finally, the high-dimensional features are sent into a classifier to obtain classification scores. The competitive attention fusion feature abstraction layer first obtains high-dimensional features of the input data through a feature extraction layer, then sends the high-dimensional features together with the original input data into the CAF module for feature fusion, and outputs the fused features as the module output. The core CAF module of the invention focuses on the extraction and fusion of global features at different levels and measures the intrinsic similarity of the features; it can be embedded in different point cloud classification networks, has transferability and extensibility, improves the network's capacity to express global features, and significantly enhances the robustness of the model against noise.
Description
Technical Field
The invention relates to a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion. It belongs to the technical field of three-dimensional point cloud classification in computer vision and is particularly suitable for point cloud classification tasks containing noise interference.
Background
In computer vision applications, the analysis and processing of two-dimensional images sometimes cannot meet the requirements of practical applications. In many application scenarios, three-dimensional point cloud data largely makes up for what two-dimensional images lack in spatial structure information. With the development of deep learning and neural networks, research on three-dimensional point clouds has shifted from low-dimensional geometric features to high-dimensional semantic understanding. Many recent studies adopt learning methods based on deep neural networks, and such methods can be further grouped by how the three-dimensional data is represented: methods based on hand-crafted feature preprocessing, multi-view methods, voxel-based methods, and methods operating on raw point cloud data.
Raw three-dimensional data is simple to express and better preserves the original three-dimensional representation of an object. Using the three-dimensional point cloud as input avoids the adverse factors introduced by feeding regularized data such as multi-view images or voxels into a convolutional network, for example unnecessary volume partitioning and the loss of invariance of the point cloud data. Owing to the influence of acquisition equipment and coordinate systems, the ordering of the acquired three-dimensional point cloud data varies greatly. For the problem of classifying and segmenting unordered point cloud data, the PointNet network creatively proposed to process sparse, unstructured point clouds directly and to obtain global features using a multi-layer perceptron and max pooling. Since then, researchers have proposed many PointNet-based network frameworks such as PointNet++, PCPNet, and SO-Net. In addition, for the classification and segmentation of three-dimensional point cloud data, other studies have proposed well-known network frameworks such as PointCNN, DensePoint, Point2Sequence, A-CNN, and PointWeb, while still other methods adopt graph convolutional networks to learn local graphs or geometric elements. These methods also have problems, such as the lack of explicit local-to-global semantic abstraction, or high complexity.
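For orientation, the order-invariant global feature scheme attributed to PointNet above (a shared multi-layer perceptron applied per point, followed by max pooling) can be sketched in a few lines of TensorFlow 2; the layer widths here are illustrative placeholders, not values from the cited papers:

```python
import tensorflow as tf

def pointnet_global_feature(points, widths=(64, 128, 1024)):
    """Shared per-point MLP + max pooling, the order-invariant
    global-feature scheme described above.
    points: (B, N, 3) tensor; returns (B, widths[-1])."""
    x = points
    for w in widths:
        # A Conv1D with kernel size 1 applies the same MLP weights to every point.
        x = tf.keras.layers.Conv1D(w, kernel_size=1, activation="relu")(x)
    # Max pooling over the point axis is a symmetric function, so the
    # result does not depend on the ordering of the input points.
    return tf.reduce_max(x, axis=1)
```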
Research on deep three-dimensional point cloud classification networks addresses the main difficulties in point cloud feature extraction, with the goals of improving the classification accuracy and efficiency of models and enhancing their robustness. Optimizing feature extraction capability and improving resistance to interference factors such as perturbation, outliers, and random noise are two very important research hotspots in point cloud processing tasks; they are key problems to be solved urgently and have a very important influence on the three-dimensional point cloud classification task and its applications.
Disclosure of Invention
The technical problem is as follows: in order to improve the ability of a deep three-dimensional point cloud classification model to extract and express global features, and to enhance the robustness of the model against noise interference, the invention provides a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion. The core of the method is a CAF module (Competitive Attention Fusion Block, CAF for short) that learns the global representation of multi-level features and the internal similarity of the intermediate features, and redistributes the weights of the intermediate feature channels. The module is independent and transferable, has better global feature extraction capability, focuses on the core backbone features that are more beneficial to three-dimensional point cloud shape classification, and resists the influence of point cloud perturbation, outlier noise, and random noise to a certain extent.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:
a depth three-dimensional point cloud classification network construction method based on competitive attention fusion comprises the following steps:
step 1: preprocessing original point cloud data;
step 2: constructing a CAF module to form a competitive attention fusion feature abstraction layer;
and step 3: stacking two competitive attention fusion feature abstraction layers to construct a deep three-dimensional point cloud classification network;
and 4, step 4: and sending the high-dimensional features finally output by the second layer competitive attention fusion feature abstraction layer to a classifier to obtain a classification result.
Further, the preprocessing of the original point cloud data in the step 1 includes the following steps:
b samples are processed in parallel in batches, N original point cloud data of each sample are preprocessed, and the specific method is that the samples are sampled in a down-sampling mode to obtain N-containing point cloud data0Sampling result P of individual point cloud dataSample。
Further, the step 2 of constructing the competitive attention fusion feature abstraction layer specifically includes the following steps:
the competitive attention fusion feature abstraction layer is composed of a feature extraction layer and a CAF module, and firstly, the feature extraction layer receives input data D from the competitive attention fusion feature abstraction layerinExtracting high-dimensional characteristics F of input data through multilayer convolution networkextTo input data DinAnd high dimensional feature FextThe two are taken as the input of a CAF module, and feature fusion is carried out in the CAF module;
the CAF module comprises an MFSE sub-module (i.e., a Multi-layer Feature Squeeze Excitation sub-module, namely a Multi-layer Feature Squeeze and Excitation Block for short) and a FICSA sub-module (i.e., a Feature intrinsic Self-Attention sub-module, namely a Feature intrinsic Connection Self-Attention Block for short), wherein:
The MFSE sub-module focuses on the extraction and fusion of global features at different levels. It takes the CAF module's input data $D_{in} \in \mathbb{R}^{N_1 \times C_1}$ and high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$ and performs pooling and encoding operations on each separately, where $\mathbb{R}$ is the set of real numbers, $\mathbb{R}^{N_i \times C_i}$ denotes a two-dimensional real matrix of dimension $N_i \times C_i$, $N_i$ is the number of points of the sample at the current stage, $C_i$ is the number of feature channels of the sample at the current stage, and $i$ indexes the 5 stages with different matrix dimensions. This yields the encoded features $F_{MFSE\text{-}in} \in \mathbb{R}^{1 \times C_1/r}$ (where $N_3 = 1$ is the point number of $F_{MFSE\text{-}in}$ and $C_3 = C_1/r$ its channel number) and $F_{MFSE\text{-}ext} \in \mathbb{R}^{1 \times C_2/r}$ (where $N_4 = 1$ is the point number of $F_{MFSE\text{-}ext}$ and $C_4 = C_2/r$ its channel number), as follows:

$$F_{MFSE\text{-}in} = \phi(P(D_{in})), \quad F_{MFSE\text{-}ext} = \phi(P(F_{ext})) \quad (1)$$

where $P(\cdot)$ is the max pooling function for global feature aggregation, $\phi(\cdot)$ denotes the fully connected layer and ReLU activation function, and the channel scaling ratio $r$ is used to adjust the number of intermediate channels;
Then the two encoded features are concatenated along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat} \in \mathbb{R}^{1 \times (C_1+C_2)/r}$ (where $N_5 = 1$ is the point number of $F_{MFSE\text{-}Concat}$ and $C_5 = (C_1+C_2)/r$ its channel number):

$$F_{MFSE\text{-}Concat} = \mathrm{Concat}(F_{MFSE\text{-}in}, F_{MFSE\text{-}ext}) \quad (2)$$
Then the channel number and feature-map size of the stacking result are expanded through a fully connected layer to the same dimension as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE sub-module:

$$F_{MFSE} = \phi_{\sigma}(F_{MFSE\text{-}Concat}) \quad (3)$$

where $\phi_{\sigma}(\cdot)$ is the fully connected layer expansion procedure with the Sigmoid normalization function, and $F_{MFSE} \in \mathbb{R}^{N_2 \times C_2}$ is the global attention weight finally obtained by the MFSE sub-module;
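A minimal TensorFlow sketch of the MFSE sub-module following formulas (1) to (3); using Dense layers for $\phi$ and $\phi_{\sigma}$ and broadcasting the expanded weight over the points of $F_{ext}$ are implementation assumptions:

```python
import tensorflow as tf

class MFSE(tf.keras.layers.Layer):
    """MFSE sketch: squeeze D_in and F_ext by global max pooling plus
    FC+ReLU encoding with channel ratio r (formula (1)), concatenate
    along the channel direction (formula (2)), and expand with an
    FC+Sigmoid layer to the dimension of F_ext (formula (3))."""

    def __init__(self, c1, c2, r=4):
        super().__init__()
        # max(.., 1) guards the first layer, where D_in has only 3 channels.
        self.enc_in = tf.keras.layers.Dense(max(c1 // r, 1), activation="relu")
        self.enc_ext = tf.keras.layers.Dense(max(c2 // r, 1), activation="relu")
        self.expand = tf.keras.layers.Dense(c2, activation="sigmoid")  # phi_sigma

    def call(self, d_in, f_ext):
        f_in = self.enc_in(tf.reduce_max(d_in, axis=1))    # (B, C1/r), formula (1)
        f_ex = self.enc_ext(tf.reduce_max(f_ext, axis=1))  # (B, C2/r), formula (1)
        f_cat = tf.concat([f_in, f_ex], axis=-1)           # (B, (C1+C2)/r), formula (2)
        w = self.expand(f_cat)[:, tf.newaxis, :]           # (B, 1, C2), formula (3)
        return tf.broadcast_to(w, tf.shape(f_ext))         # global attention weight F_MFSE
```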
The FICSA sub-module aims at measuring the intrinsic similarity of the features. It applies a 1 × 1 point-wise convolution to the high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$ input to the CAF module, linearly mapping the features of all channels of each point into three parallel high-dimensional features:

$$V(F_{ext}) = w_v F_{ext}, \quad Q(F_{ext}) = w_q F_{ext}, \quad K(F_{ext}) = w_k F_{ext} \quad (4)$$
where $V(\cdot)$, $Q(\cdot)$ and $K(\cdot)$ are three independent feature mapping functions yielding three corresponding high-level features, each of dimension $N_2 \times C_2$, and the $w_i$ are their different linear transformation coefficients. Subsequently, similarity is computed, and the correlation between $Q(\cdot)$ and $K(\cdot)$ is obtained by a dot-product operation:

$$A(F_{ext}) = Q(F_{ext})\,K(F_{ext})^{T} \quad (5)$$

where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is a Softmax normalization function with an aggregation role, and a selectable channel scaling coefficient is set to reduce the number of training parameters. Finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:

$$F_{FICSA} = \gamma(A(F_{ext})\,V(F_{ext})) \quad (6)$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA} \in \mathbb{R}^{N_2 \times C_2}$ of the FICSA sub-module.
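A minimal TensorFlow sketch of the FICSA sub-module following formulas (4) to (6); the channel reduction factor s on Q and K and the Softmax axis are assumptions, since the text leaves both open:

```python
import tensorflow as tf

class FICSA(tf.keras.layers.Layer):
    """FICSA sketch: 1x1 point-wise convolutions V, Q, K (formula (4)),
    dot-product relation A = Q K^T (formula (5)), and Softmax-normalized
    aggregation F_FICSA = gamma(A V) (formula (6))."""

    def __init__(self, c2, s=4):
        super().__init__()
        self.v = tf.keras.layers.Conv1D(c2, 1)               # V(.)
        self.q = tf.keras.layers.Conv1D(max(c2 // s, 1), 1)  # Q(.), reduced channels
        self.k = tf.keras.layers.Conv1D(max(c2 // s, 1), 1)  # K(.), reduced channels

    def call(self, f_ext):
        q, k, v = self.q(f_ext), self.k(f_ext), self.v(f_ext)
        a = tf.matmul(q, k, transpose_b=True)           # (B, N2, N2), formula (5)
        # gamma: Softmax normalization; the normalization axis is an assumption.
        return tf.nn.softmax(tf.matmul(a, v), axis=-1)  # (B, N2, C2), formula (6)
```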
Finally, the CAF module performs competitive weight fusion of the MFSE sub-module output $F_{MFSE}$ and the FICSA sub-module output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = \alpha F_{MFSE} + \beta F_{FICSA} \quad (7)$$

Through matrix addition, after the global attention weights are fused according to the different proportionality coefficients $\alpha$ and $\beta$, the final weight distribution coefficient $F_{CAF} \in \mathbb{R}^{N_2 \times C_2}$ is obtained. The output of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \quad (8)$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer.
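Putting the two sub-modules together, a sketch of the CAF module and formulas (7) to (8), reusing the MFSE and FICSA sketches above (α and β default to 1, as in the embodiment below):

```python
import tensorflow as tf

class CAF(tf.keras.layers.Layer):
    """CAF sketch: competitive fusion of the two attention weights by
    matrix addition (formula (7)), followed by channel re-weighting of
    F_ext with a residual connection (formula (8))."""

    def __init__(self, c1, c2, r=4, s=4, alpha=1.0, beta=1.0):
        super().__init__()
        self.mfse = MFSE(c1, c2, r)   # MFSE sketch above
        self.ficsa = FICSA(c2, s)     # FICSA sketch above
        self.alpha, self.beta = alpha, beta

    def call(self, d_in, f_ext):
        # (7): competitive weight fusion of the two global attention weights.
        f_caf = self.alpha * self.mfse(d_in, f_ext) + self.beta * self.ficsa(f_ext)
        # (8): weight redistribution with a residual connection -> F_Fusion.
        return f_ext + f_caf * f_ext
```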
Further, two competitive attention fusion feature abstraction layers are stacked in the step 3, and the constructing of the deep three-dimensional point cloud classification network specifically includes the following steps:
the sampling result P in the step 1 is processedSampleSending the data as input into a first layer competitive attention fusion feature abstract layer to obtain a fused feature FFusion-Mid(ii) a The fused feature FFusion-MidThen as input, sending the data into a second layer competitive attention fusion feature abstract layer to obtain the final fusion feature FFusion-Final。
Further, the step 4 of sending the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer to the classifier includes the following steps:
After the second competitive attention fusion feature abstraction layer, a multi-layer perceptron (MLP) is introduced as the classifier, and classification learning is performed on the fused point cloud features to obtain classification scores.
Beneficial effects: the invention provides a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion, whose core is the CAF (Competitive Attention Fusion) module, a transferable intermediate feature-channel optimization structure; it introduces residual connection and channel competitiveness and redistributes the weights of the feature channels by learning with two kinds of attention at its core. The CAF module contains two sub-modules: 1) the MFSE sub-module, which focuses on the extraction and fusion of global features at different levels; 2) the FICSA sub-module, which measures the intrinsic similarity of the intermediate features. The CAF module can be embedded in different point cloud classification networks, has transferability and extensibility, improves the network's capacity to express global features, and strengthens the robustness of the model against noise interference.
A point cloud feature extraction network adopts two or more intermediate feature abstraction layers, and the intermediate features are usually a combination of global and local features, which to a great extent influences the accuracy of the classification results. The CAF module provided by the invention learns a fusion weight from the intermediate output features of the two layers; this weight represents the importance and expressiveness of the current layer's intermediate feature channels, and the new, optimized intermediate features are obtained by redistributing the channel features with this weight. In brief, the CAF module uses the central idea of the attention mechanism to aggregate salient features, excite the channel features that are more important and have a larger influence on the result, and suppress invalid or weakly informative channel features, thereby reducing noise interference and improving the robustness of the model.
Noise interference in real point clouds includes perturbations and outliers, typically manifested as positional offsets of part of a sample's point set together with background noise. When testing the model, the set of noise points is also considered part of the sample and thus affects the sample's classification result. The role of the CAF module in the network is to make the model focus more on the core features that determine the sample type by adjusting the weights of the intermediate feature channels. The two sub-modules learn from two different angles, associated with the global features and the multi-level intermediate features, to obtain weights that better focus the core channels, thereby improving the network's ability to learn global features, enhancing the anti-interference capability of the model, and helping to solve this difficult problem in point cloud deep networks.
Drawings
FIG. 1 is a flow chart of a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion;
FIG. 2 is a schematic diagram of a competitive attention fusion feature abstraction layer provided by the present invention;
FIG. 3 is a schematic diagram of an MFSE sub-module in a CAF module provided by the present invention;
FIG. 4 is a schematic diagram of the FICSA sub-module in the CAF module provided by the present invention;
FIG. 5(a) is the anti-interference performance of the CAF module against point cloud perturbation (Gaussian noise);
FIG. 5(b) is the anti-interference performance of the CAF module against outliers (random noise);
FIG. 6(a) is the effect of the CAF module on model robustness on PointNet++;
FIG. 6(b) is the effect of the CAF module on model robustness on PointASNL;
FIG. 6(c) is the anti-interference capability of the CAF module on PointASNL under extreme noise.
Detailed Description
The invention is further elucidated with reference to the drawings and the embodiments.
Under the Ubuntu operating system, TensorFlow is selected as the platform, a deep three-dimensional point cloud classification network based on competitive attention fusion is built, and the effectiveness of the CAF module is verified on the classical benchmark network PointNet++ and on PointASNL, a network with excellent performance in recent years. The results show that after the CAF module is added, the network's resistance to point cloud noise is significantly enhanced while the average accuracy of the classification results does not decrease. By adjusting the number of input points of the training samples, the robustness of the model can be further improved while the classification accuracy remains stable.
A deep three-dimensional point cloud classification network construction method based on competitive attention fusion is disclosed; the network framework is shown in FIG. 1. The structure of the competitive attention fusion feature abstraction layer is shown in FIG. 2. FIG. 3 is a schematic diagram of the MFSE sub-module in the CAF module provided by the present invention. FIG. 4 is a schematic diagram of the FICSA sub-module in the CAF module provided by the present invention.
The method specifically comprises the following steps:
Step 1: the original point cloud data are preprocessed. B = 24 samples are processed in parallel per batch, the N = 10000 original points of each sample are preprocessed, and each sample is down-sampled to obtain a sampling result $P_{Sample}$ containing $N_0 = 1024$ points.
Step 2: a CAF module is constructed to form the competitive attention fusion feature abstraction layer. The two competitive attention fusion feature abstraction layers, Layer_1 and Layer_2, each consist of two parts. First, the feature extraction layer receives the input data $D_{in}$ of the competitive attention fusion feature abstraction layer (in Layer_1 this is the sampling result $P_{Sample}$; in Layer_2 it is the final output of Layer_1) and extracts the high-dimensional features $F_{ext}$ of the input data through a multi-layer convolutional network; then both the input data $D_{in}$ and the high-dimensional features $F_{ext}$ are taken as the input of the CAF module, in which feature fusion is performed;
the CAF module comprises an MFSE sub-module and a FICSA sub-module:
The MFSE sub-module focuses on the extraction and fusion of global features at different levels. It performs pooling and encoding operations separately on the CAF module's input data $D_{in}$ and high-dimensional features $F_{ext}$ of the respective layer, obtaining the encoded features $F_{MFSE\text{-}in}$ and $F_{MFSE\text{-}ext}$ according to formula (1), where $P(\cdot)$ is the max pooling function for global feature aggregation, $\phi(\cdot)$ denotes the fully connected layer and ReLU activation function, and the channel scaling ratio $r = 4$ is used to adjust the number of intermediate channels;
Then the two encoded features are concatenated along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat}$ according to formula (2);
Then the channel number and feature-map size of the stacking result are expanded through a fully connected layer to the same dimension as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE sub-module according to formula (3), where $\phi_{\sigma}(\cdot)$ is the fully connected layer expansion procedure with the Sigmoid normalization function and $F_{MFSE}$ is the global attention weight finally obtained by the MFSE sub-module;
The FICSA sub-module aims at measuring the intrinsic similarity of the features. It applies a 1 × 1 point-wise convolution to the high-dimensional features $F_{ext}$ input to the CAF module, linearly mapping the features of all channels of each point into three parallel high-dimensional features according to formula (4), where $V(\cdot)$, $Q(\cdot)$ and $K(\cdot)$ are three independent feature mapping functions yielding three corresponding high-level features and the linear transformation coefficients $w_i$ are learned automatically during training. Subsequently, similarity is computed, and the correlation between $Q(\cdot)$ and $K(\cdot)$ is obtained by a dot-product operation according to formula (5), where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is a Softmax normalization function with an aggregation role, and a selectable channel scaling coefficient is set to reduce the number of training parameters. Finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:

$$F_{FICSA} = \gamma(A(F_{ext})\,V(F_{ext})) \quad (6)$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA}$ of the FICSA sub-module.
Finally, the CAF module performs competitive weight fusion of the MFSE sub-module output $F_{MFSE}$ and the FICSA sub-module output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = F_{MFSE} + F_{FICSA} \quad (7)$$

Through matrix addition, after the global attention weights are fused with the proportionality coefficients $\alpha = 1$ and $\beta = 1$, the final weight distribution coefficient $F_{CAF}$ is obtained. The output of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \quad (8)$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer.
Step 3: the two competitive attention fusion feature abstraction layers, Layer_1 and Layer_2, are stacked to construct the deep three-dimensional point cloud classification network. The sampling result $P_{Sample}$ from step 1 is sent as input into the first competitive attention fusion feature abstraction layer, Layer_1, to obtain the fused features $F_{Fusion\text{-}Mid}$; the fused features $F_{Fusion\text{-}Mid}$ are then sent as input into the second competitive attention fusion feature abstraction layer, Layer_2, to obtain the final fused features $F_{Fusion\text{-}Final}$.
Step 4: the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer, Layer_2, are sent to a classifier to obtain the classification result. After Layer_2, a multi-layer perceptron (MLP) is introduced as the classifier, with MLP output channel parameters [256, 512, 1024, 512, 256, 40], and classification learning is performed on the fused point cloud features to obtain classification scores.
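An end-to-end sketch of this embodiment, reusing the CAF sketch above; the feature-extraction widths of the two layers are placeholders, since the per-layer dimensions were rendered as images and are not recoverable from this text:

```python
import tensorflow as tf

def build_caf_network(n0=1024, num_classes=40):
    """Embodiment sketch: two stacked competitive attention fusion
    feature abstraction layers, then the MLP classifier with output
    channels [256, 512, 1024, 512, 256, 40]; intended to be trained
    with batches of B = 24 samples of n0 = 1024 points each."""
    points = tf.keras.Input(shape=(n0, 3))  # P_Sample

    def abstraction_layer(d_in, widths):
        x = d_in
        for w in widths:  # feature extraction: multi-layer point-wise conv net
            x = tf.keras.layers.Conv1D(w, 1, activation="relu")(x)
        # CAF fusion with r = 4 and alpha = beta = 1, as in the embodiment.
        return CAF(d_in.shape[-1], widths[-1], r=4)(d_in, x)  # F_Fusion

    f_mid = abstraction_layer(points, widths=(64, 128))    # Layer_1 -> F_Fusion-Mid
    f_final = abstraction_layer(f_mid, widths=(128, 256))  # Layer_2 -> F_Fusion-Final

    x = tf.keras.layers.GlobalMaxPooling1D()(f_final)
    for w in (256, 512, 1024, 512, 256):
        x = tf.keras.layers.Dense(w, activation="relu")(x)
    scores = tf.keras.layers.Dense(num_classes)(x)          # 40-way classification scores
    return tf.keras.Model(points, scores)
```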
The experimental results are specifically as follows:
Experiment 1: shape classification. With the CAF module added to PointNet++, the best classification accuracy of our PointNet++ reproduction is 90.7%, and the average test accuracy reaches 91.0% after the CAF module is added; this result demonstrates the effectiveness and feasibility of the CAF module in maintaining and improving classification accuracy. With the CAF module added to PointASNL, when only coordinate points are input, the classification accuracy reaches 92.9% (92.88%), not lower than the 92.9% of PointASNL (actual best test accuracy 92.85%); when normal vectors are added during training and testing, the classification accuracy reaches 93.2% (93.19%), not lower than the 93.2% of PointASNL (actual best test accuracy 93.15%). These experimental results demonstrate the independence and transferability of the CAF module and its help in maintaining classification accuracy.
Experiment 2: robustness analysis.
Gaussian noise following a standard normal distribution is added to the point cloud to simulate perturbation, and random noise in the range [-1.0, 1.0] is added to the point cloud to simulate outliers. Using PointASNL as the baseline network (Base), the ability of the CAF module to resist perturbation (Gauss) and outliers (Random) is tested. The results are shown in FIG. 5: after the CAF module is added, the model's resistance to both noise types, point cloud perturbation and outliers, is significantly improved.
A certain number of real points are replaced with random noise in the range [-1.0, 1.0] to simulate simultaneous data loss and noise interference, with the number of noise points taken from [0, 1, 10, 50, 100]. FIG. 6(a) shows the classification accuracy of the PointNet++ network with the CAF module and of the original network on test sets with data loss and random noise: as the amount of noise increases, the classification accuracy of the network with the CAF module declines more slowly, and the robustness of the model is significantly improved. FIG. 6(b) shows the classification accuracy of the PointASNL network with the CAF module on test sets with data loss and random noise: for a model trained with 1024 input points, under the same conditions the anti-interference capability of the network improves after the CAF module is added, for different amounts of data loss and random noise; increasing the number of input points to 2048 and 3000 yields even better anti-interference capability while keeping classification performance stable. FIG. 6(c) shows the anti-interference capability of the CAF module on PointASNL under extreme noise.
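For reproducibility, the two noise models described above can be sketched as follows; the helper names and the choice of which real points get replaced are illustrative assumptions, not taken from the patent:

```python
import tensorflow as tf

def perturb_gaussian(points, stddev=1.0, seed=None):
    """Perturbation test: add Gaussian noise drawn from a standard
    normal distribution (stddev=1.0) to every point coordinate."""
    return points + tf.random.normal(tf.shape(points), stddev=stddev, seed=seed)

def replace_with_outliers(points, num_noise, seed=None):
    """Outlier test: replace num_noise real points per sample with random
    noise uniform in [-1.0, 1.0], simulating simultaneous data loss and
    noise interference; num_noise is taken from [0, 1, 10, 50, 100].
    Dropping the trailing points is an implementation assumption."""
    b, n, c = points.shape  # static shapes assumed, e.g. (24, 1024, 3)
    noise = tf.random.uniform((b, num_noise, c), -1.0, 1.0, seed=seed)
    return tf.concat([points[:, : n - num_noise, :], noise], axis=1)
```

The noisy set is then fed to the trained classifier exactly like the clean test set, so the drop in accuracy directly measures robustness.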
It should be noted that the above embodiments are only examples for clearly illustrating the present invention and are not limitations on its implementation; not all embodiments can be listed exhaustively here. Everything not specified in this embodiment can be realized with the prior art. It will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention, and these are intended to fall within the scope of the invention.
Claims (5)
1. A deep three-dimensional point cloud classification network construction method based on competitive attention fusion, characterized by comprising the following steps:
Step 1: preprocessing original point cloud data;
Step 2: constructing a CAF module to form a competitive attention fusion feature abstraction layer;
Step 3: stacking two competitive attention fusion feature abstraction layers to construct a deep three-dimensional point cloud classification network;
Step 4: sending the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer to a classifier to obtain a classification result.
2. The competitive attention fusion-based deep three-dimensional point cloud classification network construction method according to claim 1, characterized in that: in the process of preprocessing the original point cloud data in step 1, B samples are processed in parallel per batch, and the N original points of each sample are preprocessed; specifically, each sample is down-sampled to obtain a sampling result $P_{Sample}$ containing $N_0$ points.
3. The competitive attention fusion-based deep three-dimensional point cloud classification network construction method according to claim 1, characterized in that: the competitive attention fusion feature abstraction layer in step 2 consists of a feature extraction layer and a CAF module. The feature extraction layer receives the input data $D_{in}$ of the competitive attention fusion feature abstraction layer and extracts the high-dimensional features $F_{ext}$ of the input data through a multi-layer convolutional network; then both the input data $D_{in}$ and the high-dimensional features $F_{ext}$ are taken as the input of the CAF module, in which feature fusion is performed;
the CAF module comprises an MFSE sub-module and a FICSA sub-module:
The MFSE sub-module focuses on the extraction and fusion of global features at different levels. It takes the CAF module's input data $D_{in} \in \mathbb{R}^{N_1 \times C_1}$ and high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$ and performs pooling and encoding operations on each separately, where $\mathbb{R}$ is the set of real numbers, $\mathbb{R}^{N_i \times C_i}$ denotes a two-dimensional real matrix of dimension $N_i \times C_i$, $N_i$ is the number of points of the sample at the current stage, $C_i$ is the number of feature channels of the sample at the current stage, and $i$ indexes the 5 stages with different matrix dimensions. This yields the encoded features $F_{MFSE\text{-}in} \in \mathbb{R}^{1 \times C_1/r}$ (where $N_3 = 1$ is the point number of $F_{MFSE\text{-}in}$ and $C_3 = C_1/r$ its channel number) and $F_{MFSE\text{-}ext} \in \mathbb{R}^{1 \times C_2/r}$ (where $N_4 = 1$ is the point number of $F_{MFSE\text{-}ext}$ and $C_4 = C_2/r$ its channel number), as follows:

$$F_{MFSE\text{-}in} = \phi(P(D_{in})), \quad F_{MFSE\text{-}ext} = \phi(P(F_{ext})) \quad (1)$$

where $P(\cdot)$ is the max pooling function for global feature aggregation, $\phi(\cdot)$ denotes the fully connected layer and ReLU activation function, and the channel scaling ratio $r$ is used to adjust the number of intermediate channels;
Then the two encoded features are concatenated along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat} \in \mathbb{R}^{1 \times (C_1+C_2)/r}$ (where $N_5 = 1$ is the point number of $F_{MFSE\text{-}Concat}$ and $C_5 = (C_1+C_2)/r$ its channel number):

$$F_{MFSE\text{-}Concat} = \mathrm{Concat}(F_{MFSE\text{-}in}, F_{MFSE\text{-}ext}) \quad (2)$$
Then the channel number and feature-map size of the stacking result are expanded through a fully connected layer to the same dimension as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE sub-module:

$$F_{MFSE} = \phi_{\sigma}(F_{MFSE\text{-}Concat}) \quad (3)$$

where $\phi_{\sigma}(\cdot)$ is the fully connected layer expansion procedure with the Sigmoid normalization function, and $F_{MFSE} \in \mathbb{R}^{N_2 \times C_2}$ is the global attention weight finally obtained by the MFSE sub-module;
The FICSA sub-module aims at measuring the intrinsic similarity of the features. It applies a 1 × 1 point-wise convolution to the high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$ input to the CAF module, linearly mapping the features of all channels of each point into three parallel high-dimensional features:

$$V(F_{ext}) = w_v F_{ext}, \quad Q(F_{ext}) = w_q F_{ext}, \quad K(F_{ext}) = w_k F_{ext} \quad (4)$$
where $V(\cdot)$, $Q(\cdot)$ and $K(\cdot)$ are three independent feature mapping functions yielding three corresponding high-level features, each of dimension $N_2 \times C_2$, and the $w_i$ are their different linear transformation coefficients. Subsequently, similarity is computed, and the correlation between $Q(\cdot)$ and $K(\cdot)$ is obtained by a dot-product operation:

$$A(F_{ext}) = Q(F_{ext})\,K(F_{ext})^{T} \quad (5)$$

where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is a Softmax normalization function with an aggregation role, and a selectable channel scaling coefficient is set to reduce the number of training parameters. Finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:

$$F_{FICSA} = \gamma(A(F_{ext})\,V(F_{ext})) \quad (6)$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA} \in \mathbb{R}^{N_2 \times C_2}$ of the FICSA sub-module.
Finally, the CAF module performs competitive weight fusion of the MFSE sub-module output $F_{MFSE}$ and the FICSA sub-module output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = \alpha F_{MFSE} + \beta F_{FICSA} \quad (7)$$

Through matrix addition, after the global attention weights are fused according to the different proportionality coefficients $\alpha$ and $\beta$, the final weight distribution coefficient $F_{CAF} \in \mathbb{R}^{N_2 \times C_2}$ is obtained. The output of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \quad (8)$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer.
4. The competitive attention fusion-based deep three-dimensional point cloud classification network construction method according to claim 1, characterized in that: the specific method for stacking the two competitive attention fusion feature abstraction layers in step 3 is as follows: the sampling result $P_{Sample}$ from step 1 is sent as input into the first competitive attention fusion feature abstraction layer to obtain the fused features $F_{Fusion\text{-}Mid}$; the fused features $F_{Fusion\text{-}Mid}$ are then sent as input into the second competitive attention fusion feature abstraction layer to obtain the final fused features $F_{Fusion\text{-}Final}$.
5. The competitive attention fusion-based deep three-dimensional point cloud classification network construction method according to claim 1, characterized in that: the specific method for sending the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer to the classifier in step 4 is to introduce a multi-layer perceptron (MLP) as the classifier after the second competitive attention fusion feature abstraction layer and to perform classification learning on the fused point cloud features to obtain the classification scores.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110347537.5A | 2021-03-31 | 2021-03-31 | Deep three-dimensional point cloud classification network construction method based on competitive attention fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112990336A (en) | 2021-06-18
CN112990336B (en) | 2024-03-26
Family
ID=76339112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110347537.5A | Deep three-dimensional point cloud classification network construction method based on competitive attention fusion | 2021-03-31 | 2021-03-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112990336B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111242208A (en) * | 2020-01-08 | 2020-06-05 | Shenzhen University | Point cloud classification method, point cloud segmentation method and related equipment |
CN112085123A (en) * | 2020-09-25 | 2020-12-15 | North Minzu University | Point cloud data classification and segmentation method based on salient point sampling |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117788962A (en) * | 2024-02-27 | 2024-03-29 | Nanjing University of Information Science and Technology | Extensible point cloud target identification method and system based on continuous learning |
CN117788962B (en) * | 2024-02-27 | 2024-05-10 | Nanjing University of Information Science and Technology | Extensible point cloud target identification method based on continuous learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |