CN112990336A - Deep three-dimensional point cloud classification network construction method based on competitive attention fusion - Google Patents

Deep three-dimensional point cloud classification network construction method based on competitive attention fusion

Info

Publication number
CN112990336A
CN112990336A (application CN202110347537.5A)
Authority
CN
China
Prior art keywords
point cloud
fusion
feature
dimensional
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110347537.5A
Other languages
Chinese (zh)
Other versions
CN112990336B (en)
Inventor
达飞鹏
陈涵娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202110347537.5A priority Critical patent/CN112990336B/en
Publication of CN112990336A publication Critical patent/CN112990336A/en
Application granted granted Critical
Publication of CN112990336B publication Critical patent/CN112990336B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion. First, the original point cloud is preprocessed to obtain the input point cloud; high-dimensional features are then extracted through two competitive attention fusion feature abstraction layers and finally fed into a classifier to obtain classification scores. Each competitive attention fusion feature abstraction layer first obtains high-dimensional features of its input data through a feature extraction layer, then feeds the high-dimensional features together with the original input data into the CAF module for feature fusion, and takes the fused features as the module output. The core CAF module of the invention focuses on extracting and fusing global features of different levels and on measuring the intrinsic similarity of features; it can be embedded into different point cloud classification networks, offering portability and extensibility, improving the network's ability to express global features, and noticeably enhancing the model's robustness against noise.

Description

Deep three-dimensional point cloud classification network construction method based on competitive attention fusion
Technical Field
The invention relates to a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion, belongs to the technical field of three-dimensional point cloud classification in computer vision, and is particularly suitable for point cloud classification tasks subject to noise interference.
Background
In computer vision applications, the analysis and processing of two-dimensional images sometimes fails to meet the requirements of practical applications. In many application scenarios, three-dimensional point cloud data largely compensates for the spatial structure information that two-dimensional images lack. With the development of deep learning and neural networks, research on three-dimensional point clouds has shifted from low-dimensional geometric features to high-dimensional semantic understanding. Many recent studies adopt learning methods based on deep neural networks, and such methods can be further classified according to how the three-dimensional data is represented: methods based on manual feature preprocessing, multi-view methods, voxel-based methods, and methods operating on raw point cloud data.
Raw three-dimensional data is simple to express and better preserves the original three-dimensional representation of an object. Using three-dimensional point clouds as input avoids the adverse factors introduced by feeding regular data such as multi-views and voxels into a convolutional network, for example unnecessary volume partitioning and the loss of invariance of the point cloud data. Owing to the acquisition equipment and the coordinate system, the ordering of acquired three-dimensional point cloud data varies greatly. For the problem of classifying and segmenting unordered point cloud data, the PointNet network creatively proposed to process sparse, unstructured point clouds directly, obtaining global features with a multilayer perceptron and max pooling. Since then, researchers have proposed many PointNet-based network frameworks such as PointNet++, PCPNet and SO-Net. In addition, for the classification and segmentation of three-dimensional point cloud data, other studies have proposed well-known network frameworks such as PointCNN, DensePoint, Point2Sequence, A-CNN and PointWeb, and other methods adopt graph convolution networks to learn local graphs or geometric elements. These methods still have problems, however, such as the lack of explicit local-to-global semantic abstraction, or high complexity.
Research on deep three-dimensional point cloud classification networks addresses the main difficulties in point cloud feature extraction, aiming to improve the classification accuracy and efficiency of models and to enhance their robustness. Optimizing feature extraction capability and improving resistance to interference factors such as perturbation, outliers and random noise are two very important research hotspots in point cloud processing tasks; they are key problems to be solved urgently and have a very important influence on the three-dimensional point cloud classification task and its applications.
Disclosure of Invention
The technical problem is as follows: in order to improve the extraction and expression capability of a three-dimensional point cloud deep network classification model for global features and to enhance the model's robustness to noise interference, the invention provides a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion. The core technology of the method is a proposed CAF module (Competitive Attention Fusion Block) that learns the global representation of multi-level features and the intrinsic similarity of intermediate features and redistributes the weights of the intermediate feature channels. The module is independent and transferable, has better global feature extraction capability, focuses on the core backbone features that are more beneficial to three-dimensional point cloud shape classification, and resists the influence of point cloud perturbation, outlier noise and random noise to a certain extent.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:
a depth three-dimensional point cloud classification network construction method based on competitive attention fusion comprises the following steps:
Step 1: preprocessing the original point cloud data;
Step 2: constructing a CAF module to form a competitive attention fusion feature abstraction layer;
Step 3: stacking two competitive attention fusion feature abstraction layers to construct the deep three-dimensional point cloud classification network;
Step 4: feeding the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer into a classifier to obtain the classification result.
Further, the preprocessing of the original point cloud data in step 1 comprises the following steps:
B samples are processed in parallel in batches, and the N raw points of each sample are preprocessed; specifically, each sample is down-sampled to obtain a sampling result $P_{Sample}$ containing $N_0$ points.
Further, the construction of the competitive attention fusion feature abstraction layer in step 2 specifically comprises the following steps:
The competitive attention fusion feature abstraction layer consists of a feature extraction layer and a CAF module. First, the feature extraction layer receives the input data $D_{in}$ of the competitive attention fusion feature abstraction layer and extracts high-dimensional features $F_{ext}$ of the input data through a multilayer convolution network; the input data $D_{in}$ and the high-dimensional features $F_{ext}$ are then taken together as the input of the CAF module, in which feature fusion is performed;
the CAF module comprises an MFSE sub-module (i.e., a Multi-layer Feature Squeeze Excitation sub-module, namely a Multi-layer Feature Squeeze and Excitation Block for short) and a FICSA sub-module (i.e., a Feature intrinsic Self-Attention sub-module, namely a Feature intrinsic Connection Self-Attention Block for short), wherein:
The MFSE submodule focuses on the extraction and fusion of global features of different levels. The MFSE submodule performs pooling and encoding operations separately on the CAF module's input data $D_{in} \in \mathbb{R}^{N_1 \times C_1}$ and high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$, where $\mathbb{R}$ is the set of real numbers, $\mathbb{R}^{N_i \times C_i}$ denotes a two-dimensional matrix of dimension $N_i \times C_i$ over the reals, $N_i$ is the number of points of the sample at the current stage, $C_i$ is the number of feature channels of the sample at the current stage, and $i$ is the serial number of the 5 stages with different matrix dimensions. The encoded features $F_{MFSE\text{-}in} \in \mathbb{R}^{N_3 \times C_3}$ (where $N_3 = 1$ is the number of points of $F_{MFSE\text{-}in}$ and $C_3 = C_1/r$ is its number of feature channels) and $F_{MFSE\text{-}ext} \in \mathbb{R}^{N_4 \times C_4}$ (where $N_4 = 1$ is the number of points of $F_{MFSE\text{-}ext}$ and $C_4 = C_2/r$ is its number of feature channels) are obtained as follows:

$$F_{MFSE\text{-}in} = \phi(P(D_{in})), \qquad F_{MFSE\text{-}ext} = \phi(P(F_{ext})) \tag{1}$$

where $P(\cdot)$ is the global max pooling function, $\phi(\cdot)$ is a fully connected layer with ReLU activation, and the channel scaling ratio $r$ is used to adjust the number of intermediate channels;
Then, the two encoded features are stacked along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat} \in \mathbb{R}^{N_5 \times C_5}$, where $N_5 = 1$ is the number of points of $F_{MFSE\text{-}Concat}$ and $C_5 = (C_1 + C_2)/r$ is its number of feature channels:

$$F_{MFSE\text{-}Concat} = \mathrm{Concat}(F_{MFSE\text{-}in},\, F_{MFSE\text{-}ext}) \tag{2}$$
Then, the number of channels and the feature map size of the stacking result are expanded through a fully connected layer to the same dimensions as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE submodule:

$$F_{MFSE} = \tilde{\phi}(F_{MFSE\text{-}Concat}) \tag{3}$$

where $\tilde{\phi}(\cdot)$ is the fully connected layer expansion with the Sigmoid normalization function, and $F_{MFSE} \in \mathbb{R}^{N_2 \times C_2}$ is the global attention weight finally obtained by the MFSE submodule;
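For illustration, a minimal TensorFlow sketch of equations (1)-(3) follows; the class name MFSE and the exact layer configuration are assumptions consistent with the description above rather than the patent's actual implementation:

```python
import tensorflow as tf

class MFSE(tf.keras.layers.Layer):
    """Sketch of eqs. (1)-(3): squeeze-and-excitation over two feature levels."""
    def __init__(self, c1: int, c2: int, r: int = 4):
        super().__init__()
        self.fc_in  = tf.keras.layers.Dense(c1 // r, activation="relu")  # phi for D_in
        self.fc_ext = tf.keras.layers.Dense(c2 // r, activation="relu")  # phi for F_ext
        self.expand = tf.keras.layers.Dense(c2, activation="sigmoid")    # expansion with Sigmoid

    def call(self, d_in, f_ext):
        # P(.): global max pooling over the point dimension, then encode, eq. (1)
        f_mfse_in  = self.fc_in(tf.reduce_max(d_in, axis=1))    # (B, C1/r)
        f_mfse_ext = self.fc_ext(tf.reduce_max(f_ext, axis=1))  # (B, C2/r)
        concat = tf.concat([f_mfse_in, f_mfse_ext], axis=-1)    # eq. (2): channel stacking
        return self.expand(concat)  # eq. (3): (B, C2) global attention weight F_MFSE
```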
the FICSA sub-module aims at measuring the intrinsic similarity of the features, and the FICSA sub-module inputs the high-dimensional features of the CAF module
Figure BDA0003001261620000036
Performing 1 × 1 point-to-point convolution operation, and linearly mapping the features of all channels of each point to three parallel high-dimensional features, wherein the formula is as follows:
Figure BDA0003001261620000037
wherein V (·), Q (·) and K (·) are three independent feature mapping functions respectively to obtain three corresponding advanced features, and the dimensions are N2×C2,wiFor different linear transformation coefficients, subsequently, similarity calculation is carried out, and the correlation between Q (-) and K (-) is obtained through dot product operation, wherein the formula is as follows:
Figure BDA0003001261620000038
where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is the Softmax normalization function with an aggregation role, and an optional channel scaling coefficient is set to reduce the number of training parameters; finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:

$$F_{FICSA} = \gamma(A(F_{ext})\, V(F_{ext})) \tag{6}$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA} \in \mathbb{R}^{N_2 \times C_2}$ of the FICSA submodule;
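Likewise, a minimal sketch of equations (4)-(6) follows; placing the optional channel scaling inside Q(.) and K(.) is an assumption, since the text leaves its exact position open:

```python
import tensorflow as tf

class FICSA(tf.keras.layers.Layer):
    """Sketch of eqs. (4)-(6): point-wise self-attention over F_ext."""
    def __init__(self, c2: int, s: int = 4):
        super().__init__()
        self.v = tf.keras.layers.Conv1D(c2, 1)       # V(.): 1x1 point-wise convolution
        self.q = tf.keras.layers.Conv1D(c2 // s, 1)  # Q(.), scaled by s (assumed placement)
        self.k = tf.keras.layers.Conv1D(c2 // s, 1)  # K(.)

    def call(self, f_ext):  # f_ext: (B, N2, C2)
        v, q, k = self.v(f_ext), self.q(f_ext), self.k(f_ext)  # eq. (4)
        a = tf.matmul(q, k, transpose_b=True)          # eq. (5): (B, N2, N2) relation matrix A
        return tf.nn.softmax(tf.matmul(a, v), axis=1)  # eq. (6): gamma(A V) -> F_FICSA
```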
Finally, the CAF module performs competitive weight fusion of the MFSE submodule output $F_{MFSE}$ and the FICSA submodule output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = \alpha F_{MFSE} + \beta F_{FICSA} \tag{7}$$

Through matrix addition, the global attention weights are fused according to the different proportionality coefficients $\alpha$ and $\beta$ to obtain the final weight distribution coefficients $F_{CAF} \in \mathbb{R}^{N_2 \times C_2}$; the output $F_{Fusion} \in \mathbb{R}^{N_2 \times C_2}$ of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \tag{8}$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer.
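The competitive fusion of equations (7) and (8) can then be sketched as follows; broadcasting the per-sample MFSE weight over the point dimension before adding it to the FICSA weight is an assumption implied by the matrix addition in eq. (7):

```python
import tensorflow as tf

def caf_fusion(f_ext, f_mfse, f_ficsa, alpha: float = 1.0, beta: float = 1.0):
    """Sketch of eqs. (7)-(8); alpha and beta default to the embodiment's values."""
    # eq. (7): competitive fusion of the two global attention weights
    f_caf = alpha * f_mfse[:, tf.newaxis, :] + beta * f_ficsa  # (B, N2, C2)
    # eq. (8): channel re-weighting plus residual connection
    return f_ext + f_caf * f_ext
```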
Further, stacking two competitive attention fusion feature abstraction layers in step 3 to construct the deep three-dimensional point cloud classification network specifically comprises the following steps (see the wiring sketch after this paragraph):
The sampling result $P_{Sample}$ from step 1 is fed as input into the first competitive attention fusion feature abstraction layer to obtain the fused feature $F_{Fusion\text{-}Mid}$; the fused feature $F_{Fusion\text{-}Mid}$ is then fed as input into the second competitive attention fusion feature abstraction layer to obtain the final fused feature $F_{Fusion\text{-}Final}$.
Further, feeding the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer into the classifier in step 4 comprises the following steps:
After the second competitive attention fusion feature abstraction layer, a multi-layer perceptron (MLP) is introduced as the classifier, and classification learning is performed on the input fused point cloud features to obtain the classification scores.
Advantageous effects: the invention provides a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion, whose core is the CAF (Competitive Attention Fusion) module, a transferable intermediate-feature-channel optimization structure that introduces residual connections and channel competition and redistributes the weights of the feature channels by learning with two kinds of attention at its core. The CAF module contains two submodules: 1) the MFSE submodule, which focuses on the extraction and fusion of global features of different levels; 2) the FICSA submodule, which measures the intrinsic similarity of the intermediate features. The CAF module can be embedded into different point cloud classification networks, offering portability and extensibility; it improves the expression capability of global point cloud features and strengthens the model's robustness to noise interference.
A point cloud feature extraction network typically adopts two or more intermediate feature abstraction layers, and the intermediate features are usually a combination of global and local features, which largely determines the accuracy of the classification result. The CAF module provided by the invention learns a fusion weight from the intermediate output features of the two levels; this weight represents the importance and expressiveness of the current layer's intermediate feature channels, and the optimized intermediate features are obtained by redistributing the channel features with this weight. In brief, the CAF module uses the central idea of the attention mechanism to aggregate salient features, excite the channel features that are more important and have greater influence on the result, suppress invalid or ineffective channel features, reduce noise interference, and improve model robustness.
Noise interference in real point clouds includes perturbations and outliers, which often appear as position offsets of part of a sample's point set, together with background noise. When the model is tested, the set of noise points is also treated as part of the sample and therefore affects the sample's classification result. The role of the CAF module in the network is to make the model focus more on the core features that determine the sample type by adjusting the weights of the intermediate feature channels. The two submodules learn from two different angles (the global features and the associations among the multi-level intermediate features) to obtain weights that better focus the core channels, thereby improving the network's ability to learn global features, enhancing the model's anti-interference capability, and helping to address this difficult problem in point cloud deep networks.
Drawings
FIG. 1 is a flow chart of a method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion;
FIG. 2 is a schematic diagram of a competitive attention fusion feature abstraction layer provided by the present invention;
FIG. 3 is a schematic diagram of an MFSE sub-module in a CAF module provided by the present invention;
FIG. 4 is a schematic diagram of the FICSA sub-module in the CAF module provided by the present invention;
FIG. 5(a) is the anti-interference performance of the CAF module against point cloud perturbation (Gaussian noise);
FIG. 5(b) is the anti-interference performance of the CAF module against outliers (random noise);
FIG. 6(a) is the effect of the CAF module on model robustness on PointNet++;
FIG. 6(b) is the effect of the CAF module on model robustness on PointASNL;
FIG. 6(c) is the limit anti-interference capability of the CAF module on PointASNL.
Detailed Description
The invention is further elucidated with reference to the drawings and the embodiments.
Under the Ubuntu operating system, TensorFlow is selected as the platform to build the deep three-dimensional point cloud classification network based on competitive attention fusion, and the effectiveness of the CAF module is verified on the classical baseline network PointNet++ and on PointASNL, a baseline network with excellent performance in recent years. The results show that after the CAF module is added, the network's anti-interference capability against point cloud noise is significantly enhanced while the average accuracy of the classification results does not decrease. By adjusting the number of training sample input points, the robustness of the model can be further improved while the classification accuracy remains stable.
A method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion, whose network framework is shown in FIG. 1. The structure of the competitive attention fusion feature abstraction layer is shown in FIG. 2. FIG. 3 is a schematic diagram of the MFSE sub-module in the CAF module provided by the present invention. FIG. 4 is a schematic diagram of the FICSA sub-module in the CAF module provided by the present invention.
The method specifically comprises the following steps:
Step 1: the original point cloud data is preprocessed. B = 24 samples are processed in parallel per batch, and the N = 10000 raw points of each sample are preprocessed by down-sampling to obtain a sampling result $P_{Sample} \in \mathbb{R}^{1024 \times 3}$ containing $N_0$ = 1024 points.
Step 2: a CAF module is constructed to form the competitive attention fusion feature abstraction layer. The two competitive attention fusion feature abstraction layers, Layer_1 and Layer_2, each consist of two parts. First, the feature extraction layer receives the input data $D_{in}$ of the competitive attention fusion feature abstraction layer: in Layer_1, $D_{in}$ is the sampling result $P_{Sample}$; in Layer_2, the input is the final output result of Layer_1, $F_{Fusion\text{-}Mid}$. High-dimensional features $F_{ext}$ of the input data are then extracted through a multilayer convolution network. The input data $D_{in}$ and the high-dimensional features $F_{ext}$ are taken together as the input of the CAF module, in which feature fusion is performed;
the CAF module comprises an MFSE sub-module and a FICSA sub-module:
The MFSE submodule focuses on the extraction and fusion of global features of different levels. The MFSE submodule performs pooling and encoding operations separately on the CAF module's input data $D_{in}$ and high-dimensional features $F_{ext}$ of the corresponding layer to obtain the encoded features $F_{MFSE\text{-}in}$ and $F_{MFSE\text{-}ext}$:

$$F_{MFSE\text{-}in} = \phi(P(D_{in})), \qquad F_{MFSE\text{-}ext} = \phi(P(F_{ext})) \tag{1}$$

where $P(\cdot)$ is the global max pooling function, $\phi(\cdot)$ is a fully connected layer with ReLU activation, and the channel scaling ratio r = 4 is used to adjust the number of intermediate channels;
Then, the two encoded features are stacked along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat}$:

$$F_{MFSE\text{-}Concat} = \mathrm{Concat}(F_{MFSE\text{-}in},\, F_{MFSE\text{-}ext}) \tag{2}$$
Then, the number of channels and the feature map size of the stacking result are expanded through a fully connected layer to the same dimensions as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE submodule:

$$F_{MFSE} = \tilde{\phi}(F_{MFSE\text{-}Concat}) \tag{3}$$

where $\tilde{\phi}(\cdot)$ is the fully connected layer expansion with the Sigmoid normalization function, and $F_{MFSE}$ is the global attention weight finally obtained by the MFSE submodule;
The FICSA submodule aims to measure the intrinsic similarity of features. The FICSA submodule applies a 1 × 1 point-wise convolution to the CAF module's high-dimensional feature input $F_{ext}$, linearly mapping the features of all channels of each point to three parallel high-dimensional features:

$$V(F_{ext}) = w_1 F_{ext}, \qquad Q(F_{ext}) = w_2 F_{ext}, \qquad K(F_{ext}) = w_3 F_{ext} \tag{4}$$

where $V(\cdot)$, $Q(\cdot)$ and $K(\cdot)$ are three independent feature mapping functions yielding three corresponding high-level features of dimension $N_2 \times C_2$, and the distinct linear transformation coefficients $w_i$ are learned automatically during training; similarity is then computed, and the correlation between $Q(\cdot)$ and $K(\cdot)$ is obtained through a dot product operation:

$$A(F_{ext}) = Q(F_{ext})\, K(F_{ext})^{T} \tag{5}$$

where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is the Softmax normalization function with an aggregation role, and an optional channel scaling coefficient is set to reduce the number of training parameters; finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:
$$F_{FICSA} = \gamma(A(F_{ext})\, V(F_{ext})) \tag{6}$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA}$ of the FICSA submodule;
Finally, the CAF module performs competitive weight fusion of the MFSE submodule output $F_{MFSE}$ and the FICSA submodule output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = \alpha F_{MFSE} + \beta F_{FICSA} \tag{7}$$

Through matrix addition, the global attention weights are fused with the proportionality coefficients $\alpha = 1$ and $\beta = 1$ to obtain the final weight distribution coefficients $F_{CAF}$; the output $F_{Fusion}$ of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \tag{8}$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer: $F_{Fusion\text{-}Mid}$ for Layer_1 and $F_{Fusion\text{-}Final}$ for Layer_2.
Step 3: the two competitive attention fusion feature abstraction layers Layer_1 and Layer_2 are stacked to construct the deep three-dimensional point cloud classification network. The sampling result $P_{Sample}$ from step 1 is fed as input into the first competitive attention fusion feature abstraction layer Layer_1 to obtain the fused feature $F_{Fusion\text{-}Mid}$; the fused feature $F_{Fusion\text{-}Mid}$ is then fed as input into the second competitive attention fusion feature abstraction layer Layer_2 to obtain the final fused feature $F_{Fusion\text{-}Final}$.
Step 4: the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer Layer_2 are fed into the classifier to obtain the classification result. After Layer_2, a multi-layer perceptron (MLP) is introduced as the classifier, with MLP output channel parameters [256, 512, 1024, 512, 256, 40], and classification learning is performed on the input fused point cloud features to obtain the classification scores.
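As a sketch only, the classifier head could look as follows, using the output channel parameters listed above; the global max pooling before the MLP and the omission of dropout and batch normalization are assumptions:

```python
import tensorflow as tf

def build_classifier(num_classes: int = 40) -> tf.keras.Sequential:
    """MLP classifier with the embodiment's output channels [256, 512, 1024, 512, 256, 40]."""
    return tf.keras.Sequential(
        [tf.keras.layers.Dense(c, activation="relu") for c in (256, 512, 1024, 512, 256)]
        + [tf.keras.layers.Dense(num_classes)]  # classification scores
    )

# Usage sketch: pool the final fused features to one global vector per sample.
# scores = build_classifier()(tf.reduce_max(f_fusion_final, axis=1))
```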
The experimental results are specifically as follows:
experiment 1: and (5) classifying the shapes. The CAF module is added into the Pointnet + +, the optimal classification precision is 90.7% when the Pointnet + + is reproduced, the average test precision reaches 91.0% after the CAF module is added, and the result proves the effectiveness and feasibility of the CAF module in maintaining and improving the classification precision. Adding a CAF module in the PointASNL, and when only coordinate points are input, enabling the classification precision to reach 92.9 percent (92.88 percent) and be not lower than 92.9 percent (92.85 percent of the actual test optimal classification precision) in the PointASNL; when the normal vector is added in training and testing, the classification precision reaches 93.2 percent (93.19 percent) and is not lower than 93.2 percent in PointASNL (the optimal classification precision is 93.15 percent in actual testing). The experimental result proves the independence and the mobility of the CAF module and helps to maintain the classification precision.
Experiment 2: robustness analysis.
Gaussian noise following a standard normal distribution is added to the point cloud to simulate perturbation; random noise in the range [-1.0, 1.0] is added to the point cloud to simulate outliers. The ability of the CAF module to resist perturbation (Gaussian) and outliers (random) is tested using PointASNL as the baseline network (Base). The results are shown in FIG. 5: after the CAF module is added, the model's anti-interference performance against both noise types, point cloud perturbation and outliers, is significantly improved.
A certain number of real points are replaced with random noise in the range [-1.0, 1.0] to simulate simultaneous data loss and noise interference, with the number of random noise points set to [0, 1, 10, 50, 100]. FIG. 6(a) shows the classification accuracy of the PointNet++ network with the CAF module and of the original network on the test set with data loss and random noise; as the amount of noise increases, the classification accuracy of the network with the CAF module decreases more slowly, and the robustness of the model is significantly improved. FIG. 6(b) shows the classification accuracy of the PointASNL network with the CAF module on the test set with data loss and random noise; for the model trained with 1024 input points, under the same conditions the network's anti-interference capability is improved after the CAF module is added, for different amounts of data loss and random noise, and increasing the number of input points to 2048 and 3000 yields better anti-interference capability while maintaining stable classification performance. FIG. 6(c) shows the limit anti-interference capability of the CAF module on PointASNL.
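The two noise models used in these tests can be sketched as follows; the jitter scale sigma is an assumed parameter, since the text specifies a standard normal distribution but not how the perturbation is scaled:

```python
import numpy as np

def add_gaussian_jitter(points: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    """Perturbation: standard-normal jitter added to every point of an (N, 3) cloud."""
    return points + sigma * np.random.randn(*points.shape)

def add_random_outliers(points: np.ndarray, n_outliers: int) -> np.ndarray:
    """Outliers: replace n_outliers real points with uniform noise in [-1.0, 1.0]."""
    noisy = points.copy()
    idx = np.random.choice(points.shape[0], n_outliers, replace=False)
    noisy[idx] = np.random.uniform(-1.0, 1.0, size=(n_outliers, 3))
    return noisy
```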
It should be noted that the above embodiments are only examples for clearly illustrating the invention and do not limit the possible embodiments, which cannot all be enumerated here. All parts not specified in the embodiments can be realized with the prior art. Various modifications and adaptations apparent to those skilled in the art that do not depart from the principles of the invention are intended to fall within the scope of the invention.

Claims (5)

1. A method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion, characterized by comprising the following steps:
Step 1: preprocessing the original point cloud data;
Step 2: constructing a CAF module to form a competitive attention fusion feature abstraction layer;
Step 3: stacking two competitive attention fusion feature abstraction layers to construct the deep three-dimensional point cloud classification network;
Step 4: feeding the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer into a classifier to obtain the classification result.
2. The method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion according to claim 1, characterized in that: in the preprocessing of the original point cloud data in step 1, B samples are processed in parallel in batches, and the N raw points of each sample are down-sampled to obtain a sampling result $P_{Sample}$ containing $N_0$ points.
3. The method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion according to claim 1, characterized in that: the competitive attention fusion feature abstraction layer in step 2 consists of a feature extraction layer and a CAF module, wherein the feature extraction layer receives the input data $D_{in}$ of the competitive attention fusion feature abstraction layer and extracts high-dimensional features $F_{ext}$ of the input data through a multilayer convolution network; the input data $D_{in}$ and the high-dimensional features $F_{ext}$ are taken together as the input of the CAF module, in which feature fusion is performed;
The CAF module comprises an MFSE submodule and a FICSA submodule:
The MFSE submodule focuses on the extraction and fusion of global features of different levels. The MFSE submodule performs pooling and encoding operations separately on the CAF module's input data $D_{in} \in \mathbb{R}^{N_1 \times C_1}$ and high-dimensional features $F_{ext} \in \mathbb{R}^{N_2 \times C_2}$, where $\mathbb{R}$ is the set of real numbers, $\mathbb{R}^{N_i \times C_i}$ denotes a two-dimensional matrix of dimension $N_i \times C_i$ over the reals, $N_i$ is the number of points of the sample at the current stage, $C_i$ is the number of feature channels of the sample at the current stage, and $i$ is the serial number of the 5 stages with different matrix dimensions. The encoded features $F_{MFSE\text{-}in} \in \mathbb{R}^{N_3 \times C_3}$ (where $N_3 = 1$ is the number of points of $F_{MFSE\text{-}in}$ and $C_3 = C_1/r$ is its number of feature channels) and $F_{MFSE\text{-}ext} \in \mathbb{R}^{N_4 \times C_4}$ (where $N_4 = 1$ is the number of points of $F_{MFSE\text{-}ext}$ and $C_4 = C_2/r$ is its number of feature channels) are obtained as follows:

$$F_{MFSE\text{-}in} = \phi(P(D_{in})), \qquad F_{MFSE\text{-}ext} = \phi(P(F_{ext})) \tag{1}$$

where $P(\cdot)$ is the global max pooling function, $\phi(\cdot)$ is a fully connected layer with ReLU activation, and the channel scaling ratio $r$ is used to adjust the number of intermediate channels;
Then, the two encoded features are stacked along the channel direction to obtain the stacking result $F_{MFSE\text{-}Concat} \in \mathbb{R}^{N_5 \times C_5}$, where $N_5 = 1$ is the number of points of $F_{MFSE\text{-}Concat}$ and $C_5 = (C_1 + C_2)/r$ is its number of feature channels:

$$F_{MFSE\text{-}Concat} = \mathrm{Concat}(F_{MFSE\text{-}in},\, F_{MFSE\text{-}ext}) \tag{2}$$
Then, the number of channels and the feature map size of the stacking result are expanded through a fully connected layer to the same dimensions as the high-dimensional features $F_{ext}$, and this feature is taken as the output $F_{MFSE}$ of the MFSE submodule:

$$F_{MFSE} = \tilde{\phi}(F_{MFSE\text{-}Concat}) \tag{3}$$

where $\tilde{\phi}(\cdot)$ is the fully connected layer expansion with the Sigmoid normalization function, and $F_{MFSE} \in \mathbb{R}^{N_2 \times C_2}$ is the global attention weight finally obtained by the MFSE submodule;
the FICSA sub-module aims at measuring the intrinsic similarity of the features, and the FICSA sub-module inputs the high-dimensional features of the CAF module
Figure FDA0003001261610000024
Performing 1 × 1 point-to-point convolution operation, and linearly mapping the features of all channels of each point to three parallel high-dimensional features, wherein the formula is as follows:
Figure FDA0003001261610000025
wherein V (·), Q (·) and K (·) are three independent feature mapping functions respectively to obtain three corresponding advanced features, and the dimensions are N2×C2,wiFor different linear transformation coefficients, subsequently, similarity calculation is carried out, and the correlation between Q (-) and K (-) is obtained through dot product operation, wherein the formula is as follows:
Figure FDA0003001261610000026
where $A(\cdot)$ is the high-dimensional relation within the intermediate features, $\gamma$ is the Softmax normalization function with an aggregation role, and an optional channel scaling coefficient is set to reduce the number of training parameters; finally, the global attention weight $F_{FICSA}$ encoding the internal associations among feature points is obtained:

$$F_{FICSA} = \gamma(A(F_{ext})\, V(F_{ext})) \tag{6}$$

where $V(\cdot)$ is used to adjust the feature channel dimension of $A(\cdot)$, and this feature is taken as the final output $F_{FICSA} \in \mathbb{R}^{N_2 \times C_2}$ of the FICSA submodule;
Finally, the CAF module performs competitive weight fusion of the MFSE submodule output $F_{MFSE}$ and the FICSA submodule output $F_{FICSA}$, introduces residual learning, and redistributes the weights of the feature channels:

$$F_{CAF} = \alpha F_{MFSE} + \beta F_{FICSA} \tag{7}$$

Through matrix addition, the global attention weights are fused according to the different proportionality coefficients $\alpha$ and $\beta$ to obtain the final weight distribution coefficients $F_{CAF}$; the output $F_{Fusion}$ of the CAF module is then obtained by weight redistribution and residual connection:

$$F_{Fusion} = F_{ext} + F_{CAF} \cdot F_{ext} \tag{8}$$

The output $F_{Fusion}$ of the CAF module is the output of the competitive attention fusion feature abstraction layer.
4. The method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion according to claim 1, characterized in that: the specific method for stacking two competitive attention fusion feature abstraction layers in step 3 is as follows: the sampling result $P_{Sample}$ from step 1 is fed as input into the first competitive attention fusion feature abstraction layer to obtain the fused feature $F_{Fusion\text{-}Mid}$; the fused feature $F_{Fusion\text{-}Mid}$ is then fed as input into the second competitive attention fusion feature abstraction layer to obtain the final fused feature $F_{Fusion\text{-}Final}$.
5. The method for constructing a deep three-dimensional point cloud classification network based on competitive attention fusion according to claim 1, characterized in that: the specific method for feeding the high-dimensional features finally output by the second competitive attention fusion feature abstraction layer into the classifier in step 4 is to introduce a multi-layer perceptron (MLP) as the classifier after the second competitive attention fusion feature abstraction layer and to perform classification learning on the input fused point cloud features to obtain the classification scores.
CN202110347537.5A 2021-03-31 2021-03-31 Deep three-dimensional point cloud classification network construction method based on competitive attention fusion Active CN112990336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110347537.5A CN112990336B (en) 2021-03-31 2021-03-31 Deep three-dimensional point cloud classification network construction method based on competitive attention fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110347537.5A CN112990336B (en) 2021-03-31 2021-03-31 Deep three-dimensional point cloud classification network construction method based on competitive attention fusion

Publications (2)

Publication Number Publication Date
CN112990336A true CN112990336A (en) 2021-06-18
CN112990336B CN112990336B (en) 2024-03-26

Family

ID=76339112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110347537.5A Active CN112990336B (en) 2021-03-31 2021-03-31 Deep three-dimensional point cloud classification network construction method based on competitive attention fusion

Country Status (1)

Country Link
CN (1) CN112990336B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117788962A (en) * 2024-02-27 2024-03-29 南京信息工程大学 Extensible point cloud target identification method and system based on continuous learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242208A (en) * 2020-01-08 2020-06-05 深圳大学 Point cloud classification method, point cloud segmentation method and related equipment
CN112085123A (en) * 2020-09-25 2020-12-15 北方民族大学 Point cloud data classification and segmentation method based on salient point sampling

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242208A (en) * 2020-01-08 2020-06-05 深圳大学 Point cloud classification method, point cloud segmentation method and related equipment
CN112085123A (en) * 2020-09-25 2020-12-15 北方民族大学 Point cloud data classification and segmentation method based on salient point sampling

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117788962A (en) * 2024-02-27 2024-03-29 南京信息工程大学 Extensible point cloud target identification method and system based on continuous learning
CN117788962B (en) * 2024-02-27 2024-05-10 南京信息工程大学 Extensible point cloud target identification method based on continuous learning

Also Published As

Publication number Publication date
CN112990336B (en) 2024-03-26

Similar Documents

Publication Publication Date Title
CN110287849B (en) Lightweight depth network image target detection method suitable for raspberry pi
CN112396607B (en) Deformable convolution fusion enhanced street view image semantic segmentation method
CN108171701B (en) Significance detection method based on U network and counterstudy
CN110390638B (en) High-resolution three-dimensional voxel model reconstruction method
CN111292330A (en) Image semantic segmentation method and device based on coder and decoder
CN108804397A (en) A method of the Chinese character style conversion based on a small amount of target font generates
CN107229757A (en) The video retrieval method encoded based on deep learning and Hash
CN111259904B (en) Semantic image segmentation method and system based on deep learning and clustering
CN113344188A (en) Lightweight neural network model based on channel attention module
CN108985177A (en) A kind of facial image classification method of the quick low-rank dictionary learning of combination sparse constraint
CN108710906A (en) Real-time point cloud model sorting technique based on lightweight network LightPointNet
CN113240683B (en) Attention mechanism-based lightweight semantic segmentation model construction method
CN109086405A (en) Remote sensing image retrieval method and system based on conspicuousness and convolutional neural networks
CN114648535A (en) Food image segmentation method and system based on dynamic transform
CN111652273A (en) Deep learning-based RGB-D image classification method
CN116310339A (en) Remote sensing image segmentation method based on matrix decomposition enhanced global features
CN112766283A (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN117237559A (en) Digital twin city-oriented three-dimensional model data intelligent analysis method and system
CN110728186A (en) Fire detection method based on multi-network fusion
CN113066089B (en) Real-time image semantic segmentation method based on attention guide mechanism
CN112990336A (en) Depth three-dimensional point cloud classification network construction method based on competitive attention fusion
CN104573726B (en) Facial image recognition method based on the quartering and each ingredient reconstructed error optimum combination
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN111639751A (en) Non-zero padding training method for binary convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant