CN111223120B - Point cloud semantic segmentation method - Google Patents

Info

Publication number
CN111223120B
Authority
CN
China
Prior art keywords
point cloud
feature
dimensional
neural network
mlp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911262240.8A
Other languages
Chinese (zh)
Other versions
CN111223120A (en)
Inventor
潘琳琳
孔慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN201911262240.8A
Publication of CN111223120A
Application granted
Publication of CN111223120B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a point cloud semantic segmentation method comprising: preprocessing point cloud data through blocking, sampling, translation and normalization operations; designing a feature selection module and establishing a neural network model based on feature selection; and training the neural network for subsequent point cloud semantic segmentation. The trained network outputs a semantic label for each point. Because the network structure includes the feature selection module, feature channels carrying weak semantic information are suppressed, feature channels that are key to the segmentation task are enhanced, and segmentation accuracy is improved.

Description

Point cloud semantic segmentation method
Technical Field
The invention relates to a point cloud segmentation technology, in particular to a point cloud semantic segmentation method.
Background
Point cloud semantic segmentation divides a point cloud into semantically meaningful parts and is an important research direction in computer vision. Existing point cloud semantic segmentation methods stop at extracting features from the point cloud data and ignore how important each feature channel is to the segmentation task. A network structure based on feature selection is therefore needed to suppress feature channels carrying weak semantic information, enhance feature channels that are key to the segmentation task, and improve segmentation accuracy.
Disclosure of Invention
The invention aims to provide a point cloud semantic segmentation method.
The technical solution for realizing the purpose of the invention is as follows: a point cloud semantic segmentation method comprises the following steps:
step 1, preprocessing point cloud data, including blocking, sampling, translation and normalization operations;
step 2, designing a feature selection module, and establishing a neural network model based on feature selection;
and step 3, training a neural network for subsequent point cloud semantic segmentation.
Further, in step 1, the specific method for preprocessing the point cloud data is as follows:
First, the point cloud data is divided into a plurality of cubes (blocks), and points are randomly sampled within each block: when the number of points in a block exceeds a set threshold, the excess points are discarded; when it is below the threshold, points are randomly picked from the block and duplicated until the threshold is reached. This completes data sampling. Each sampled point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values. Taking the point with the minimum XYZ coordinate values as the coordinate origin, the coordinates of all other points are recomputed with formula (1), which completes the data translation and yields X', Y', Z'. X', Y', Z' are then normalized with formula (2), and the normalized values are appended as three new coordinate dimensions X, Y, Z. RGB is normalized with formula (3) to obtain normalized color values R', G', B'. Finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$
where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum XYZ coordinate values and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum XYZ coordinate values.
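The translation and normalization step can be sketched in plain Python as follows. Only formula (1) is given in the text, so the forms used here for formulas (2) and (3) are assumptions chosen to produce values between 0 and 1: the translated coordinates are divided by the coordinate range, and 8-bit RGB values are divided by 255. The function name `preprocess_block` is hypothetical, and taking the minimum and maximum over exactly the points passed in is also an assumption.

```python
def preprocess_block(points):
    """Turn (X, Y, Z, R, G, B) tuples into 9-dimensional tuples
    (X', Y', Z', Xn, Yn, Zn, R', G', B')."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    # Guard against a zero range on a flat axis (assumption, not in the source).
    spans = [(maxs[i] - mins[i]) or 1.0 for i in range(3)]

    out = []
    for p in points:
        # Formula (1): translate so the minimum coordinate becomes the origin.
        shifted = [p[i] - mins[i] for i in range(3)]
        # ASSUMED form of formula (2): scale translated coordinates into [0, 1].
        normed = [shifted[i] / spans[i] for i in range(3)]
        # ASSUMED form of formula (3): 8-bit colour values divided by 255.
        colour = [c / 255.0 for c in p[3:6]]
        out.append(tuple(shifted + normed + colour))
    return out
```

Calling `preprocess_block` on the sampled points of one block returns one 9-tuple per point, in the same order as the input.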
Further, in step 2, the specific method for establishing the neural network model based on feature selection is as follows:
(a) Neural network module design
The point cloud data is input into the neural network, and a feature matrix is obtained after 5 MLP layers; the feature selection module then produces the local features of the point cloud. Global features of the point cloud are obtained next through maximum pooling followed by two MLP layers. Finally, the local and global features are concatenated, and the semantic category of each point is obtained through three MLP layers;
(b) Feature selection module design
The feature selection module comprises a maximum pooling layer, two MLP layers, an adder and a multiplier. Let the input of the feature selection module be a feature matrix $M_1$. The module applies maximum pooling and two MLP layers to $M_1$ to obtain a weighting vector W, then multiplies W element-wise with each row vector of $M_1$ to obtain a feature matrix $M_2$. Finally, $M_1$ and $M_2$ are added element-wise to obtain a feature matrix $M_3$, i.e. the local features of the point cloud.
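Written out per element, the module can be summarized as below. The index notation is introduced here only for illustration and does not appear in the original: $i$ indexes points (rows), $c$ indexes feature channels (columns), and the maximum is taken channel by channel over all points.

```latex
\begin{aligned}
W &= \mathrm{MLP}\bigl(\max_{i} M_1[i,:]\bigr), \\
M_2[i,c] &= W[c]\, M_1[i,c], \\
M_3[i,c] &= M_1[i,c] + M_2[i,c] = \bigl(1 + W[c]\bigr)\, M_1[i,c].
\end{aligned}
```

The last line follows directly from the two steps before it: the adder and multiplier together scale channel $c$ of every point by the same factor $1 + W[c]$, so channels with a larger weight are amplified relative to the others while the original features are always retained through the additive term.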
Further, in step 3, the specific method for training the neural network is as follows:
The cross-entropy loss function of the network is optimized with the Adam algorithm, using a momentum parameter of 0.9, and the network is trained with a batch size of 32. A variable learning rate is used during training: the initial learning rate is 0.001, the learning rate is reduced to 0.5 times its previous value every 20 training epochs, and the learning decay rate is gradually increased from 0.5 to 0.99.
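The learning-rate rule above can be sketched as a small function. This is one possible reading, assuming a staircase schedule with epochs counted from 0; the function name `learning_rate` is hypothetical. The decay rate that rises from 0.5 to 0.99 is not modelled here, because the text does not say which quantity it applies to.

```python
def learning_rate(epoch, base_lr=0.001, factor=0.5, step=20):
    """Staircase schedule: multiply the rate by `factor` once every `step` epochs."""
    return base_lr * factor ** (epoch // step)


if __name__ == "__main__":
    # Print the rate at the start of each 20-epoch stage.
    for epoch in (0, 20, 40, 60):
        print(epoch, learning_rate(epoch))
```

Under this reading the rate stays at 0.001 for epochs 0 to 19, drops to 0.0005 at epoch 20, and halves again at every later multiple of 20.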
Compared with the prior art, the invention has the following notable advantage: the designed neural network structure includes a feature selection module, so that feature channels carrying weak semantic information are suppressed, feature channels that are key to the segmentation task are enhanced, and segmentation accuracy is improved.
Drawings
FIG. 1 is a workflow diagram of a point cloud semantic segmentation system of the present invention.
FIG. 2 is a flow chart of the operation of the data processing module of the present invention.
FIG. 3 is a schematic structural diagram of the neural network module of the present invention.
FIG. 4 is a schematic structural diagram of the feature selection module of the present invention.
Detailed Description
The present invention will be further described with reference to the drawings and the specific embodiments.
The invention designs a neural network structure that obtains the semantic label of each point through a trained network and improves the accuracy of point cloud semantic segmentation. The system comprises a data processing module and a neural network module, and works through the following specific steps:
Step 1: the data processing module preprocesses the point cloud data in four steps (blocking, sampling, translation and normalization). As shown in FIG. 2, the specific flow is as follows:
First, the point cloud data is divided into a plurality of cubes (blocks), and points are randomly sampled within each block: when the number of points in a block exceeds a set threshold, the excess points are discarded; when it is below the threshold, points are randomly picked from the block and duplicated until the threshold is reached. This completes data sampling. Each acquired point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values. For training convenience, the point with the minimum XYZ coordinate values is taken as the coordinate origin and the coordinates of all other points are recomputed with formula (1), which completes the data translation and yields X', Y', Z'. To improve segmentation accuracy, X', Y', Z' are normalized with formula (2), and the normalized values are appended as three new coordinate dimensions X, Y, Z, each in the range 0 to 1. In addition, RGB is normalized with formula (3) to obtain normalized color values R', G', B', each in the range 0 to 1. Finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$
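The sampling rule in step 1 (discard the excess above the threshold, duplicate randomly picked points below it) can be sketched as follows. The function name `sample_block` and the optional `rng` argument are hypothetical, and the sketch assumes every block contains at least one point.

```python
import random


def sample_block(points, threshold, rng=None):
    """Return exactly `threshold` points drawn from one block."""
    if not points:
        raise ValueError("cannot sample from an empty block")
    rng = rng or random.Random()
    if len(points) >= threshold:
        # Too many points: keep a random subset and discard the rest.
        return rng.sample(points, threshold)
    # Too few points: keep all of them and pad with randomly chosen duplicates.
    extra = [rng.choice(points) for _ in range(threshold - len(points))]
    return list(points) + extra
```

Passing a seeded `random.Random` instance makes the sampling repeatable, which is convenient when comparing training runs.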
step 2, designing a feature selection module, and establishing a neural network model based on feature selection, as shown in fig. 3 and 4, wherein the specific flow is as follows:
(a) Neural network module design
As shown in FIG. 3, the point cloud data is input into the neural network, and a feature matrix $M_1$ is obtained after 5 MLP layers; the feature selection module then produces the local features of the point cloud. Global features of the point cloud are obtained next through maximum pooling followed by two MLP layers. Finally, the local and global features are concatenated, and the semantic category of each point is obtained through three MLP layers.
(b) Feature selection module design
The feature selection module selects useful feature channels by weighting the feature vector of each point. As shown in FIG. 4, the module comprises a maximum pooling layer, a two-layer multi-layer perceptron (MLP), an adder and a multiplier. Assume the input of the feature selection module is a feature matrix $M_1$. The module applies maximum pooling and the two MLP layers to $M_1$ to obtain a weighting vector W, then multiplies W element-wise with each row vector of $M_1$ to obtain a feature matrix $M_2$. Finally, $M_1$ and $M_2$ are added element-wise to obtain a feature matrix $M_3$, i.e. the local features of the point cloud.
Step 3, training a neural network for semantic segmentation of subsequent point cloud data;
The cross-entropy loss function of the network is optimized with the Adam algorithm, using a momentum parameter of 0.9, and the network is trained with a batch size of 32. A variable learning rate is used during training: the initial learning rate is 0.001, the learning rate is reduced to 0.5 times its previous value every 20 training epochs, and the learning decay rate is gradually increased from 0.5 to 0.99.
The invention thus designs a neural network structure that includes a feature selection module; the trained network yields a semantic label for each point and improves the accuracy of point cloud semantic segmentation.
Examples
To verify the effectiveness of the scheme, the indoor scene point cloud dataset S3DIS (Stanford 3D Indoor Semantic Dataset) is used as experimental data, and the following simulation experiment is carried out to predict the semantic label of each point. The dataset contains scan data of 271 rooms in 6 areas, and every point is annotated with a semantic label. The specific working steps are as follows:
Step 1: the point cloud data preprocessing module performs the four operations of blocking, sampling, translation and normalization. The point cloud data of each room is divided into cubes with a side length of 1 meter, and 4096 points are randomly sampled in each block: when a block contains more than 4096 points, the excess points are discarded; when it contains fewer than 4096 points, points are randomly picked from the block and duplicated until 4096 is reached. This completes sampling. Translation and normalization of the data are then completed according to formulas (1) to (3).
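The blocking operation can be illustrated with a short sketch that groups the points of one room by the 1-meter cube they fall into. The function name `split_into_cubes` is hypothetical, and aligning the cube grid with the coordinate origin is an assumption, since the text does not say where the grid starts.

```python
import math
from collections import defaultdict


def split_into_cubes(points, side=1.0):
    """Group points by the index of the axis-aligned cube that contains them.

    Each point is a tuple whose first three entries are X, Y, Z; any further
    entries (for example RGB) are carried along unchanged.
    """
    blocks = defaultdict(list)
    for p in points:
        # Integer cube index along each axis; floor also handles negatives.
        key = tuple(math.floor(p[i] / side) for i in range(3))
        blocks[key].append(p)
    return dict(blocks)
```

Each value of the returned dictionary is the point list of one block, ready to be resampled to 4096 points as described in step 1.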
Step 2, constructing a neural network based on feature selection, which comprises the following steps:
a. neural network module design
First, the 4096×9 point cloud data is input into the neural network, and an n×1024 feature matrix $M_1$ (here n = 4096) is obtained through 5 MLP layers with sizes of 64, 128 and 1024 in sequence. Feature selection is applied to $M_1$ to obtain a 4096×1024 feature matrix $M_3$, which serves as the local features of the point cloud. Maximum pooling is then applied to $M_3$ to obtain a 1×1024 feature vector, which passes through one 256-dimensional MLP layer and one 128-dimensional MLP layer to give a 1×128 feature vector that serves as the global feature of the point cloud. Finally, the local and global features are concatenated into a 4096×1152 feature matrix, which passes through one 512-dimensional, one 256-dimensional and one 13-dimensional MLP layer (13 being the number of semantic segmentation classes) to give a 4096×13 matrix, from which the semantic category of each point is obtained, completing the semantic segmentation.
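The shape bookkeeping in this paragraph can be checked with a short sketch that tracks tensor shapes only and performs no learning. The text names five MLP layers but lists only the sizes 64, 128 and 1024; the widths `(64, 64, 64, 128, 1024)` used below are an assumption made to get five layers, and the function name `fsnet_shapes` is hypothetical.

```python
def fsnet_shapes(n_points=4096, in_dim=9, num_classes=13,
                 point_mlp=(64, 64, 64, 128, 1024),  # ASSUMED widths, see text
                 global_mlp=(256, 128),
                 head_mlp=(512, 256)):
    """Return (stage name, shape) pairs for one block passing through the network."""
    stages = [("input", (n_points, in_dim))]
    for i, width in enumerate(point_mlp, start=1):
        stages.append((f"point_mlp_{i}", (n_points, width)))
    feat = point_mlp[-1]
    stages.append(("M1", (n_points, feat)))
    # Feature selection rescales channels, so M3 has the same shape as M1.
    stages.append(("M3", (n_points, feat)))
    stages.append(("max_pool", (1, feat)))
    for width in global_mlp:
        stages.append((f"global_mlp_{width}", (1, width)))
    # The 1 x 128 global feature is repeated for every point before concatenation.
    stages.append(("concat", (n_points, feat + global_mlp[-1])))
    for width in head_mlp + (num_classes,):
        stages.append((f"head_mlp_{width}", (n_points, width)))
    return stages


if __name__ == "__main__":
    for name, shape in fsnet_shapes():
        print(f"{name:16s} {shape}")
```

The concatenated width is 1024 + 128 = 1152, which matches the 4096×1152 matrix stated in the text regardless of which widths the intermediate point-wise layers use.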
b. Feature selection module design
First, maximum pooling is applied to $M_1$ to obtain a 1×1024 feature vector, which passes through one 128-dimensional MLP layer and one 1024-dimensional MLP layer to give a 1×1024 weighting vector W. W is then multiplied element-wise with each row vector of $M_1$ to select useful feature channels, giving a 4096×1024 feature matrix $M_2$. Finally, $M_1$ and $M_2$ are added element-wise to obtain the 4096×1024 feature matrix $M_3$, i.e. the local features of the point cloud.
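A minimal dependency-free sketch of the feature selection module follows, using nested lists in place of tensors so that it can be traced by hand on a tiny example. The text does not specify activation functions; applying ReLU after the first MLP layer and no activation after the second is an assumption, and the helper names `dense`, `relu` and `feature_selection` are hypothetical.

```python
def relu(vec):
    return [max(0.0, x) for x in vec]


def dense(vec, weights, bias):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def feature_selection(m1, w1, b1, w2, b2):
    """Return (W, M3) for an N x C feature matrix M1 given as a list of rows."""
    n_channels = len(m1[0])
    # Max pooling over points: one value per feature channel.
    pooled = [max(row[c] for row in m1) for c in range(n_channels)]
    # Two MLP layers produce the weighting vector W (activations are ASSUMED).
    hidden = relu(dense(pooled, w1, b1))
    w = dense(hidden, w2, b2)
    # Multiplier: scale every row of M1 channel by channel to get M2.
    m2 = [[w[c] * row[c] for c in range(n_channels)] for row in m1]
    # Adder: M3 = M1 + M2, element by element.
    m3 = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]
    return w, m3
```

In the embodiment the same computation would run with N = 4096 points, C = 1024 channels and a 128-unit hidden layer; the list-based version is only meant to make the data flow explicit.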
And step 3, training a neural network to obtain the semantic category of each point.
The cross-entropy loss function of the network is optimized with the Adam algorithm, using a momentum parameter of 0.9, so that the loss is minimized, and the network is trained with a batch size of 32. A variable learning rate is used: the initial learning rate is 0.001, the learning rate is reduced to 0.5 times its previous value every 20 training epochs, and the learning decay rate is gradually increased from 0.5 to 0.99. The neural network is implemented with the TensorFlow framework. This embodiment reports the experimental performance of FSNet on the S3DIS dataset in terms of overall accuracy (OA) and mean intersection over union (mIoU). Because the S3DIS dataset includes 6 areas, 6-fold cross-validation is used for the experimental comparison: each time, one area is selected as the test set and the remaining 5 areas are used as the training set. The arithmetic mean of the 6 experiments gives the overall semantic segmentation result on the S3DIS dataset, as shown in Table 1:
Table 1. Overall semantic segmentation results on the S3DIS dataset
As can be seen from Table 1, PointNet achieves an OA of 79.1% and an mIoU of 46.7%, while the OA and mIoU obtained by the invention are 1.1% and 1.0% higher than those of PointNet, respectively. This indicates that mining the dependencies between feature channels is meaningful for semantic segmentation tasks.
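The evaluation protocol described above (leave-one-area-out folds, overall accuracy and mean intersection over union) can be sketched as follows. The function names are hypothetical, labels are assumed to be integer class indices, and skipping classes that occur in neither the prediction nor the ground truth when averaging IoU is an assumption about a detail the text does not address.

```python
def overall_accuracy(pred, truth):
    """Fraction of points whose predicted label equals the ground-truth label."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def mean_iou(pred, truth, num_classes):
    """Average of per-class intersection over union."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:  # ASSUMPTION: ignore classes absent from both label lists
            ious.append(inter / union)
    return sum(ious) / len(ious)


def area_folds(areas=(1, 2, 3, 4, 5, 6)):
    """Yield (training areas, test area) pairs for leave-one-area-out validation."""
    for test_area in areas:
        yield [a for a in areas if a != test_area], test_area
```

Averaging the per-fold OA and mIoU values over the six folds then gives the overall figures of the kind reported in Table 1.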

Claims (1)

1. A point cloud semantic segmentation method, characterized by comprising the following steps:
step 1, preprocessing point cloud data, including blocking, sampling, translation and normalization operations;
step 2, designing a feature selection module, and establishing a neural network model based on feature selection;
step 3, training a neural network for subsequent point cloud semantic segmentation;
in the step 1, the specific method for preprocessing the point cloud data comprises the following steps:
first, the point cloud data is divided into a plurality of cubes (blocks), and points are randomly sampled within each block: when the number of points in a block exceeds a set threshold, the excess points are discarded; when it is below the threshold, points are randomly picked from the block and duplicated until the threshold is reached, completing data sampling;
each sampled point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values; taking the point with the minimum XYZ coordinate values as the coordinate origin, the coordinates of all other points are recomputed with formula (1), which completes the data translation and yields X', Y', Z'; X', Y', Z' are normalized with formula (2), and the normalized values are appended as three new coordinate dimensions X, Y, Z; RGB is normalized with formula (3) to obtain normalized color values R', G', B'; finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$
where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum XYZ coordinate values and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum XYZ coordinate values;
in step 2, the specific method for establishing the neural network model based on feature selection comprises the following steps:
(a) Neural network module design
The point cloud data is input into the neural network, and a feature matrix is obtained after 5 MLP layers; the feature selection module then produces the local features of the point cloud; global features of the point cloud are obtained next through maximum pooling followed by two MLP layers; finally, the local and global features are concatenated, and the semantic category of each point is obtained through three MLP layers. Specifically, the 4096×9 point cloud data is input into the neural network, and an n×1024 feature matrix $M_1$ is obtained through 5 MLP layers with sizes of 64, 128 and 1024 in sequence; feature selection is applied to $M_1$ to obtain a 4096×1024 feature matrix $M_3$, which serves as the local features of the point cloud; maximum pooling is then applied to $M_3$ to obtain a 1×1024 feature vector, which passes through one 256-dimensional MLP layer and one 128-dimensional MLP layer to give a 1×128 feature vector that serves as the global feature of the point cloud; finally, the local and global features are concatenated into a 4096×1152 feature matrix, which passes through one 512-dimensional MLP layer, one 256-dimensional MLP layer and one 13-dimensional MLP layer to give a 4096×13 matrix, from which the semantic category of each point is obtained, completing the semantic segmentation;
(b) Feature selection module design
The feature selection module comprises a maximum pooling layer, two MLP layers, an adder and a multiplier. Let the input of the feature selection module be a feature matrix $M_1$; the module applies maximum pooling and two MLP layers to $M_1$ to obtain a weighting vector W, then multiplies W element-wise with each row vector of $M_1$ to obtain a feature matrix $M_2$; finally, $M_1$ and $M_2$ are added element-wise to obtain a feature matrix $M_3$, i.e. the local features of the point cloud. Specifically, maximum pooling is applied to $M_1$ to obtain a 1×1024 feature vector, which passes through one 128-dimensional MLP layer and one 1024-dimensional MLP layer to give a 1×1024 weighting vector W; W is then multiplied element-wise with each row vector of $M_1$ to select useful feature channels, giving a 4096×1024 feature matrix $M_2$; $M_1$ and $M_2$ are then added element-wise to obtain the 4096×1024 feature matrix $M_3$, i.e. the local features of the point cloud;
in step 3, the specific method for training the neural network is as follows:
the cross-entropy loss function of the network is optimized with the Adam algorithm, using a momentum parameter of 0.9, and the network is trained with a batch size of 32; a variable learning rate is used during training: the initial learning rate is 0.001, the learning rate is reduced to 0.5 times its previous value every 20 training epochs, and the learning decay rate is gradually increased from 0.5 to 0.99.
CN201911262240.8A 2019-12-10 2019-12-10 Point cloud semantic segmentation method Active CN111223120B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911262240.8A CN111223120B (en) 2019-12-10 2019-12-10 Point cloud semantic segmentation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911262240.8A CN111223120B (en) 2019-12-10 2019-12-10 Point cloud semantic segmentation method

Publications (2)

Publication Number Publication Date
CN111223120A CN111223120A (en) 2020-06-02
CN111223120B CN111223120B (en) 2023-08-04

Family

ID=70830735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911262240.8A Active CN111223120B (en) 2019-12-10 2019-12-10 Point cloud semantic segmentation method

Country Status (1)

Country Link
CN (1) CN111223120B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11004202B2 (en) * 2017-10-09 2021-05-11 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for semantic segmentation of 3D point clouds
CN109523552B (en) * 2018-10-24 2021-11-02 青岛智能产业技术研究院 Three-dimensional object detection method based on viewing cone point cloud
CN110322453B (en) * 2019-07-05 2023-04-18 西安电子科技大学 3D point cloud semantic segmentation method based on position attention and auxiliary network

Also Published As

Publication number Publication date
CN111223120A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
CN112750140B (en) Information mining-based disguised target image segmentation method
CN110660062B (en) Point cloud instance segmentation method and system based on PointNet
CN113012185B (en) Image processing method, device, computer equipment and storage medium
CN112288011B (en) Image matching method based on self-attention deep neural network
CN109964250A (en) For analyzing the method and system of the image in convolutional neural networks
CN107564009B (en) Outdoor scene multi-target segmentation method based on deep convolutional neural network
CN111832437A (en) Building drawing identification method, electronic equipment and related product
CN110222718B (en) Image processing method and device
US20190114532A1 (en) Apparatus and method for convolution operation of convolution neural network
CN112101364B (en) Semantic segmentation method based on parameter importance increment learning
Liu et al. Image de-hazing from the perspective of noise filtering
CN107679539B (en) Single convolution neural network local information and global information integration method based on local perception field
CN111339862B (en) Remote sensing scene classification method and device based on channel attention mechanism
CN109934272B (en) Image matching method based on full convolution network
WO2020151148A1 (en) Neural network-based black-and-white photograph color restoration method, apparatus, and storage medium
CN111400572A (en) Content safety monitoring system and method for realizing image feature recognition based on convolutional neural network
CN115147488B (en) Workpiece pose estimation method and grabbing system based on dense prediction
CN111008631A (en) Image association method and device, storage medium and electronic device
JP2015036939A (en) Feature extraction program and information processing apparatus
CN110807379A (en) Semantic recognition method and device and computer storage medium
CN111368637B (en) Transfer robot target identification method based on multi-mask convolutional neural network
CN116721460A (en) Gesture recognition method, gesture recognition device, electronic equipment and storage medium
CN114882278A (en) Tire pattern classification method and device based on attention mechanism and transfer learning
CN111523561A (en) Image style recognition method and device, computer equipment and storage medium
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant