CN111223120A - Point cloud semantic segmentation method

Info

Publication number: CN111223120A
Application number: CN201911262240.8A
Authority: CN
Inventors: 潘琳琳; 孔慧
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2019-12-10
Filing date: 2019-12-10
Publication date: 2020-06-02
Anticipated expiration: 2039-12-10
Also published as: CN111223120B

Abstract

The invention discloses a point cloud semantic segmentation method comprising: preprocessing point cloud data through blocking, sampling, translation and normalization; designing a feature selection module and establishing a neural network model based on feature selection; and training the neural network for semantic segmentation of subsequent point clouds. Because the designed neural network structure includes a feature selection module, feature channels carrying weak semantic information are suppressed and feature channels that play a key role in the segmentation task are enhanced. The semantic label of each point is obtained from the trained network, and the precision of point cloud semantic segmentation is improved.

Description

Point cloud semantic segmentation method

Technical Field

The invention relates to a point cloud segmentation technology, in particular to a point cloud semantic segmentation method.

Background

Point cloud semantic segmentation divides a point cloud into semantically meaningful parts and is an important research direction in the field of computer vision. At present, point cloud semantic segmentation methods concentrate on extracting features from point cloud data and neglect the differing importance of each feature channel to the semantic segmentation task. It is therefore necessary to design a network structure based on feature selection that suppresses feature channels carrying weak semantic information, enhances feature channels that play a key role in the segmentation task, and improves segmentation accuracy.

Disclosure of Invention

The invention aims to provide a point cloud semantic segmentation method.

The technical solution for realizing the purpose of the invention is as follows: a point cloud semantic segmentation method comprises the following steps:

step 1, point cloud data preprocessing is carried out, wherein the point cloud data preprocessing comprises operations of blocking, sampling, translation and normalization;

step 2, designing a feature selection module, and establishing a neural network model based on feature selection;

step 3, training a neural network for semantic segmentation of subsequent point clouds.

Further, in step 1, a specific method for point cloud data preprocessing is as follows:

first, the point cloud data is divided into a plurality of cubic blocks, and random sampling is then carried out in each block: when the number of points in a block is larger than a set threshold, the excess points are discarded; when the number of points is smaller than the set threshold, points are randomly picked from the block and copied until the number of points reaches the set threshold, which completes the data sampling. Each sampled point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values. The point with the minimum XYZ coordinate values is taken as the coordinate origin, and the coordinate values of the other points are calculated correspondingly using formula (1), which completes the data translation and yields X', Y' and Z'. X', Y' and Z' are normalized by formula (2), which adds 3 new coordinate dimensions X, Y and Z. The RGB values are normalized using formula (3) to obtain normalized color values R', G' and B', and the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is finally output:

$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$

where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum values of the XYZ coordinates, and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum values of the XYZ coordinates.
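The sampling, translation and normalization steps can be sketched as follows. This is only an illustration under stated assumptions: formulas (2) and (3) are not reproduced in the text, so the sketch assumes min-max scaling of the translated coordinates by each axis extent and division of 8-bit colors by 255. The function names `sample_block` and `preprocess` are hypothetical, and the minimum and maximum are taken over whatever set of points is passed in.

```python
import random

def sample_block(points, threshold, rng):
    """Return exactly `threshold` points drawn from one block."""
    if len(points) >= threshold:
        # Too many points: keep a random subset and discard the rest.
        return rng.sample(points, threshold)
    # Too few points: copy randomly chosen points until the count is reached.
    extra = [rng.choice(points) for _ in range(threshold - len(points))]
    return list(points) + extra

def preprocess(points):
    """Map (X, Y, Z, R, G, B) points to 9-D (X', Y', Z', X, Y, Z, R', G', B')."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    out = []
    for p in points:
        # Formula (1): translate so the minimum coordinates become the origin.
        shifted = [p[i] - mins[i] for i in range(3)]
        # ASSUMED form of formula (2): scale each axis to [0, 1] by its extent.
        normed = [shifted[i] / (maxs[i] - mins[i]) if maxs[i] > mins[i] else 0.0
                  for i in range(3)]
        # ASSUMED form of formula (3): 8-bit color values divided by 255.
        colors = [c / 255.0 for c in p[3:6]]
        out.append(tuple(shifted + normed + colors))
    return out
```

A call such as `preprocess(sample_block(block_points, 4096, random.Random(0)))` would then produce the 9-dimensional input for one block, where `block_points` stands for the points of a single cube.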

Further, in step 2, a specific method for establishing the neural network model based on the feature selection is as follows:

(a) neural network module design

The point cloud data is input into the neural network, a feature matrix is obtained after 5 layers of MLPs, and the local features of the point cloud are obtained through the feature selection module; then, after max pooling and a two-layer MLP, the global features of the point cloud are obtained; finally, the local features and the global features are concatenated, and the semantic category of each point is obtained through a three-layer MLP;

(b) feature selection module design

The feature selection module comprises a max pooling layer, a two-layer MLP, a multiplier and an adder. Let the input of the feature selection module be the feature matrix M₁. The module passes M₁ through max pooling and the two-layer MLP to obtain a weight vector W, then multiplies W with each row vector of M₁ to obtain the feature matrix M₂; finally, M₁ and M₂ are added element-wise to obtain the feature matrix M₃, i.e. the local features of the point cloud.
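The data flow of the feature selection module can be sketched in plain Python as below. This is a minimal illustration, not the patented implementation: the text does not state the MLP activation functions, so ReLU for the hidden layer and a sigmoid for the output layer are assumptions, and the names `dense` and `feature_selection` are hypothetical. Matrices are lists of rows, one row per point.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(vec, weights, bias, activation):
    """One fully connected layer: activation(vec @ weights + bias)."""
    # `weights` has one row per input channel and one column per output channel.
    return [activation(sum(v * w for v, w in zip(vec, col)) + b)
            for col, b in zip(zip(*weights), bias)]

def feature_selection(m1, w_a, b_a, w_b, b_b):
    """Return M3 = M1 + M2, where M2 is M1 scaled channel-wise by W."""
    # Max pooling over all points: one value per feature channel.
    pooled = [max(col) for col in zip(*m1)]
    # Two-layer MLP maps the pooled vector to one weight per channel
    # (ASSUMED activations: ReLU, then sigmoid).
    hidden = dense(pooled, w_a, b_a, relu)
    w = dense(hidden, w_b, b_b, sigmoid)
    # Multiplier: multiply W with every row vector of M1 to get M2.
    m2 = [[x * wc for x, wc in zip(row, w)] for row in m1]
    # Adder: element-wise sum of M1 and M2 gives M3.
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]
```

Because M₃ = M₁ + M₂, every channel of M₁ is carried through unchanged and then boosted in proportion to its weight in W, so channels with weights near zero contribute little beyond their original value.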

Further, in step 3, the specific method for training the neural network is as follows:

the cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32; during training the learning rate is varied: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the decay rate is gradually increased from 0.5 to 0.99.
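The learning rate schedule described above amounts to a step decay. A minimal sketch follows, assuming that a "training cycle" means one epoch and that the halving is applied at epochs 20, 40, 60 and so on; the function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, initial=0.001, factor=0.5, step=20):
    """Step decay: multiply the rate by `factor` once every `step` epochs."""
    return initial * factor ** (epoch // step)
```

Under these assumptions the rate is 0.001 for epochs 0 to 19, 0.0005 for epochs 20 to 39, and 0.00025 for epochs 40 to 59.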

Compared with the prior art, the invention has the following notable advantage: the designed neural network structure comprises a feature selection module, so that feature channels carrying weak semantic information are suppressed, feature channels that play a key role in the segmentation task are enhanced, and the segmentation precision is improved.

Drawings

FIG. 1 is a flow chart of the operation of the point cloud semantic segmentation system of the present invention.

FIG. 2 is a flow chart of the operation of the data processing module of the present invention.

FIG. 3 is a schematic structural diagram of a neural network module according to the present invention.

FIG. 4 is a schematic structural diagram of a feature selection module according to the present invention.

Detailed Description

The invention is further illustrated by the following examples in conjunction with the accompanying drawings.

The invention designs a neural network structure and obtains the semantic label of each point from the trained network, which improves the precision of point cloud semantic segmentation. The system comprises a data processing module and a neural network module, and its working steps are as follows:

Step 1, the data processing module completes the point cloud data preprocessing, which comprises four steps: blocking, sampling, translation and normalization. As shown in FIG. 2, the specific flow is as follows:

First, the point cloud data is divided into a plurality of cubic blocks, and random sampling is then carried out in each block: when the number of points in a block is larger than a set threshold, the excess points are discarded; when the number of points is smaller than the set threshold, points are randomly picked from the block and copied until the number of points reaches the set threshold, which completes the data sampling. Each collected point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values. For convenience of training, the point with the minimum XYZ coordinate values is taken as the coordinate origin, and the coordinate values of the other points are calculated correspondingly using formula (1), which completes the data translation and yields X', Y' and Z'. To improve the segmentation accuracy, X', Y' and Z' are normalized by formula (2), which adds 3 new coordinate dimensions X, Y and Z in the range 0 to 1. The RGB values are normalized by formula (3) to obtain normalized color values R', G' and B' in the range 0 to 1, and the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is finally output:

$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$

Step 2, designing a feature selection module and establishing a neural network model based on feature selection. As shown in FIG. 3 and FIG. 4, the specific process is as follows:

(a) neural network module design

As shown in FIG. 3, the point cloud data is input into the neural network, and a feature matrix M₁ is obtained after 5 layers of MLPs; the local features of the point cloud are then obtained through the feature selection module; next, after max pooling and a two-layer MLP, the global features of the point cloud are obtained; finally, the local features and the global features are concatenated, and the semantic category of each point is obtained through a three-layer MLP.

(b) Feature selection module design

The feature selection module selects useful feature channels by weighting the feature vector of each point. As shown in FIG. 4, the feature selection module comprises a max pooling layer, a two-layer multi-layer perceptron (MLP), a multiplier and an adder. Let the input of the feature selection module be the feature matrix M₁. The module passes M₁ through max pooling and the two-layer MLP to obtain a weight vector W, then multiplies W with each row vector of M₁ to obtain the feature matrix M₂; finally, M₁ and M₂ are added element-wise to obtain the feature matrix M₃, i.e. the local features of the point cloud.

Step 3, training the neural network for semantic segmentation of subsequent point cloud data.

The cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32; during training the learning rate is varied: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the decay rate is gradually increased from 0.5 to 0.99.

The invention designs a neural network structure comprising a feature selection module, obtains the semantic label of each point from the trained network, and improves the precision of point cloud semantic segmentation.

Examples

To verify the effectiveness of the scheme of the invention, the indoor scene point cloud data set S3DIS (Stanford 3D Indoor Semantic Dataset) is used as experimental data, and the following simulation experiment is carried out to predict the semantic label of each point. The data set comprises scanning data of 271 rooms in 6 areas, and each point is annotated with a semantic label. The specific working steps of the system are as follows:

Step 1, the point cloud data preprocessing module carries out four operations: blocking, sampling, translation and normalization. The point cloud data of each room is divided into a plurality of cubic blocks with a side length of 1 meter, and 4096 points are randomly sampled in each block: when the number of points in a block is greater than 4096, the excess points are discarded; when the number of points is less than 4096, points are randomly selected from the block and copied until the number of points reaches 4096, which completes the sampling. The translation and normalization of the data are then done according to formulas (1) to (3).
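The blocking operation in this step can be sketched as a grid lookup. The text does not say how the 1-meter cubes are positioned, so this sketch assumes an axis-aligned grid anchored at the coordinate origin; the function name `split_into_blocks` is hypothetical.

```python
import math
from collections import defaultdict

def split_into_blocks(points, side=1.0):
    """Group points by the integer index of the cube of edge `side` containing them."""
    blocks = defaultdict(list)
    for p in points:
        # The first three entries of each point are its X, Y, Z coordinates.
        key = tuple(math.floor(c / side) for c in p[:3])
        blocks[key].append(p)
    return dict(blocks)
```

Each value of the returned dictionary is the point list of one block, which would then be resampled to 4096 points as described above.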

Step 2, constructing a neural network based on feature selection, which is specifically as follows:

a. neural network module design

First, the 4096 × 9 point cloud data is input into the neural network, and an n × 1024 (n = 4096) feature matrix M₁ is obtained through 5 layers of MLPs with sizes of 64, 128 and 1024 in sequence; feature selection is applied to M₁ to obtain the 4096 × 1024 feature matrix M₃ as the local features of the point cloud. Max pooling is then applied to M₃ to obtain a 1 × 1024 feature vector, which passes through one 256-dimensional MLP layer and one 128-dimensional MLP layer to yield a 1 × 128 feature vector as the global feature of the point cloud. Finally, the local features and the global feature are concatenated into a 4096 × 1152 feature matrix, which passes through one 512-dimensional, one 256-dimensional and one 13-dimensional (the number of semantic segmentation classes) MLP layer to yield a 4096 × 13 matrix, giving the semantic class of each point and completing the semantic segmentation of the object.
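The matrix sizes quoted above can be checked with a small shape trace. This sketch only does the bookkeeping and runs no network; it assumes the 1 × 128 global feature is repeated for every point before concatenation, which the 4096 × 1152 result implies but the text does not state. The function name `trace_shapes` and the stage labels are hypothetical.

```python
def trace_shapes(n_points=4096, in_dim=9, n_classes=13):
    """List (stage, rows, columns) for each stage of the network."""
    stages = [("input", n_points, in_dim)]
    stages.append(("M1 after 5 MLP layers", n_points, 1024))
    stages.append(("M3 local features", n_points, 1024))
    stages.append(("max pooling over points", 1, 1024))
    for width in (256, 128):
        stages.append((f"global MLP {width}", 1, width))
    global_dim = stages[-1][2]
    # ASSUMED: the global vector is repeated per point, so widths add up.
    stages.append(("concatenated features", n_points, 1024 + global_dim))
    for width in (512, 256, n_classes):
        stages.append((f"per-point MLP {width}", n_points, width))
    return stages
```

Printing the result of `trace_shapes()` lists ten stages, from the 4096 × 9 input through the 4096 × 1152 concatenated matrix to the 4096 × 13 output.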

b. Feature selection module design

First, max pooling is applied to M₁ to obtain a 1 × 1024 feature vector, which passes through one 128-dimensional MLP layer and one 1024-dimensional MLP layer to yield a 1 × 1024 weight vector W. Next, W is multiplied with each row vector of M₁ to select useful feature channels, giving the 4096 × 1024 feature matrix M₂. Then M₁ and M₂ are added element-wise to obtain the 4096 × 1024 feature matrix M₃, i.e. the local features of the point cloud.

Step 3, training the neural network to obtain the semantic category of each point.

The cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9 so that the loss of the network is minimized, and the network is trained with a batch size of 32. A varying learning rate is adopted: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the decay rate is gradually increased from 0.5 to 0.99. The neural network is implemented using the TensorFlow framework. In this embodiment, the performance of FSNet on the S3DIS data set is evaluated by overall accuracy (OA) and mean intersection over union (mIoU). Because the S3DIS data set includes 6 areas, a 6-fold cross-validation scheme is adopted for the experimental comparison: each time one area is picked out as the test set and the other 5 areas are used as the training set, and the arithmetic mean of the 6 groups of experiments is taken as the overall semantic segmentation result on the S3DIS data set, as shown in Table 1:

Table 1. Overall semantic segmentation results on the S3DIS data set

It can be seen from Table 1 that PointNet achieves an OA of 79.1% and an mIoU of 46.7%; the OA obtained by the present method is 1.1% higher than that of PointNet, and the mIoU is 1.0% higher. This indicates that mining the dependencies between feature channels is meaningful for semantic segmentation tasks.
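The evaluation protocol used in this embodiment (OA, mIoU and leave-one-area-out cross-validation) can be sketched as follows. The function names are hypothetical, labels are assumed to be integer class indices, and skipping classes that appear in neither the prediction nor the ground truth is one common convention rather than something the text specifies.

```python
def overall_accuracy(pred, truth):
    """Fraction of points whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def mean_iou(pred, truth, n_classes):
    """Average over classes of intersection / union of predicted and true points."""
    ious = []
    for c in range(n_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:  # ASSUMED convention: skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious)

def six_fold_splits(areas=(1, 2, 3, 4, 5, 6)):
    """Each fold tests on one area and trains on the remaining five."""
    return [(test, [a for a in areas if a != test]) for test in areas]
```

The overall result would then be the arithmetic mean of the per-fold OA and mIoU values over the six splits.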

Claims

1. A point cloud semantic segmentation method is characterized by comprising the following steps:

2. The point cloud semantic segmentation method according to claim 1, wherein in step 1, the point cloud data is preprocessed by a specific method comprising:

$$X' = X - X_{\min}, \quad Y' = Y - Y_{\min}, \quad Z' = Z - Z_{\min} \tag{1}$$

where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum values of the XYZ coordinates, and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum values of the XYZ coordinates.

3. The point cloud semantic segmentation method according to claim 1, wherein in the step 2, a specific method for establishing the neural network model based on feature selection is as follows:

(a) neural network module design

(b) feature selection module design

4. The point cloud semantic segmentation method according to claim 1, wherein in the step 3, a specific method for training the neural network is as follows: