CN111223120A - Point cloud semantic segmentation method - Google Patents


Info

Publication number
CN111223120A
Authority
CN
China
Prior art keywords
point cloud
neural network
feature selection
feature
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911262240.8A
Other languages
Chinese (zh)
Other versions
CN111223120B (en)
Inventor
潘琳琳
孔慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN201911262240.8A
Publication of CN111223120A
Application granted
Publication of CN111223120B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a point cloud semantic segmentation method comprising: preprocessing point cloud data through blocking, sampling, translation and normalization; designing a feature selection module and establishing a neural network model based on feature selection; and training the neural network for semantic segmentation of subsequent point clouds. Because the designed network structure includes a feature selection module, feature channels carrying weak semantic information are suppressed and feature channels that play a key role in the segmentation task are enhanced. The semantic label of each point is obtained from the trained network, and the precision of point cloud semantic segmentation is improved.

Description

Point cloud semantic segmentation method
Technical Field
The invention relates to a point cloud segmentation technology, in particular to a point cloud semantic segmentation method.
Background
Point cloud semantic segmentation divides a point cloud into semantically meaningful parts and is an important research direction in computer vision. Existing point cloud semantic segmentation methods concentrate on extracting features from point cloud data and neglect the fact that individual feature channels differ in importance for the segmentation task. It is therefore necessary to design a network structure based on feature selection that suppresses feature channels carrying weak semantic information, enhances feature channels that play a key role in the segmentation task, and thereby improves segmentation accuracy.
Disclosure of Invention
The invention aims to provide a point cloud semantic segmentation method.
The technical solution for realizing the purpose of the invention is as follows: a point cloud semantic segmentation method comprises the following steps:
step 1, point cloud data preprocessing is carried out, wherein the point cloud data preprocessing comprises operations of blocking, sampling, translation and normalization;
step 2, designing a feature selection module, and establishing a neural network model based on feature selection;
step 3, training the neural network for semantic segmentation of subsequent point clouds.
Further, in step 1, a specific method for point cloud data preprocessing is as follows:
first, the point cloud data is divided into a plurality of cubic blocks, and random sampling is then carried out within each block: when the number of points in a block is larger than a set threshold, the excess points are discarded; when it is smaller than the threshold, points are randomly picked from the block and copied until the number of points reaches the threshold, which completes the data sampling; each sampled point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values; the point with the minimum XYZ coordinate values is taken as the coordinate origin and the coordinates of all other points are recalculated with formula (1), which completes the data translation and yields X', Y' and Z'; X', Y' and Z' are normalized with formula (2), which adds 3 new normalized coordinate values X, Y and Z; the RGB values are normalized with formula (3) to obtain normalized color values R', G' and B'; finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min},\quad Y' = Y - Y_{\min},\quad Z' = Z - Z_{\min} \tag{1}$$
[Formula (2): equation image not reproduced in the source text]
[Formula (3): equation image not reproduced in the source text]
where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum XYZ coordinate values and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum XYZ coordinate values.
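The translation and normalization step can be sketched in NumPy as follows. Only formula (1) is taken from the text. The forms used for formulas (2) and (3) are assumptions, because the equation images are missing: min-max scaling by (max - min) for the coordinates and division by 255 for 8-bit colors. The function name `translate_and_normalize` is hypothetical.

```python
import numpy as np

def translate_and_normalize(points):
    """Turn an (N, 6) array of X, Y, Z, R, G, B into an (N, 9) array.

    Output columns: X', Y', Z' (translated), X, Y, Z (normalized), R', G', B'.
    """
    points = np.asarray(points, dtype=np.float64)
    xyz, rgb = points[:, :3], points[:, 3:6]
    xyz_min = xyz.min(axis=0)
    xyz_max = xyz.max(axis=0)

    # Formula (1) from the text: shift so the minimum corner is the origin.
    shifted = xyz - xyz_min

    # ASSUMED form of formula (2): divide by the coordinate extent so each
    # axis falls in [0, 1]. A zero extent is replaced by 1 to avoid 0/0.
    extent = np.where(xyz_max > xyz_min, xyz_max - xyz_min, 1.0)
    normalized = shifted / extent

    # ASSUMED form of formula (3): 8-bit color values divided by 255.
    colors = rgb / 255.0

    return np.concatenate([shifted, normalized, colors], axis=1)
```

Whether the minimum and maximum are taken per block or per room is not stated in this passage; the sketch simply uses whatever set of points it is given.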
Further, in step 2, a specific method for establishing the neural network model based on the feature selection is as follows:
(a) neural network module design
Point cloud data is input into the neural network; a feature matrix is obtained after 5 layers of MLPs, and the local features of the point cloud are obtained from it by the feature selection module; the global features of the point cloud are then obtained through max pooling and a two-layer MLP; finally, the local features and the global features are concatenated, and the semantic category of each point is obtained through a three-layer MLP;
(b) feature selection module design
The feature selection module comprises a max pooling layer, a two-layer MLP, an adder and a multiplier. Let the input of the feature selection module be the feature matrix M1. The module obtains a weight vector W from M1 through max pooling and the two-layer MLP, and then multiplies W element-wise with each row vector of M1 to obtain a feature matrix M2; finally, M1 and M2 are added element-wise to obtain the feature matrix M3, i.e. the local features of the point cloud.
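A minimal NumPy sketch of the feature selection module described above. The max pooling, two-layer MLP, multiplier and adder follow the text. The activation functions are assumptions, since the text does not name them: ReLU on the hidden layer and a sigmoid on the output so the weights lie between 0 and 1. The parameter names `w_a`, `b_a`, `w_b`, `b_b` are hypothetical.

```python
import numpy as np

def feature_selection(m1, w_a, b_a, w_b, b_b):
    """Apply channel weighting to a feature matrix.

    m1: (n, c) per-point features. w_a: (c, h), b_a: (h,), w_b: (h, c), b_b: (c,).
    Returns (m3, w) where m3 is (n, c) and w is the (1, c) weight vector.
    """
    # Max pooling over the points gives one value per feature channel.
    pooled = m1.max(axis=0, keepdims=True)            # (1, c)
    # Two-layer MLP. ASSUMPTION: ReLU hidden activation, sigmoid output.
    hidden = np.maximum(pooled @ w_a + b_a, 0.0)      # (1, h)
    w = 1.0 / (1.0 + np.exp(-(hidden @ w_b + b_b)))   # (1, c)
    # Multiplier: W is broadcast across every row (point) of M1.
    m2 = m1 * w
    # Adder: element-wise sum of the input and the weighted features.
    m3 = m1 + m2
    return m3, w
```

Because W has one entry per channel and is shared by all points, the weighting depends only on the pooled, block-level statistics of each channel.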
Further, in step 3, the specific method for training the neural network is as follows:
the cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32; the learning rate is varied during training: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the learning decay rate is gradually increased from 0.5 to 0.99.
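The learning-rate schedule stated above can be written as a small step-decay function. This is a sketch that assumes a "training cycle" means one epoch; the function name `learning_rate` is hypothetical. The separate decay rate that rises from 0.5 to 0.99 is not modeled, because the text does not say how it is applied.

```python
def learning_rate(epoch, base_lr=0.001, factor=0.5, step=20):
    """Step decay: multiply the rate by `factor` once every `step` epochs."""
    return base_lr * factor ** (epoch // step)
```

Under this reading, epochs 0 to 19 use 0.001, epochs 20 to 39 use 0.0005, and so on.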
Compared with the prior art, the invention has the following notable advantage: the designed neural network structure includes a feature selection module, so that feature channels carrying weak semantic information are suppressed, feature channels that play a key role in the segmentation task are enhanced, and segmentation precision is improved.
Drawings
FIG. 1 is a flow chart of the operation of the point cloud semantic segmentation system of the present invention.
FIG. 2 is a flow chart of the operation of the data processing module of the present invention.
Fig. 3 is a schematic structural diagram of a neural network module according to the present invention.
Fig. 4 is a schematic structural diagram of a feature selection module according to the present invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings.
The invention designs a neural network structure and obtains the semantic label of each point from the trained network, improving the precision of point cloud semantic segmentation. The system comprises a data processing module and a neural network module, and works in the following steps:
Step 1, the data processing module completes the point cloud data preprocessing, which comprises the four steps of blocking, sampling, translation and normalization. As shown in FIG. 2, the specific flow is as follows:
First, the point cloud data is divided into a plurality of cubic blocks, and random sampling is then carried out within each block: when the number of points in a block is larger than a set threshold, the excess points are discarded; when it is smaller than the threshold, points are randomly picked from the block and copied until the number of points reaches the threshold, which completes the data sampling. Each collected point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values. For the convenience of training, the point with the minimum XYZ coordinate values is taken as the coordinate origin and the coordinates of all other points are recalculated with formula (1), which completes the data translation and yields X', Y' and Z'. To improve segmentation accuracy, X', Y' and Z' are normalized with formula (2), which adds 3 new normalized coordinate values X, Y and Z in the range 0 to 1. The RGB values are normalized with formula (3) to obtain normalized color values R', G' and B' in the range 0 to 1. Finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min},\quad Y' = Y - Y_{\min},\quad Z' = Z - Z_{\min} \tag{1}$$
[Formula (2): equation image not reproduced in the source text]
[Formula (3): equation image not reproduced in the source text]
step 2, designing a feature selection module, and establishing a neural network model based on feature selection, as shown in fig. 3 and 4, the specific process is as follows:
(a) neural network module design
As shown in FIG. 3, the point cloud data is input into the neural network and a feature matrix M1 is obtained after 5 layers of MLPs; the local features of the point cloud are then obtained from M1 by the feature selection module; next, the global features of the point cloud are obtained through max pooling and a two-layer MLP; finally, the local features and the global features are concatenated, and the semantic category of each point is obtained through a three-layer MLP.
(b) Feature selection module design
The feature selection module selects useful feature channels by weighting the feature vector of each point. As shown in FIG. 4, the module comprises a max pooling layer, a two-layer multi-layer perceptron (MLP), an adder and a multiplier. Let the input of the feature selection module be the feature matrix M1. The module obtains a weight vector W from M1 through max pooling and the two-layer MLP, and then multiplies W element-wise with each row vector of M1 to obtain a feature matrix M2; finally, M1 and M2 are added element-wise to obtain the feature matrix M3, i.e. the local features of the point cloud.
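Written out element by element, the module described above amounts to the following. The symbols n (number of points), C (number of feature channels) and the indices i, j are notation introduced here for illustration and do not appear in the source.

```latex
M_1 \in \mathbb{R}^{n \times C}, \qquad
W = \mathrm{MLP}_2\!\left(\max_{i} M_1[i,:]\right) \in \mathbb{R}^{1 \times C}
\\[4pt]
M_2[i,j] = M_1[i,j]\, W[j], \qquad
M_3[i,j] = M_1[i,j] + M_2[i,j] = M_1[i,j]\,\bigl(1 + W[j]\bigr)
```

One consequence of the adder is that a channel whose weight W[j] is zero is passed through unchanged rather than removed, so channels are emphasized or de-emphasized relative to one another.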
Step 3, training the neural network for semantic segmentation of subsequent point cloud data:
The cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32. The learning rate is varied during training: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the learning decay rate is gradually increased from 0.5 to 0.99.
The invention designs a neural network structure comprising a feature selection module, obtains the semantic label of each point through a training network, and improves the precision of point cloud semantic segmentation.
Examples
To verify the effectiveness of the scheme, the indoor scene point cloud data set S3DIS (Stanford Large-Scale 3D Indoor Spaces Dataset) is used as experimental data, and the following simulation experiment is carried out to predict the semantic label of each point. The data set comprises scan data of 271 rooms in 6 areas, with a semantic label on every point. The specific working steps of the system are as follows:
Step 1, the point cloud data preprocessing module carries out the four operations of blocking, sampling, translation and normalization. For each room, the point cloud data is divided into cubic blocks with a side length of 1 meter, and 4096 points are randomly sampled in each block: when a block contains more than 4096 points, the excess points are discarded; when it contains fewer than 4096 points, points are randomly selected from the block and copied until the number of points reaches 4096, which completes the sampling. The data is then translated and normalized according to formulas (1) to (3).
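The blocking and sampling operations can be sketched as follows. The function names `split_into_blocks` and `sample_block` are hypothetical, and aligning the grid to the coordinate origin is an assumption, as the text does not say where the block grid starts.

```python
import numpy as np

def split_into_blocks(points, side=1.0):
    """Group the rows of `points` by the cubic cell (edge `side`) that holds them.

    points: (N, D) array whose first three columns are X, Y, Z.
    Returns a dict mapping integer cell indices (ix, iy, iz) to (M, D) arrays.
    """
    cells = np.floor(points[:, :3] / side).astype(np.int64)
    groups = {}
    for idx, key in enumerate(map(tuple, cells)):
        groups.setdefault(key, []).append(idx)
    return {key: points[np.array(rows)] for key, rows in groups.items()}

def sample_block(block, num_points=4096, rng=None):
    """Return exactly `num_points` rows from a block.

    Larger blocks are randomly subsampled without replacement (the excess
    is discarded); smaller blocks are padded with randomly chosen copies.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = block.shape[0]
    if n >= num_points:
        keep = rng.choice(n, size=num_points, replace=False)
        return block[keep]
    extra = rng.choice(n, size=num_points - n, replace=True)
    return np.concatenate([block, block[extra]], axis=0)
```

With `side=1.0` and `num_points=4096` this matches the block size and threshold given in the example, so every block yields a fixed-size input for the network.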
Step 2, constructing a neural network based on feature selection, which is specifically as follows:
a. neural network module design
First, the 4096 × 9 point cloud data is input into the neural network, and an n × 1024 feature matrix M1 (here n = 4096) is obtained through 5 layers of MLPs with sizes of 64, 128 and 1024 in sequence. Feature selection is applied to M1 to obtain a 4096 × 1024 feature matrix M3, which serves as the local features of the point cloud. Max pooling is then applied to M3 to obtain a 1 × 1024 feature vector, which passes through one 256-dimensional MLP layer and one 128-dimensional MLP layer to give a 1 × 128 feature vector as the global feature of the point cloud. Finally, the local features and the global feature are concatenated into a 4096 × 1152 feature matrix, which passes through one 512-dimensional, one 256-dimensional and one 13-dimensional (the number of semantic classes) MLP layer to give a 4096 × 13 matrix, from which the semantic category of each point is obtained, completing the semantic segmentation.
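The tensor shapes in this walk-through can be checked with a forward pass that uses random, untrained weights. This is a shape sketch only, not the trained network. The five layer widths (64, 64, 64, 128, 1024) are an assumption, because the text lists only 64, 128 and 1024 for five layers. Applying ReLU after every layer, including the last, is a simplification. The names `mlp` and `forward_shapes` are hypothetical.

```python
import numpy as np

def mlp(x, widths, rng):
    """Apply dense layers with random weights and ReLU to each row of x."""
    for width in widths:
        weights = rng.standard_normal((x.shape[1], width)) * 0.05
        x = np.maximum(x @ weights, 0.0)
    return x

def forward_shapes(n=4096, num_classes=13, seed=0):
    """Run the described pipeline on random data and record each shape."""
    rng = np.random.default_rng(seed)
    shapes = {}
    x = rng.random((n, 9))                          # preprocessed 9-D points
    m1 = mlp(x, (64, 64, 64, 128, 1024), rng)       # ASSUMED layer widths
    shapes["M1"] = m1.shape
    # Feature selection: max pool, 128-D and 1024-D layers, multiply, add.
    w = mlp(m1.max(axis=0, keepdims=True), (128, 1024), rng)
    m3 = m1 + m1 * w
    shapes["W"], shapes["M3"] = w.shape, m3.shape
    # Global feature: max pool, then 256-D and 128-D layers.
    g = mlp(m3.max(axis=0, keepdims=True), (256, 128), rng)
    shapes["global"] = g.shape
    # Concatenate the global feature onto every point's local feature.
    fused = np.concatenate([m3, np.repeat(g, n, axis=0)], axis=1)
    shapes["fused"] = fused.shape
    scores = mlp(fused, (512, 256, num_classes), rng)
    shapes["scores"] = scores.shape
    labels = scores.argmax(axis=1)                  # one class index per point
    return shapes, labels
```

The fused width of 1152 is the sum of the 1024 local channels and the 128 global channels, which agrees with the figure quoted in the text.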
b. Feature selection module design
First, max pooling is applied to M1 to obtain a 1 × 1024 feature vector, which passes through one 128-dimensional MLP layer and one 1024-dimensional MLP layer to give a 1 × 1024 weight vector W. W is then multiplied element-wise with each row vector of M1 to select useful feature channels, giving a 4096 × 1024 feature matrix M2. Finally, M1 and M2 are added element-wise to obtain the 4096 × 1024 feature matrix M3, i.e. the local features of the point cloud.
Step 3, training the neural network to obtain the semantic category of each point.
The cross-entropy loss function of the network is minimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32. A varying learning rate is used: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the learning decay rate is gradually increased from 0.5 to 0.99. The neural network is implemented with the TensorFlow framework. This embodiment reports the experimental performance of FSNet on the S3DIS data set in terms of overall accuracy (OA) and mean intersection over union (mIoU). Because the S3DIS data set comprises 6 areas, 6-fold cross-validation is used for the experimental comparison: each time one area is held out as the test set and the other 5 areas are used as the training set, and the arithmetic mean of the 6 experiments gives the overall semantic segmentation result on the S3DIS data set, as shown in Table 1:
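The two metrics and the leave-one-area-out protocol can be sketched as follows. These are the standard definitions of overall accuracy and mIoU computed from a confusion matrix. Skipping classes that appear in neither the labels nor the predictions is a choice made here, not something the source specifies, and all function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix with rows as true classes and columns as predictions."""
    idx = np.asarray(y_true) * num_classes + np.asarray(y_pred)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def overall_accuracy(cm):
    """Fraction of points whose predicted class equals the true class."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN), skipping absent classes."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    return (tp[valid] / union[valid]).mean()

def six_fold_splits(areas=(1, 2, 3, 4, 5, 6)):
    """Return (training areas, held-out test area) for each fold."""
    return [(tuple(a for a in areas if a != held), held) for held in areas]
```

The reported result would then be the arithmetic mean of the per-fold OA and mIoU values over the six splits.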
Table 1. Overall semantic segmentation results on the S3DIS data set
[Table 1: table image not reproduced in the source text]
Table 1 shows that PointNet reaches an OA of 79.1% and an mIoU of 46.7%, and that the OA and mIoU obtained by the invention are 1.1% and 1.0% higher than those of PointNet, respectively. This indicates that mining dependencies between feature channels is meaningful for semantic segmentation tasks.

Claims (4)

1. A point cloud semantic segmentation method is characterized by comprising the following steps:
step 1, point cloud data preprocessing is carried out, wherein the point cloud data preprocessing comprises operations of blocking, sampling, translation and normalization;
step 2, designing a feature selection module, and establishing a neural network model based on feature selection;
step 3, training the neural network for semantic segmentation of subsequent point clouds.
2. The point cloud semantic segmentation method according to claim 1, wherein in step 1, the point cloud data is preprocessed by a specific method comprising:
first, the point cloud data is divided into a plurality of cubic blocks, and random sampling is then carried out within each block: when the number of points in a block is larger than a set threshold, the excess points are discarded; when it is smaller than the threshold, points are randomly picked from the block and copied until the number of points reaches the threshold, which completes the data sampling; each sampled point is a 6-dimensional vector comprising XYZ coordinate values and RGB color values; the point with the minimum XYZ coordinate values is taken as the coordinate origin and the coordinates of all other points are recalculated with formula (1), which completes the data translation and yields X', Y' and Z'; X', Y' and Z' are normalized with formula (2), which adds 3 new normalized coordinate values X, Y and Z; the RGB values are normalized with formula (3) to obtain normalized color values R', G' and B'; finally, the processed 9-dimensional point cloud data (X', Y', Z', X, Y, Z, R', G', B') is output:
$$X' = X - X_{\min},\quad Y' = Y - Y_{\min},\quad Z' = Z - Z_{\min} \tag{1}$$
[Formula (2): equation image not reproduced in the source text]
[Formula (3): equation image not reproduced in the source text]
where $X_{\min}$, $Y_{\min}$, $Z_{\min}$ are the minimum XYZ coordinate values and $X_{\max}$, $Y_{\max}$, $Z_{\max}$ are the maximum XYZ coordinate values.
3. The point cloud semantic segmentation method according to claim 1, wherein in the step 2, a specific method for establishing the neural network model based on feature selection is as follows:
(a) neural network module design
point cloud data is input into the neural network; a feature matrix is obtained after 5 layers of MLPs, and the local features of the point cloud are obtained from it by the feature selection module; the global features of the point cloud are then obtained through max pooling and a two-layer MLP; finally, the local features and the global features are concatenated, and the semantic category of each point is obtained through a three-layer MLP;
(b) feature selection module design
the feature selection module comprises a max pooling layer, a two-layer MLP, an adder and a multiplier; the input of the feature selection module is set as a feature matrix M1; the module obtains a weight vector W from M1 through max pooling and the two-layer MLP, and then multiplies W element-wise with each row vector of M1 to obtain a feature matrix M2; finally, M1 and M2 are added element-wise to obtain a feature matrix M3, i.e. the local features of the point cloud.
4. The point cloud semantic segmentation method according to claim 1, wherein in step 3, the specific method for training the neural network is as follows:
the cross-entropy loss function of the network is optimized using the ADAM algorithm with a momentum parameter of 0.9, and the network is trained with a batch size of 32; the learning rate is varied during training: the initial learning rate is 0.001 and it is reduced to 0.5 times its previous value every 20 training cycles, while the learning decay rate is gradually increased from 0.5 to 0.99.
CN201911262240.8A 2019-12-10 2019-12-10 Point cloud semantic segmentation method Active CN111223120B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911262240.8A CN111223120B (en) 2019-12-10 2019-12-10 Point cloud semantic segmentation method

Publications (2)

Publication Number Publication Date
CN111223120A (en) 2020-06-02
CN111223120B CN111223120B (en) 2023-08-04

Family

ID=70830735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911262240.8A Active CN111223120B (en) 2019-12-10 2019-12-10 Point cloud semantic segmentation method

Country Status (1)

Country Link
CN (1) CN111223120B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523552A (en) * 2018-10-24 2019-03-26 青岛智能产业技术研究院 Three-dimension object detection method based on cone point cloud
US20190108639A1 (en) * 2017-10-09 2019-04-11 The Board Of Trustees Of The Leland Stanford Junior University Systems and Methods for Semantic Segmentation of 3D Point Clouds
CN110322453A (en) * 2019-07-05 2019-10-11 西安电子科技大学 3D point cloud semantic segmentation method based on position attention and auxiliary network

Also Published As

Publication number Publication date
CN111223120B (en) 2023-08-04

Similar Documents

Publication Publication Date Title
CN112750140B (en) Information mining-based disguised target image segmentation method
CN110660062B (en) Point cloud instance segmentation method and system based on PointNet
CN108764317B (en) Residual convolutional neural network image classification method based on multipath feature weighting
CN108664981B (en) Salient image extraction method and device
CN111340814A (en) Multi-mode adaptive convolution-based RGB-D image semantic segmentation method
CN110222760B (en) Quick image processing method based on winograd algorithm
CN113822209B (en) Hyperspectral image recognition method and device, electronic equipment and readable storage medium
CN107564009B (en) Outdoor scene multi-target segmentation method based on deep convolutional neural network
CN107610146A (en) Image scene segmentation method, apparatus, computing device and computer-readable storage medium
CN109934272B (en) Image matching method based on full convolution network
CN110889416B (en) Salient object detection method based on cascade improved network
CN108205703B (en) Multi-input multi-output matrix average value pooling vectorization implementation method
CN111476247B (en) CNN method and device using 1xK or Kx1 convolution operation
CN107679539B (en) Single convolution neural network local information and global information integration method based on local perception field
CN115147598B (en) Target detection segmentation method and device, intelligent terminal and storage medium
CN111476341A (en) Method and device for converting CNN convolution layer
CN107292458A (en) A kind of Forecasting Methodology and prediction meanss applied to neural network chip
CN111008631B (en) Image association method and device, storage medium and electronic device
JP2020119524A (en) Learning method and learning device for extracting feature from input image in multiple blocks in cnn, so that hardware optimization which can satisfies core performance index can be performed, and testing method and testing device using the same
JP7085600B2 (en) Similar area enhancement method and system using similarity between images
CN115984701A (en) Multi-modal remote sensing image semantic segmentation method based on coding and decoding structure
CN110110849B (en) Line fixed data stream mapping method based on graph segmentation
CN105354228A (en) Similar image searching method and apparatus
CN114581789A (en) Hyperspectral image classification method and system
CN114202026A (en) Multitask model training method and device and multitask processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant