CN116310350A - Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network - Google Patents

Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network Download PDF

Info

Publication number
CN116310350A
CN116310350A (application CN202310596881.7A)
Authority
CN
China
Prior art keywords
point
points
network
category
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310596881.7A
Other languages
Chinese (zh)
Other versions
CN116310350B (en)
Inventor
王程 (Wang Cheng)
陈钧 (Chen Jun)
陈一平 (Chen Yiping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202310596881.7A priority Critical patent/CN116310350B/en
Publication of CN116310350A publication Critical patent/CN116310350A/en
Application granted granted Critical
Publication of CN116310350B publication Critical patent/CN116310350B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network, which comprises the following steps: S1, pre-training a graph convolution network to obtain initialization parameters; S2, inputting an original point set P at one time and outputting a feature vector F1; S3, for the original point set P, computing a feature vector F2 from the neighborhood of each point; S4, calculating the distance between the feature vectors F1 and F2 as a loss function to adjust the parameters of the graph convolution network; S5, using the labeled data to assign pseudo labels to the unlabeled data; S6, performing semantic segmentation on the pseudo-labeled dataset and predicting the category of each point. The method can realize semantic segmentation of urban road scenes with only a small amount of labeled data.

Description

Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network
Technical Field
The invention relates to the field of computer graphics, in particular to an urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network.
Background
As unstructured three-dimensional data, point clouds characterize objects more accurately and flexibly than data formats such as voxels and meshes, and they are widely used in the field of smart cities. For example, in urban construction planning, point clouds are used to generate digital traffic maps and to assist the planning of traffic routes and urban construction, improving planning efficiency and precision; in environmental monitoring and analysis, point cloud data can be used to build three-dimensional models of real scenes for analyzing landforms, hydrogeology, building damage and other conditions, which facilitates city management and maintenance.
In practical smart city applications, the pipeline from point cloud acquisition to application can generally be divided into the following five steps: (1) point cloud acquisition; (2) point cloud preprocessing; (3) point cloud feature extraction; (4) point cloud semantic segmentation; (5) downstream model deployment and application.
One difficulty in this pipeline is that feature extraction and semantic segmentation require a large amount of annotated data for model training. For feature extraction and semantic segmentation, traditional methods extract features with hand-designed feature descriptors, while deep learning methods extract features automatically with neural networks; both then perform semantic segmentation. However, the training of these methods is usually supervised learning, i.e. a large amount of labeled data is required to train the model. The point clouds of urban scenes are huge in scale, and manually labeling every point is cumbersome and expensive.
Disclosure of Invention
The invention aims to overcome the difficulty that urban scene semantic segmentation algorithms need a large amount of annotated data, and provides an urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network.
The urban scene semantic segmentation method based on the graph convolution and semi-supervised learning network comprises the following steps:
S1, pre-training a graph convolution network with a public, labeled urban road dataset to obtain the initialization parameters of each layer of the graph convolution network;
S2, inputting an original point set P into the initialized graph convolution network at one time, where the points in P contain only coordinate (xyz) and color (rgb) information, and outputting a feature vector F1 with the graph convolution network;
S3, for the original point set P in step S2, using k-NN to find the k nearest points of each point to form its neighborhood, and computing a feature vector F2 from the neighborhood of each point;
S4, calculating the distance between the feature vectors F1 and F2 as a loss function for adjusting the parameters of the graph convolution network in step S2;
S5, taking the original point set P as the target semantic segmentation dataset T, which contains labeled data and unlabeled data, where the labeled data accounts for 1%-10% of the points of the original point set P; then, in the semi-supervised learning network, using the labeled data to assign pseudo labels to the unlabeled data;
S6, using the pseudo-labeled dataset T of step S5 for network inference, semantically segmenting it and predicting the category of each point.
Further, step S2 specifically includes:
S21, encoding the original point set P with the encoder of the graph convolution network to obtain an encoding feature f_E;
S22, decoding the encoding feature f_E with the decoder of the graph convolution network to obtain a decoding feature f_D;
S23, mapping the decoding feature f_D through an MLP to output the feature vector F1, where the dimension of each point in F1 is expressed as (r^1_x, r^1_y, r^1_z, r^1_r, r^1_g, r^1_b, r^2_r, r^2_g, r^2_b); the r-terms represent the encoded coordinate and color features respectively, the subscript of r denotes the feature channel, and a superscript 1 denotes a mean while a superscript 2 denotes a variance.
Further, step S3 specifically includes:
For the original point set P in step S2, k-NN is used to find the k nearest points of each point to form a neighborhood, and the feature vector F2 is computed from the neighborhood of each point, where the dimension of each point is expressed as (mu_x, mu_y, mu_z, mu_r, mu_g, mu_b, sigma^2_r, sigma^2_g, sigma^2_b).
The calculation process is:
mu_coord,i = (1/k) * sum_{n=1..k} coord_{n,i}
mu_color,i = (1/k) * sum_{n=1..k} color_{n,i}
sigma^2_color,i = (1/k) * sum_{n=1..k} (color_{n,i} - mu_color,i)^2
where mu_coord,i denotes the mean of the i-th neighborhood coordinate channel of a point, mu_color,i the mean of the i-th neighborhood color channel, and sigma^2_color,i the variance of the i-th neighborhood color channel; i takes 1, 2, 3, so that the self-learning process uses the same nine feature channels for F1 and F2, each feature channel corresponding to one feature distance to be calculated; n indexes the k neighboring points of each point.
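As a minimal illustration of how the neighborhood feature F2 of step S3 can be computed, the following Python sketch uses a k-d tree for the k-NN search; the array names and the choice of k are assumptions rather than values fixed by the patent.

```python
import numpy as np
from scipy.spatial import cKDTree

def neighborhood_features(xyz, rgb, k=16):
    """Compute F2: per-point (coord mean x3, color mean x3, color variance x3)
    over each point's k nearest neighbors, as described in step S3."""
    tree = cKDTree(xyz)
    _, idx = tree.query(xyz, k=k)          # idx: [N, k] neighbor indices
    neigh_xyz = xyz[idx]                   # [N, k, 3]
    neigh_rgb = rgb[idx]                   # [N, k, 3]
    mu_coord = neigh_xyz.mean(axis=1)      # mean of neighborhood coordinates
    mu_color = neigh_rgb.mean(axis=1)      # mean of neighborhood colors
    var_color = neigh_rgb.var(axis=1)      # variance of neighborhood colors
    return np.concatenate([mu_coord, mu_color, var_color], axis=1)  # [N, 9]

# Usage example with random stand-in data:
# xyz = np.random.rand(65536, 3); rgb = np.random.rand(65536, 3)
# F2 = neighborhood_features(xyz, rgb, k=16)
```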
Further, step S4 specifically includes:
Suppose the original point set P input into the graph convolution network in step S2 contains N points. The coordinate distance is calculated as the Euclidean distance and the color distance as the Manhattan distance.
The loss function of the coordinate distance, L_coord, is:
L_coord = sum_{i=1..N} sqrt( sum_{c in {x,y,z}} (r^1_{i,c} - mu_{i,c})^2 )
The loss function of the color distance, L_color, is:
L_color = sum_{i=1..N} sum_{c in {r,g,b}} ( |r^1_{i,c} - mu_{i,c}| + |r^2_{i,c} - sigma^2_{i,c}| )
Finally, the loss function is:
L_self = alpha * L_coord + beta * L_color
where i indexes the points of the original point set P, and alpha and beta are two hyperparameters of the graph convolution network, set to 1/3 and 2/3 respectively. This loss function is used to train the graph convolution network of step S2 and thereby further adjust the parameters of its encoder and decoder.
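The following sketch shows one way the self-supervised loss of step S4 could be computed from F1 and F2; the per-point averaging and tensor layout are assumptions consistent with the definitions above, not a verbatim transcription of the patent's formulas.

```python
import torch

def self_supervised_loss(F1, F2, alpha=1/3, beta=2/3):
    """F1, F2: [N, 9] tensors laid out as
    (coord mean x3, color mean x3, color variance x3)."""
    # Euclidean distance on the coordinate channels
    l_coord = torch.sqrt(((F1[:, :3] - F2[:, :3]) ** 2).sum(dim=1) + 1e-12).mean()
    # Manhattan distance on the color mean and color variance channels
    l_color = (F1[:, 3:] - F2[:, 3:]).abs().sum(dim=1).mean()
    return alpha * l_coord + beta * l_color
```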
Further, step S5 specifically includes:
S51, taking the original point set P as the target semantic segmentation dataset T; T is a set containing N points; the point set with labeled data in the original point set P is denoted P_L and contains N_L points, the point set of unlabeled data is denoted P_U and contains N_U points, and N_L + N_U = N with N_L much smaller than N_U;
S52, using the encoder and decoder trained and adjusted in step S4, and replacing the MLP that outputs the 9-dimensional vector in step S4 with an MLP of output dimension d; the output d-dimensional vector is denoted F';
S53, denoting the features of F' that correspond to the labeled points as F_L and the features that correspond to the unlabeled points as F_U; then F' = F_L ∪ F_U, where the elements of F_L and F_U are all d-dimensional vectors and indices are used to distinguish different points; T contains n_c categories, with 0 < n_c ≤ C, where C is the actual number of categories into which T is to be semantically segmented;
S54, selecting, from the known labeled data, the points belonging to category j and computing the feature average of these points to obtain the class-average feature vector c_j:
c_j = (1/N_j) * sum_{i : label(i) = j} f_{L,i}
where N_j denotes the number of labeled points whose category is j and f_{L,i} denotes the feature in F_L of a point whose category is j; then, for the input labeled points, an average feature vector c_j is calculated for each category j with 1 ≤ j ≤ n_c; for the remaining categories that are not present, c_j is recorded as the zero vector;
S55, computing the similarity matrix simi between the feature vectors f_{U,s} of the points of the unlabeled data P_U and the class-average vectors c_j:
simi_{j,s} = exp( -||c_j - f_{U,s}||_2 )
where ||c_j - f_{U,s}||_2 is the Euclidean distance between the average feature vector of a category and the vector of an unlabeled point; the first index of simi denotes the category and the second index denotes a point, with 1 ≤ j ≤ C and 1 ≤ s ≤ N_U; exp denotes the exponential with base e, the base of the natural logarithm, applied to the bracketed term; the dimension of simi is therefore C × N_U;
S56, mapping the feature vector F' of step S53 to a vector z as the prediction result, where the dimension of z for each point is C;
For the N_L labeled points, class prediction is realized directly with a Softmax classifier and a cross-entropy loss function, and the loss calculated over these points is denoted L_ce;
For the N_U unlabeled points, pseudo labels are generated first and then compared with z. Specifically: the points with the highest confidence in the similarity matrix simi are first selected category by category (at most T_num points per category); suppose N_sel points are selected in total, with T_num ≤ N_sel ≤ N_U; the category with the highest confidence is then chosen point by point for the selected points, and the maximum confidence of the pseudo label of each of these points and the corresponding label value are updated;
The predictive loss function of the N_U unlabeled points is designed as:
L_u = - sum_{s=1..N_U} 1[s ∈ S] * sum_{m=1..n_c} q_{s,m} * log(p_{s,m})
where the subscript s denotes any one of the N_U unlabeled points of P_U, i.e. s is the index over the unlabeled points; n_c is the number of categories contained in T; m is the category index; p_{s,m} is the probability value of the final predicted label; q_{s,m} encodes the pseudo-label category and takes 1 when the pseudo-label category equals category m and 0 otherwise; and the indicator 1[s ∈ S] takes 1 when the point belongs to the selected set S and 0 otherwise;
S57, the loss function of the whole graph convolution network, L_total, is:
L_total = L_ce + w * L_u
where the weight w is determined by the current training round epoch and the maximum training round max_epoch, and is chosen so that L_u receives a smaller weight at the beginning of training.
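The sketch below illustrates steps S56-S57 under stated assumptions: top-confidence points are picked per category from simi to form pseudo labels, the unlabeled loss is masked to the selected set, and the weight w ramps up linearly with the epoch. The linear ramp and the exact selection rule are reconstructions consistent with the description, not guaranteed to match the patent's formulas.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(logits_U, simi, t_num):
    """logits_U: [N_U, C] network predictions for unlabeled points;
    simi: [C, N_U] similarity matrix; t_num: points kept per category."""
    C, N_U = simi.shape
    selected = torch.zeros(N_U, dtype=torch.bool)
    for j in range(C):                       # keep the most confident points per category
        top = torch.topk(simi[j], min(t_num, N_U)).indices
        selected[top] = True
    pseudo = simi.argmax(dim=0)              # per point, the category with highest confidence
    log_p = F.log_softmax(logits_U, dim=1)
    per_point = -log_p[torch.arange(N_U), pseudo]
    return (per_point * selected.float()).sum() / selected.float().sum().clamp(min=1)

def total_loss(loss_ce, loss_u, epoch, max_epoch):
    w = epoch / max_epoch                    # smaller weight for L_u early in training (assumed ramp)
    return loss_ce + w * loss_u
```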
Further, step S6 specifically includes:
The network is trained by iterating the pseudo-label assignment process of steps S51-S57 until it converges on the target dataset T; in the final prediction process, the trained network is used with the computation of the similarity matrix simi removed, a Softmax classifier is applied to all points of T, and the remaining data of T are then read in and iterated over, so as to realize the semantic segmentation and class prediction of all points of the target dataset T.
After the above technical scheme is adopted, the invention has the following advantages over the background art:
1. The invention adopts the idea of transfer learning and makes full use of the similar characteristics of different urban scenes; the initialization parameters of the graph convolution network are obtained from a public labeled dataset, which helps to improve the stability of the neural network when representing different datasets;
2. The invention adopts a self-learning pre-training task that makes full use of the local and color characteristics of objects in urban scenes; it does not require labeled data and can learn the prior distribution of the data to fine-tune the network parameters;
3. The invention uses semi-supervised learning to reduce the dependence on labeled data, so that high-quality pseudo labels can be generated with only a small amount of labeled data; this improves the effect of semi-supervised learning, realizes the semantic segmentation of the target dataset, and greatly reduces the dependence on manually labeled data.
Drawings
FIG. 1 is a flow chart of pre-training the graph convolution network and fine-tuning its parameters according to the present invention;
FIG. 2 is a flow chart of the training process in which the semi-supervised learning network of the present invention generates pseudo labels.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples
The urban scene semantic segmentation method based on the graph convolution and semi-supervised learning network comprises the following steps:
(I) Obtaining the initialization parameters by pre-training the graph convolution network (realized by the following step S1)
S1, pre-training a graph convolution network with a public, labeled dataset to obtain the initialization parameters of each layer of the graph convolution network;
The encoder of the graph convolution network consists of a multi-layer perceptron layer and four graph convolution modules, numbered 1, 2, 3 and 4 in sequence. The input feature of the l-th graph convolution module is denoted F_in^l and its output feature F_out^l, where F_out^l = F_in^(l+1), i.e. the output of the previous graph convolution module is the input of the next one; the current module receives N_l points with input feature dimension d_l, the feature dimension after convolution is d_(l+1), and the number of points is reduced to N_(l+1).
This embodiment adopts public, labeled urban road data as the pre-training dataset. The public dataset Toronto3D was acquired with a vehicle-mounted laser radar over about 1 km of high-quality urban street scene; more than eight million points are manually labeled, covering 8 common urban scene categories, including highways, zebra crossings, buildings, power lines, power towers, automobiles and fences, and all points contain coordinate and color information. A point cloud P is selected from Toronto3D as the input of the pre-trained graph convolution network; in this embodiment the number of points of P is 65536, so the dimension of P is [65536, 6].
First, the encoder is used to obtain the encoding feature f_E, where an MLP (multi-layer perceptron) maps [65536, 6] to [65536, 16] and the extracted features are then input into the four graph convolution modules; the decoding feature f_D is then obtained through the decoder, and its dimension is [65536, 16]. Next, a fully connected network and a Softmax classifier are used to predict the class of every point in P, and the per-point feature dimension change of this classification head is set to (16, 64, 64, 8). Then, cross-entropy is used as the loss function and stochastic gradient descent as the optimizer to pre-train the network and update the parameters of each layer. Finally, the above process is repeated until the network converges. The convergence condition is set to a fixed 100 training rounds, and training can be stopped if the prediction accuracy does not improve for 20 consecutive rounds. Instead of random initialization, the network of this embodiment is initialized with the pre-trained parameters.
(II) Performing the self-learning training task to fine-tune the parameters (realized by the following steps S2-S4)
S2, inputting an original point set P into the initialized graph convolution network at one time, where the points in P contain only coordinate (xyz) and color (rgb) information, and outputting a feature vector F1;
S3, for the original point set P in step S2, using k-NN to find the k nearest points of each point to form its neighborhood, and computing a feature vector F2 from the neighborhood of each point;
S4, calculating the distance between the feature vectors F1 and F2 as a loss function for adjusting the parameters of the graph convolution network in step S2;
Step S2 specifically includes:
Since step S1 has already initialized the layer parameters of the graph convolution network, step S2 uses only its encoder and decoder; the fully connected network and Softmax classifier of step S1 are replaced with an MLP layer, and the following steps are then performed.
S21, encoding the original point set P with the encoder of the graph convolution network to obtain the encoding feature f_E.
Specifically, for the original point set P input into the graph convolution network at one time, each point contains only the 6-dimensional features of coordinates xyz and colors rgb. P is mapped to 16 dimensions by a multi-layer perceptron, and the output feature serves as the input F_in^1 of the first graph convolution module; the output of the previous graph convolution module is the input of the next one, and after the four graph convolution modules the feature F_out^4 is output, which is the final encoding feature f_E.
S22, decoding the encoding feature f_E with the decoder of the graph convolution network to obtain the decoding feature f_D.
The encoding feature F_out^4 obtained in S21 is mapped by an MLP to a feature D^4 of the same dimension, which serves as the decoder input; each decoder stage decodes the feature D^l through nearest-neighbor up-sampling, an MLP and a skip connection with the encoder to obtain the next output feature. The decoding features carry the superscript l to distinguish them from the encoder features, and l takes 4, 3, 2 and 1 in sequence. The encoder is skip-connected to the decoder, i.e. the encoding feature F_out^l of the same dimension is added to the decoding feature D^l, and the sum is used as the input feature of the subsequent layer. The last decoding feature D^1 is the decoding feature f_D.
S23, mapping the decoding feature f_D through an MLP to output the feature vector F1, where the dimension of each point in F1 is expressed as (r^1_x, r^1_y, r^1_z, r^1_r, r^1_g, r^1_b, r^2_r, r^2_g, r^2_b); the r-terms represent the encoded coordinate and color features respectively, the subscript of r denotes the feature channel, and a superscript 1 denotes a mean while a superscript 2 denotes a variance.
Step S3 specifically includes:
First, for the original point set P in step S2, k-NN is used to find the k nearest points of each point to form a neighborhood.
Specifically, for each point p_i of the original point set P input to the network, k-NN is used to find its set of nearest neighbors, and the coordinate information is then embedded:
e_i^n = LBR(p_i, p_i^n, p_i - p_i^n, ||p_i - p_i^n||)
where the coordinate feature e_i^n is obtained from the point p_i and its neighboring point p_i^n, specifically by concatenating the absolute coordinates of the two points p_i and p_i^n, their offset p_i - p_i^n and their spatial distance ||p_i - p_i^n||; the symbol LBR means that the concatenated feature vector passes through a Linear layer, a BatchNorm layer and a ReLU layer in sequence; in each graph convolution module, e_i^n is mapped to the same dimension as the point set features input to that module.
Then, the relation between the point p_i and its neighboring point p_i^n is expressed as the edge relation E_i^n:
E_i^n = R(g(F_in^l, e_i^n))
where F_in^l is the point set feature input to the l-th graph convolution module and e_i^n is its coordinate feature; after concatenation they are weighted with a learnable weight g, which can be implemented with an MLP, a 1D-CNN and the like; R denotes the ReLU layer. Finally, Max-Pooling is used to aggregate the edge features of each point channel by channel, and random sampling is used to reduce the number of points, which gives the output feature F_out^l.
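To make the graph convolution module concrete, the following PyTorch sketch implements the sequence described above: LBR embedding of relative coordinates, a learnable weighting g realized here as an MLP, a ReLU, channel-wise max-pooling over the k neighbors, and random down-sampling. Layer sizes, the use of nn.Linear for g and all tensor layouts are assumptions; the patent does not fix them.

```python
import torch
import torch.nn as nn

class GraphConvModule(nn.Module):
    """One graph convolution module: edge features over k-NN neighborhoods,
    max-pooling aggregation, then random point down-sampling (a sketch)."""
    def __init__(self, d_in, d_out, k=16, keep_ratio=0.25):
        super().__init__()
        self.k, self.keep_ratio = k, keep_ratio
        self.lbr = nn.Sequential(nn.Linear(10, d_in), nn.BatchNorm1d(d_in), nn.ReLU())
        self.g = nn.Sequential(nn.Linear(2 * d_in, d_out), nn.ReLU())

    def forward(self, xyz, feat):
        # xyz: [N, 3] coordinates, feat: [N, d_in] point features
        dist = torch.cdist(xyz, xyz)                       # [N, N] pairwise distances
        idx = dist.topk(self.k, largest=False).indices     # [N, k] nearest neighbors
        nbr_xyz, nbr_feat = xyz[idx], feat[idx]            # [N, k, 3], [N, k, d_in]
        rel = xyz.unsqueeze(1) - nbr_xyz                   # coordinate offsets
        d = rel.norm(dim=-1, keepdim=True)                 # spatial distances
        geo = torch.cat([xyz.unsqueeze(1).expand_as(nbr_xyz), nbr_xyz, rel, d], dim=-1)
        e = self.lbr(geo.reshape(-1, 10)).reshape(feat.size(0), self.k, -1)
        edge = self.g(torch.cat([nbr_feat, e], dim=-1))    # edge relation per neighbor
        out = edge.max(dim=1).values                       # channel-wise max-pooling
        keep = torch.randperm(out.size(0))[: int(out.size(0) * self.keep_ratio)]
        return xyz[keep], out[keep]                        # randomly down-sampled output
```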
The feature vector F2 is then calculated from the neighborhood of each point, where the dimension of each point is expressed as (mu_x, mu_y, mu_z, mu_r, mu_g, mu_b, sigma^2_r, sigma^2_g, sigma^2_b).
The calculation process is:
mu_coord,i = (1/k) * sum_{n=1..k} coord_{n,i}
mu_color,i = (1/k) * sum_{n=1..k} color_{n,i}
sigma^2_color,i = (1/k) * sum_{n=1..k} (color_{n,i} - mu_color,i)^2
where mu_coord,i denotes the mean of the i-th neighborhood coordinate channel of a point, mu_color,i the mean of the i-th neighborhood color channel, and sigma^2_color,i the variance of the i-th neighborhood color channel; i takes 1, 2, 3, so that the self-learning process uses the same nine feature channels for F1 and F2, each feature channel corresponding to one feature distance to be calculated; n indexes the k neighboring points of each point. Because urban street-scene point clouds are sparsely and unevenly distributed, the neighborhood coordinate variance is large, so the network uses only the coordinate mean. Objects such as vegetation (green) and pavement (gray) have distinct color characteristics and their local color usually changes smoothly, so both the mean and the variance are used for the color channels.
Step S4 specifically includes:
Suppose the original point set P input into the graph convolution network in step S2 contains N points. The coordinate distance is calculated as the Euclidean distance and the color distance as the Manhattan distance.
The loss function of the coordinate distance, L_coord, is:
L_coord = sum_{i=1..N} sqrt( sum_{c in {x,y,z}} (r^1_{i,c} - mu_{i,c})^2 )
The loss function of the color distance, L_color, is:
L_color = sum_{i=1..N} sum_{c in {r,g,b}} ( |r^1_{i,c} - mu_{i,c}| + |r^2_{i,c} - sigma^2_{i,c}| )
Finally, the loss function is:
L_self = alpha * L_coord + beta * L_color
where i indexes the points of the original point set P, and alpha and beta are two hyperparameters of the graph convolution network, set to 1/3 and 2/3 respectively. This loss function is used to train the graph convolution network of step S2 and thereby further adjust the parameters of its encoder and decoder.
Steps S2-S4 realize the fine-tuning of the parameters of the pre-trained graph convolution network. Specifically, in this embodiment:
(1) The pre-trained encoder and decoder are kept first, and the fully connected layer that follows the decoder is changed into a multi-layer perceptron (MLP) whose per-point feature dimension change is set to (16, 32, 9). A point cloud P is constructed from the target semantic segmentation dataset T (constructed in the same way as in the pre-training above, only the dataset changes); this point cloud is passed through the network of FIG. 1 to output the feature F1, whose dimension is [65536, 9].
(2) At the same time, a neighborhood is constructed for each point of P with k-NN, and the number of neighborhood points k is set to 16. The features of one point are calculated as:
mu_coord,i = (1/k) * sum_{n=1..k} coord_{n,i}
mu_color,i = (1/k) * sum_{n=1..k} color_{n,i}
sigma^2_color,i = (1/k) * sum_{n=1..k} (color_{n,i} - mu_color,i)^2
where mu_coord,i denotes the mean of the i-th neighborhood coordinate channel of the point, mu_color,i the mean of the i-th neighborhood color channel, and sigma^2_color,i the variance of the i-th neighborhood color channel; i takes 1, 2, 3, the self-learning process uses the same nine feature channels for F1 and F2, each feature channel corresponding to one feature distance to be calculated, and n indexes the k neighboring points of each point. The features of this point are therefore expressed as (mu_x, mu_y, mu_z, mu_r, mu_g, mu_b, sigma^2_r, sigma^2_g, sigma^2_b), and the features of all points of the constructed point cloud P form F2, whose dimension is [65536, 9].
(3) The distance between F1 and F2 is calculated as the loss for training the network of FIG. 1. The feature distance associated with the coordinates is calculated as the Euclidean distance and the feature distance associated with the colors as the Manhattan distance. The coordinate distance loss function L_coord is:
L_coord = sum_{i=1..N} sqrt( sum_{c in {x,y,z}} (r^1_{i,c} - mu_{i,c})^2 )
The color distance loss function L_color is:
L_color = sum_{i=1..N} sum_{c in {r,g,b}} ( |r^1_{i,c} - mu_{i,c}| + |r^2_{i,c} - sigma^2_{i,c}| )
Finally, the loss function is:
L_self = alpha * L_coord + beta * L_color
where i indexes the points of the original point set P, and alpha and beta are two hyperparameters set to 1/3 and 2/3.
Training is optimized with stochastic gradient descent for a fixed 30 rounds. This pre-training fine-tunes the parameters of the encoder and decoder so that they adapt to the data distribution of the target dataset T.
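Putting the pieces of this subsection together, the following sketch fine-tunes the pre-trained encoder and decoder with the self-supervised loss for 30 rounds of SGD; the model and data-loading helpers are hypothetical names, and the functions `neighborhood_features` and `self_supervised_loss` refer to the sketches given earlier.

```python
import numpy as np
import torch

def finetune(model, point_clouds, rounds=30, lr=0.01, k=16):
    """model: pre-trained encoder + decoder with a (16, 32, 9) MLP head (hypothetical).
    point_clouds yields numpy arrays (xyz [65536, 3], rgb [65536, 3])."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(rounds):
        for xyz, rgb in point_clouds:
            F2 = torch.from_numpy(neighborhood_features(xyz, rgb, k=k)).float()
            pts = torch.from_numpy(np.concatenate([xyz, rgb], axis=1)).float()
            F1 = model(pts.unsqueeze(0)).squeeze(0)     # [65536, 9] network output
            loss = self_supervised_loss(F1, F2)
            opt.zero_grad(); loss.backward(); opt.step()
    return model
```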
(III) Generating pseudo labels and performing semantic segmentation with the semi-supervised learning network (realized by the following steps S5 and S6)
S5, taking the original point set P as the target semantic segmentation dataset T, which contains a small amount of labeled data and a large amount of unlabeled data, where the labeled data accounts for 1%-10% of the points of the original point set P; then, in the semi-supervised learning network, using the labeled data to assign pseudo labels to the unlabeled data;
S6, using the pseudo-labeled dataset T of step S5 for network inference, semantically segmenting it and predicting the category of each point.
Step S5 specifically includes:
S51, taking the original point set P as the target semantic segmentation dataset T; T is a set containing N points; the point set with labeled data in the original point set P is denoted P_L and contains N_L points, the point set of unlabeled data is denoted P_U and contains N_U points, and N_L + N_U = N with N_L much smaller than N_U;
S52, using the encoder and decoder trained and adjusted in step S4, and replacing the MLP that outputs the 9-dimensional vector in step S4 with an MLP of output dimension d; the output d-dimensional vector is denoted F';
S53, denoting the features of F' that correspond to the labeled points as F_L and the features that correspond to the unlabeled points as F_U; then F' = F_L ∪ F_U, where the elements of F_L and F_U are all d-dimensional vectors and indices are used to distinguish different points; T contains n_c categories, with 0 < n_c ≤ C, where C is the actual number of categories into which T is to be semantically segmented;
S54, selecting, from the known labeled data, the points belonging to category j and computing the feature average of these points to obtain the class-average feature vector c_j:
c_j = (1/N_j) * sum_{i : label(i) = j} f_{L,i}
where N_j denotes the number of labeled points whose category is j and f_{L,i} denotes the feature in F_L of a point whose category is j; then, for the input labeled points, an average feature vector c_j is calculated for each category j with 1 ≤ j ≤ n_c; for the remaining categories that are not present, c_j is recorded as the zero vector;
S55, computing the similarity matrix simi between the feature vectors f_{U,s} of the points of the unlabeled data P_U and the class-average vectors c_j:
simi_{j,s} = exp( -||c_j - f_{U,s}||_2 )
where ||c_j - f_{U,s}||_2 is the Euclidean distance between the average feature vector of a category and the vector of an unlabeled point; the first index of simi denotes the category and the second index denotes a point, with 1 ≤ j ≤ C and 1 ≤ s ≤ N_U; exp denotes the exponential with base e, the base of the natural logarithm, applied to the bracketed term; the dimension of simi is therefore C × N_U;
S56, mapping the feature vector F' of step S53 to a vector z as the prediction result, where the dimension of z for each point is C;
For the N_L labeled points, class prediction is realized directly with a Softmax classifier and a cross-entropy loss function, and the loss calculated over these points is denoted L_ce;
For the N_U unlabeled points, pseudo labels are generated first and then compared with z. If points whose pseudo labels have low confidence were used, larger errors would be introduced into the segmentation result. Therefore, the points with the highest confidence in the similarity matrix simi are selected category by category (at most T_num points per category), and for each selected point the category with the highest confidence is chosen.
Specifically: the points with the highest confidence in the similarity matrix simi are first selected category by category; suppose N_sel points are selected in total, with T_num ≤ N_sel ≤ N_U; the category with the highest confidence is then chosen point by point for the selected points, and the maximum confidence of the pseudo label of each of these points and the corresponding label value are updated.
The predictive loss function of the N_U unlabeled points is designed as:
L_u = - sum_{s=1..N_U} 1[s ∈ S] * sum_{m=1..n_c} q_{s,m} * log(p_{s,m})
where the subscript s denotes any one of the N_U unlabeled points of P_U, i.e. s is the index over the unlabeled points; n_c is the number of categories contained in T; m is the category index; p_{s,m} is the probability value of the final predicted label; q_{s,m} encodes the pseudo-label category and takes 1 when the pseudo-label category equals category m and 0 otherwise; and the indicator 1[s ∈ S] takes 1 when the point belongs to the selected set S and 0 otherwise;
S57, the loss function of the whole graph convolution network, L_total, is:
L_total = L_ce + w * L_u
where the weight w is determined by the current training round epoch and the maximum training round max_epoch, and is chosen so that L_u receives a smaller weight at the beginning of training.
Step S6 specifically includes:
The network is trained by iterating the pseudo-label assignment process of steps S51-S57 until it converges on the target dataset T; in the final prediction process, the trained network is used with the computation of the similarity matrix simi removed, a Softmax classifier is applied to all points of T, and the remaining data of T are then read in and iterated over, so as to realize the semantic segmentation and class prediction of all points of the target dataset T.
Specifically, this embodiment modifies the layer following the decoder in FIG. 1 into two MLPs: the per-point feature dimension change of the first MLP is set to (16, 32, 32), and it outputs the feature F', whose dimension is [65536, 32]; the per-point feature dimension change of the second MLP is set to (32, 32, 8), and it outputs the prediction feature z, whose dimension is [65536, 8]. The modified network architecture is shown in FIG. 2.
The target semantic segmentation dataset T needs to contain a small amount of labeled data and a large amount of unlabeled data, and in step S5 the labeled data are used to assign pseudo labels to the unlabeled data. A point cloud P is constructed from the target semantic segmentation dataset T (constructed in the same way as in the preceding self-learning pre-training task), and each point cloud constructed during semi-supervised training must contain 1%-10% of points with labeling information. For example, in one training round the 65536 points contain 4096 labeled points, i.e. the labeling information accounts for 6.25% and covers 5 categories in total. The labeled points P_L are used to calculate the feature average of the labeled points of each category, giving the class-average feature vector c_j:
c_j = (1/N_j) * sum_{i : label(i) = j} f_{L,i}
where f_{L,i} denotes the feature corresponding to a labeled point of P_L whose category is j. Then, for the 4096 labeled points of the input, an average feature vector c_j is calculated for each category j with 1 ≤ j ≤ 5. For the remaining 3 categories that are not present, c_j is recorded as the zero vector.
Next, the similarity matrix simi between the feature vectors f_{U,s} of the unlabeled points of P_U and the class-average vectors c_j is calculated:
simi_{j,s} = exp( -||c_j - f_{U,s}||_2 )
where 1 ≤ j ≤ 8 and 1 ≤ s ≤ N_U, so the dimension of simi is [8, N_U].
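A small sketch of the class-average and similarity computation follows: class-average feature vectors are computed from the labeled features, missing classes keep zero vectors, and simi is built with the exponential of the negative Euclidean distance. Variable names and shapes are illustrative assumptions.

```python
import torch

def class_average_vectors(F_L, labels_L, num_classes):
    """F_L: [N_L, d] labeled features; labels_L: [N_L] integer labels in [0, C).
    Returns c: [C, d]; classes absent from the labeled data stay zero vectors."""
    d = F_L.shape[1]
    c = torch.zeros(num_classes, d)
    for j in range(num_classes):
        mask = labels_L == j
        if mask.any():
            c[j] = F_L[mask].mean(dim=0)
    return c

def similarity_matrix(c, F_U):
    """c: [C, d] class averages; F_U: [N_U, d] unlabeled features.
    simi[j, s] = exp(-||c_j - f_{U,s}||_2), shape [C, N_U]."""
    dist = torch.cdist(c, F_U, p=2)   # Euclidean distances, [C, N_U]
    return torch.exp(-dist)
```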
Then, the feature vector F' is mapped to the vector z as the prediction result, and the dimension of z for each point is 8. The 4096 labeled points directly realize class prediction with a Softmax classifier and a cross-entropy loss function. For the N_U unlabeled points, however, there are no labels, and pseudo labels need to be generated and compared with z. The points with the highest confidence in the similarity matrix simi are first selected category by category (at most T_num points per category), and the category with the highest confidence is then chosen point by point for the selected points. Suppose N_sel points are selected in total, with T_num ≤ N_sel ≤ N_U; the maximum confidence of the pseudo labels of these N_sel points and the corresponding label values are then updated. T_num is set to 50% of the number of points of each category. The number of training rounds is set to 100, and Adam is selected as the optimization method.
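A sketch of one semi-supervised training round under the embodiment's settings (Adam, 100 rounds, T_num tied to 50% of a category's point count) could look like the following; the model interface, data splitting helpers and batch layout are assumptions, and `class_average_vectors`, `similarity_matrix`, `pseudo_label_loss` and `total_loss` refer to the earlier sketches.

```python
import torch
import torch.nn.functional as F

def train_semi(model, xyz_rgb, labels, labeled_mask, num_classes=8,
               max_epoch=100, lr=1e-3):
    """xyz_rgb: [65536, 6]; labels: [65536] (valid only where labeled_mask is True)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(1, max_epoch + 1):
        feats, logits = model(xyz_rgb.unsqueeze(0))          # F' [1, N, 32], z [1, N, 8]
        feats, logits = feats.squeeze(0), logits.squeeze(0)
        F_L, y_L = feats[labeled_mask], labels[labeled_mask]
        F_U, logits_U = feats[~labeled_mask], logits[~labeled_mask]
        loss_ce = F.cross_entropy(logits[labeled_mask], y_L)
        c = class_average_vectors(F_L, y_L, num_classes)
        simi = similarity_matrix(c, F_U)
        # T_num: half the labeled count of the most common category (one possible reading)
        t_num = int(0.5 * torch.bincount(y_L, minlength=num_classes).max().item())
        loss_u = pseudo_label_loss(logits_U, simi, t_num)
        loss = total_loss(loss_ce, loss_u, epoch, max_epoch)
        opt.zero_grad(); loss.backward(); opt.step()
    return model
```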
Finally, the trained network is used with the computation of the similarity matrix simi removed, i.e. the computation from F_U to simi and from simi to the pseudo labels is removed. Then, for the point cloud P input to the network at one time during testing, a Softmax classifier is used to predict the label value of every point; point clouds P are constructed iteratively until all points of T have been read, so that label values are predicted for all points and the semantic segmentation is realized.
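The final inference stage can be sketched as follows; `iter_blocks` (which yields 65536-point blocks until the whole target dataset T has been covered) and the model interface are assumptions.

```python
import torch

@torch.no_grad()
def predict_dataset(model, iter_blocks):
    """Predict a label for every point of the target dataset T, block by block,
    with the similarity-matrix branch removed (only the Softmax head is used)."""
    model.eval()
    all_labels = []
    for xyz_rgb in iter_blocks():                 # [65536, 6] blocks covering T
        _, logits = model(xyz_rgb.unsqueeze(0))   # pseudo-label branch unused at test time
        probs = torch.softmax(logits.squeeze(0), dim=-1)
        all_labels.append(probs.argmax(dim=-1))   # predicted category per point
    return torch.cat(all_labels)
```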
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (6)

1. An urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network, characterized by comprising the following steps:
S1, pre-training a graph convolution network with a public, labeled urban road dataset to obtain the initialization parameters of each layer of the graph convolution network;
S2, inputting an original point set P into the initialized graph convolution network at one time, where the points in P contain only coordinate (xyz) and color (rgb) information, and outputting a feature vector F1 with the graph convolution network;
S3, for the original point set P in step S2, using k-NN to find the k nearest points of each point to form its neighborhood, and computing a feature vector F2 from the neighborhood of each point;
S4, calculating the distance between the feature vectors F1 and F2 as a loss function for adjusting the parameters of the graph convolution network in step S2;
S5, taking the original point set P as the target semantic segmentation dataset T, which contains labeled data and unlabeled data, where the labeled data accounts for 1%-10% of the points of the original point set P; then, in the semi-supervised learning network, using the labeled data to assign pseudo labels to the unlabeled data;
S6, using the pseudo-labeled dataset T of step S5 for network inference, semantically segmenting it and predicting the category of each point.
2. The urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network as set forth in claim 1, wherein step S2 specifically includes:
S21, encoding the original point set P with the encoder of the graph convolution network to obtain an encoding feature f_E;
S22, decoding the encoding feature f_E with the decoder of the graph convolution network to obtain a decoding feature f_D;
S23, mapping the decoding feature f_D through an MLP to output the feature vector F1, where the dimension of each point in F1 is expressed as (r^1_x, r^1_y, r^1_z, r^1_r, r^1_g, r^1_b, r^2_r, r^2_g, r^2_b); the r-terms represent the encoded coordinate and color features respectively, the subscript of r denotes the feature channel, and a superscript 1 denotes a mean while a superscript 2 denotes a variance.
3. The urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network as set forth in claim 2, wherein step S3 specifically includes:
for the original point set P in step S2, k-NN is used to find the k nearest points of each point to form a neighborhood, and the feature vector F2 is computed from the neighborhood of each point, where the dimension of each point is expressed as (mu_x, mu_y, mu_z, mu_r, mu_g, mu_b, sigma^2_r, sigma^2_g, sigma^2_b);
the calculation process is:
mu_coord,i = (1/k) * sum_{n=1..k} coord_{n,i}
mu_color,i = (1/k) * sum_{n=1..k} color_{n,i}
sigma^2_color,i = (1/k) * sum_{n=1..k} (color_{n,i} - mu_color,i)^2
where mu_coord,i denotes the mean of the i-th neighborhood coordinate channel of a point, mu_color,i the mean of the i-th neighborhood color channel, and sigma^2_color,i the variance of the i-th neighborhood color channel; i takes 1, 2, 3, so that the self-learning process uses the same nine feature channels for F1 and F2, each feature channel corresponding to one feature distance to be calculated; n indexes the k neighboring points of each point.
4. The urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network as set forth in claim 3, wherein step S4 specifically includes:
suppose the original point set P input into the graph convolution network in step S2 contains N points; the coordinate distance is calculated as the Euclidean distance and the color distance as the Manhattan distance;
the loss function of the coordinate distance, L_coord, is:
L_coord = sum_{i=1..N} sqrt( sum_{c in {x,y,z}} (r^1_{i,c} - mu_{i,c})^2 )
the loss function of the color distance, L_color, is:
L_color = sum_{i=1..N} sum_{c in {r,g,b}} ( |r^1_{i,c} - mu_{i,c}| + |r^2_{i,c} - sigma^2_{i,c}| )
finally, the loss function is:
L_self = alpha * L_coord + beta * L_color
where i indexes the points of the original point set P, and alpha and beta are two hyperparameters of the graph convolution network, set to 1/3 and 2/3 respectively; the loss function is used to train the graph convolution network of step S2 and thereby further adjust the parameters of its encoder and decoder.
5. The urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network as set forth in claim 4, wherein step S5 specifically includes:
S51, taking the original point set P as the target semantic segmentation dataset T; T is a set containing N points; the point set with labeled data in the original point set P is denoted P_L and contains N_L points, the point set of unlabeled data is denoted P_U and contains N_U points, and N_L + N_U = N with N_L much smaller than N_U;
S52, using the encoder and decoder trained and adjusted in step S4, and replacing the MLP that outputs the 9-dimensional vector in step S4 with an MLP of output dimension d; the output d-dimensional vector is denoted F';
S53, denoting the features of F' that correspond to the labeled points as F_L and the features that correspond to the unlabeled points as F_U; then F' = F_L ∪ F_U, where the elements of F_L and F_U are all d-dimensional vectors and indices are used to distinguish different points; T contains n_c categories, with 0 < n_c ≤ C, where C is the actual number of categories into which T is to be semantically segmented;
S54, selecting, from the known labeled data, the points belonging to category j and computing the feature average of these points to obtain the class-average feature vector c_j:
c_j = (1/N_j) * sum_{i : label(i) = j} f_{L,i}
where N_j denotes the number of labeled points whose category is j and f_{L,i} denotes the feature in F_L of a point whose category is j; then, for the input labeled points, an average feature vector c_j is calculated for each category j with 1 ≤ j ≤ n_c; for the remaining categories that are not present, c_j is recorded as the zero vector;
S55, computing the similarity matrix simi between the feature vectors f_{U,s} of the points of the unlabeled data P_U and the class-average vectors c_j:
simi_{j,s} = exp( -||c_j - f_{U,s}||_2 )
where ||c_j - f_{U,s}||_2 is the Euclidean distance between the average feature vector of a category and the vector of an unlabeled point; the first index of simi denotes the category and the second index denotes a point, with 1 ≤ j ≤ C and 1 ≤ s ≤ N_U; exp denotes the exponential with base e, the base of the natural logarithm, applied to the bracketed term; the dimension of simi is C × N_U;
S56, mapping the feature vector F' of step S53 to a vector z as the prediction result, where the dimension of z for each point is C;
for the N_L labeled points, class prediction is realized directly with a Softmax classifier and a cross-entropy loss function, and the loss calculated over these points is denoted L_ce;
for the N_U unlabeled points, pseudo labels are generated first and then compared with z; specifically: the points with the highest confidence in the similarity matrix simi are first selected category by category (at most T_num points per category); suppose N_sel points are selected in total, with T_num ≤ N_sel ≤ N_U; the category with the highest confidence is then chosen point by point for the selected points, and the maximum confidence of the pseudo label of each of these points and the corresponding label value are updated;
the predictive loss function of the N_U unlabeled points is designed as:
L_u = - sum_{s=1..N_U} 1[s ∈ S] * sum_{m=1..n_c} q_{s,m} * log(p_{s,m})
where the subscript s denotes any one of the N_U unlabeled points of P_U, i.e. s is the index over the unlabeled points; n_c is the number of categories contained in T; m is the category index; p_{s,m} is the probability value of the final predicted label; q_{s,m} encodes the pseudo-label category and takes 1 when the pseudo-label category equals category m and 0 otherwise; and the indicator 1[s ∈ S] takes 1 when the point belongs to the selected set S and 0 otherwise;
S57, the loss function of the whole graph convolution network, L_total, is:
L_total = L_ce + w * L_u
where the weight w is determined by the current training round epoch and the maximum training round max_epoch, and is chosen so that L_u receives a smaller weight at the beginning of training.
6. The urban scene semantic segmentation method based on a graph convolution and semi-supervised learning network as set forth in claim 5, wherein step S6 specifically includes:
the network is trained by iterating the pseudo-label assignment process of steps S51-S57 until it converges on the target dataset T; in the final prediction process, the trained network is used with the computation of the similarity matrix simi removed, a Softmax classifier is applied to all points of T, and the remaining data of T are then read in and iterated over, so as to realize the semantic segmentation and class prediction of all points of the target dataset T.
CN202310596881.7A 2023-05-25 2023-05-25 Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network Active CN116310350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310596881.7A CN116310350B (en) 2023-05-25 2023-05-25 Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310596881.7A CN116310350B (en) 2023-05-25 2023-05-25 Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network

Publications (2)

Publication Number Publication Date
CN116310350A true CN116310350A (en) 2023-06-23
CN116310350B CN116310350B (en) 2023-08-18

Family

ID=86785552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310596881.7A Active CN116310350B (en) 2023-05-25 2023-05-25 Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network

Country Status (1)

Country Link
CN (1) CN116310350B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11450008B1 (en) * 2020-02-27 2022-09-20 Amazon Technologies, Inc. Segmentation using attention-weighted loss and discriminative feature learning
CN112070779A (en) * 2020-08-04 2020-12-11 武汉大学 Remote sensing image road segmentation method based on convolutional neural network weak supervised learning
CN112785611A (en) * 2021-01-29 2021-05-11 昆明理工大学 3D point cloud weak supervision semantic segmentation method and system
CN112861722A (en) * 2021-02-09 2021-05-28 中国科学院地理科学与资源研究所 Remote sensing land utilization semantic segmentation method based on semi-supervised depth map convolution
US20220375187A1 (en) * 2021-07-26 2022-11-24 Beijing Baidu Netcom Science Technology Co., Ltd. Method of performing object segmentation on video using semantic segmentation model, device and storage medium
CN113936217A (en) * 2021-10-25 2022-01-14 华中师范大学 Priori semantic knowledge guided high-resolution remote sensing image weakly supervised building change detection method
CN114187446A (en) * 2021-12-09 2022-03-15 厦门大学 Cross-scene contrast learning weak supervision point cloud semantic segmentation method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863432A (en) * 2023-09-04 2023-10-10 之江实验室 Weak supervision laser travelable region prediction method and system based on deep learning
CN116863432B (en) * 2023-09-04 2023-12-22 之江实验室 Weak supervision laser travelable region prediction method and system based on deep learning
CN117576217A (en) * 2024-01-12 2024-02-20 电子科技大学 Object pose estimation method based on single-instance image reconstruction
CN117576217B (en) * 2024-01-12 2024-03-26 电子科技大学 Object pose estimation method based on single-instance image reconstruction

Also Published As

Publication number Publication date
CN116310350B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN116310350B (en) Urban scene semantic segmentation method based on graph convolution and semi-supervised learning network
CN111612066B (en) Remote sensing image classification method based on depth fusion convolutional neural network
CN110796168A (en) Improved YOLOv 3-based vehicle detection method
CN111914611B (en) Urban green space high-resolution remote sensing monitoring method and system
CN113487066B (en) Long-time-sequence freight volume prediction method based on multi-attribute enhanced graph convolution-Informer model
CN112149547B (en) Remote sensing image water body identification method based on image pyramid guidance and pixel pair matching
CN112507793A (en) Ultra-short-term photovoltaic power prediction method
CN112132149B (en) Semantic segmentation method and device for remote sensing image
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN110853057B (en) Aerial image segmentation method based on global and multi-scale full-convolution network
CN113449594A (en) Multilayer network combined remote sensing image ground semantic segmentation and area calculation method
CN113256649B (en) Remote sensing image station selection and line selection semantic segmentation method based on deep learning
CN115482491B (en) Bridge defect identification method and system based on transformer
CN111967325A (en) Unsupervised cross-domain pedestrian re-identification method based on incremental optimization
CN113591617B (en) Deep learning-based water surface small target detection and classification method
CN112712052A (en) Method for detecting and identifying weak target in airport panoramic video
CN114299286A (en) Road scene semantic segmentation method based on category grouping in abnormal weather
Tian et al. Semantic segmentation of remote sensing image based on GAN and FCN network model
CN117237660A (en) Point cloud data processing and segmentation method based on deep learning feature aggregation
CN117011701A (en) Remote sensing image feature extraction method for hierarchical feature autonomous learning
Yao et al. Cloud Detection in Optical Remote Sensing Images with Deep Semi-supervised and Active Learning
CN111368843A (en) Method for extracting lake on ice based on semantic segmentation
CN115965867A (en) Remote sensing image earth surface coverage classification method based on pseudo label and category dictionary learning
CN114694019A (en) Remote sensing image building migration extraction method based on anomaly detection
Wang et al. Quantitative Evaluation of Plant and Modern Urban Landscape Spatial Scale Based on Multiscale Convolutional Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant