CN109635642A - A kind of road scene dividing method based on residual error network and expansion convolution - Google Patents
A kind of road scene dividing method based on residual error network and expansion convolution
- Publication number
- CN109635642A (application CN201811293377.5A)
- Authority
- CN
- China
- Prior art keywords
- residual block
- layer
- road scene
- semantic segmentation
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06F18/24—Classification techniques
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
- G06V20/38—Outdoor scenes
Abstract
Description
Technical Field
The present invention relates to deep-learning-based semantic segmentation, and in particular to a road scene segmentation method based on a residual network and dilated convolution.
Background Art
Deep learning is a branch of artificial neural network research, and artificial neural networks with deep structures were the earliest deep learning models. Initially, deep learning was applied mainly to images and speech. Since 2006, interest in deep learning has kept growing in academia: deep learning and neural networks are now used very widely in semantic segmentation, computer vision, speech recognition and tracking, and their high efficiency gives them great potential in real-time applications and many other areas.
Convolutional neural networks have been successful in image classification, localization and scene understanding. With the proliferation of tasks such as augmented reality and autonomous driving, many researchers have turned their attention to scene understanding, one of whose main steps is semantic segmentation, i.e., assigning a class to every pixel of a given image. Semantic segmentation is of great importance in mobile and robotics applications.
Semantic segmentation plays an important role in many application scenarios, such as image understanding and autonomous driving, so in recent years it has received extensive attention in both academia and industry. Classical semantic segmentation methods include the fully convolutional network (FCN) and the convolutional neural network SegNet, which perform reasonably well on road scene segmentation databases in terms of pixel accuracy, mean pixel accuracy and mean intersection-over-union. However, one shortcoming of the FCN is that its pooling layers make the response tensor progressively smaller in width and height, while the FCN is meant to produce an output of the same size as the input; the FCN therefore upsamples, but upsampling cannot losslessly recover all of the discarded information. SegNet is a network model built on top of the FCN, and it likewise does not control this information loss well. The information loss degrades the segmentation accuracy of these methods and thus reduces their robustness.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a road scene segmentation method based on a residual network and dilated convolution that has low computational complexity, high segmentation efficiency, high segmentation accuracy and good robustness.
The technical solution adopted by the present invention to solve the above technical problem is a road scene segmentation method based on a residual network and dilated convolution, characterized in that it comprises two stages: a training stage and a testing stage.
The specific steps of the training stage are as follows:
Step 1_1: Select Q original road scene images and the ground-truth semantic segmentation image corresponding to each, and form a training set. Denote the q-th original road scene image in the training set as {I_q(i,j)} and the corresponding ground-truth semantic segmentation image as {G_q(i,j)}. Then use one-hot encoding to process the ground-truth segmentation image of each training image into 12 one-hot encoded images, and denote the set of 12 one-hot images obtained from {G_q(i,j)} as J_q. Here the road scene images are RGB color images; Q is a positive integer with Q ≥ 100; q is a positive integer with 1 ≤ q ≤ Q; 1 ≤ i ≤ W and 1 ≤ j ≤ H, where W is the width and H the height of {I_q(i,j)}; I_q(i,j) is the pixel value at coordinate (i,j) in {I_q(i,j)}; and G_q(i,j) is the pixel value at coordinate (i,j) in {G_q(i,j)}.
Step 1_2: Construct the convolutional neural network. The network comprises an input layer, a hidden layer and an output layer. The hidden layer consists of 10 sequentially arranged Residual blocks, and every convolutional layer in each block is turned into a dilated convolutional layer by setting its dilation rate; the dilation rates of the convolutional layers in the 1st through 10th Residual blocks are, in order, 1, 1, 2, 2, 4, 4, 2, 2, 1 and 1.
For the input layer, its input receives the R, G and B channel components of an original input image, and its output passes these R, G and B channel components to the hidden layer; the original input image received by the input layer is required to have width W and height H.
For the Residual blocks: the input of the 1st Residual block receives the R, G and B channel components output by the input layer, and the input of the k-th Residual block (k = 2, …, 10) receives all the feature maps output by the (k-1)-th Residual block. The numbers of feature maps output by the 1st through 10th Residual blocks are, in order, 32, 32, 64, 64, 128, 128, 64, 64, 32 and 32, and the set of feature maps output by the k-th Residual block is denoted R_k; every feature map in R_1 through R_10 has width W and height H.
The output layer consists of a single convolutional layer; its input receives all the feature maps in R_10, and its output produces 12 semantic segmentation prediction maps corresponding to the original input image, each of width W and height H.
Step 1_3: Use each original road scene image in the training set as an original input image and feed it into the convolutional neural network for training, obtaining the 12 semantic segmentation prediction maps corresponding to each training image; denote the set of 12 prediction maps corresponding to {I_q(i,j)} as P_q.
Step 1_4: For each original road scene image in the training set, compute the loss function value between the set of 12 semantic segmentation prediction maps and the set of 12 one-hot encoded images obtained from the corresponding ground-truth segmentation image; the loss between P_q and J_q is denoted L_q.
Step 1_5: Repeat steps 1_3 and 1_4 a total of V times, obtaining a trained convolutional neural network classification model and Q×V loss function values in total. Then find the smallest of these Q×V loss values, and take the weight vector and bias term corresponding to that smallest loss as the optimal weight vector and optimal bias term of the model, denoted W_best and b_best; here V > 1.
The specific steps of the testing stage are as follows:
Step 2_1: Let {I'(i',j')} denote a road scene image to be semantically segmented, where 1 ≤ i' ≤ W' and 1 ≤ j' ≤ H', W' is the width and H' the height of {I'(i',j')}, and I'(i',j') is the pixel value at coordinate (i',j') in {I'(i',j')}.
Step 2_2: Input the R, G and B channel components of {I'(i',j')} into the trained convolutional neural network classification model and predict using W_best and b_best, obtaining the predicted semantic segmentation image corresponding to {I'(i',j')}, denoted {S'(i',j')}, where S'(i',j') is the pixel value at coordinate (i',j') in {S'(i',j')}.
In step 1_4, L_q is obtained using categorical cross-entropy.
Compared with the prior art, the present invention has the following advantages:
1) In constructing the convolutional neural network, the method of the present invention adopts the Residual block of ResNet (residual network) with its embedded identity shortcut connection; the hidden layer of the network is formed by stacking 10 Residual blocks. The Residual blocks increase the capacity to extract feature information and fully exploit the structural efficiency of the basic residual module, thereby improving the prediction accuracy of the trained convolutional neural network classification model.
2) The hidden layer of the convolutional neural network constructed by the method of the present invention uses only 10 Residual blocks, which greatly reduces the cost of redundancy and large data volume and therefore keeps the computational complexity low. Each Residual block uses convolutional layers turned into dilated convolutional layers by setting a dilation rate; dilated convolution avoids the information lost in size-changing operations and enlarges the receptive field while keeping the resolution of the feature maps unchanged, preserving the effective deep information to a great extent, so that the semantic segmentation prediction maps obtained in the training stage and the predicted semantic segmentation images obtained in the testing stage have high resolution, accurate boundaries and good spatial continuity.
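As a point of reference for advantage 2), the footprint of a dilated kernel follows standard dilated-convolution arithmetic (a general fact, not a formula printed in the source): a k×k kernel with dilation rate d spans

```latex
k_{\mathrm{eff}} = k + (k-1)(d-1)
```

pixels per side, so 3×3 kernels cover 3×3, 5×5 and 9×9 regions at dilation rates 1, 2 and 4, which is how the receptive field grows without pooling and without any loss of feature-map resolution.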
3) The Residual blocks adopted by the method of the present invention not only greatly strengthen feature extraction but also prevent the model from overfitting, give the method strong robustness, and greatly improve segmentation efficiency.
Description of the Drawings
Fig. 1 is the overall implementation block diagram of the method of the present invention;
Fig. 2 is a schematic diagram of the structure of the convolutional neural network constructed by the method of the present invention;
Fig. 3a is a selected road scene image to be semantically segmented;
Fig. 3b is the ground-truth semantic segmentation image corresponding to the road scene image shown in Fig. 3a;
Fig. 3c is the predicted semantic segmentation image obtained by applying the method of the present invention to the road scene image shown in Fig. 3a;
Fig. 4a is another selected road scene image to be semantically segmented;
Fig. 4b is the ground-truth semantic segmentation image corresponding to the road scene image shown in Fig. 4a;
Fig. 4c is the predicted semantic segmentation image obtained by applying the method of the present invention to the road scene image shown in Fig. 4a.
Detailed Description
The present invention is described in further detail below with reference to the embodiments shown in the accompanying drawings.
The overall implementation block diagram of the road scene segmentation method based on a residual network and dilated convolution proposed by the present invention is shown in Fig. 1; the method comprises a training stage and a testing stage.
The specific steps of the training stage are as follows:
Step 1_1: Select Q original road scene images and the ground-truth semantic segmentation image corresponding to each, and form a training set. Denote the q-th original road scene image in the training set as {I_q(i,j)} and the corresponding ground-truth semantic segmentation image as {G_q(i,j)}. Then use the existing one-hot encoding technique to process the ground-truth segmentation image of each training image into 12 one-hot encoded images, and denote the set of 12 one-hot images obtained from {G_q(i,j)} as J_q. Here the road scene images are RGB color images; Q is a positive integer with Q ≥ 100, e.g. Q = 100; q is a positive integer with 1 ≤ q ≤ Q; 1 ≤ i ≤ W and 1 ≤ j ≤ H, where W is the width and H the height of {I_q(i,j)}, e.g. W = 352 and H = 480; I_q(i,j) is the pixel value at coordinate (i,j) in {I_q(i,j)}; and G_q(i,j) is the pixel value at coordinate (i,j) in {G_q(i,j)}. In this embodiment, the original road scene images are the 100 images of the training set of the CamVid road scene image database.
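By way of illustration only, the one-hot step can be sketched as follows, assuming each ground-truth image stores one of the 12 class indices per pixel; the function name and layout are illustrative, not taken from the patent:

```python
import numpy as np

def to_one_hot(label_map, num_classes=12):
    """Turn an H x W map of class indices into num_classes binary images."""
    h, w = label_map.shape
    one_hot = np.zeros((h, w, num_classes), dtype=np.float32)
    for c in range(num_classes):
        # the c-th one-hot encoded image: 1 where the pixel belongs to class c
        one_hot[:, :, c] = (label_map == c).astype(np.float32)
    return one_hot
```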
Step 1_2: Construct the convolutional neural network. As shown in Fig. 2, the network comprises an input layer, a hidden layer and an output layer. The hidden layer consists of 10 sequentially arranged Residual blocks (residual network blocks), and every convolutional layer in each block is turned into a dilated convolutional layer by setting its dilation rate; the dilation rates of the convolutional layers in the 1st through 10th Residual blocks are, in order, 1, 1, 2, 2, 4, 4, 2, 2, 1 and 1, and the convolution kernel size of the dilated convolutional layers in all 10 Residual blocks is kept at 3×3.
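A minimal Keras sketch of one such dilated Residual block is given below. The patent does not publish the number of convolutional layers per block or how the shortcut handles a change in channel count, so the two 3×3 convolutions and the 1×1 projection on the shortcut are assumptions of this sketch:

```python
from keras import backend as K
from keras.layers import Conv2D, Add, Activation

def residual_block(x, filters, dilation_rate):
    """One Residual block with dilated convolutions; preserves width and height."""
    shortcut = x
    y = Conv2D(filters, (3, 3), padding='same',
               dilation_rate=dilation_rate, activation='relu')(x)
    y = Conv2D(filters, (3, 3), padding='same',
               dilation_rate=dilation_rate)(y)
    if K.int_shape(shortcut)[-1] != filters:
        # project the shortcut with a 1x1 convolution when channel counts differ
        shortcut = Conv2D(filters, (1, 1), padding='same')(shortcut)
    y = Add()([y, shortcut])   # the embedded identity shortcut connection
    return Activation('relu')(y)
```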
For the input layer, its input receives the R, G and B channel components of an original input image, and its output passes these R, G and B channel components to the hidden layer; the original input image received by the input layer is required to have width W and height H.

For the Residual blocks: the input of the 1st Residual block receives the R, G and B channel components output by the input layer, and the input of the k-th Residual block (k = 2, …, 10) receives all the feature maps output by the (k-1)-th Residual block. The numbers of feature maps output by the 1st through 10th Residual blocks are, in order, 32, 32, 64, 64, 128, 128, 64, 64, 32 and 32, and the set of feature maps output by the k-th Residual block is denoted R_k; every feature map in R_1 through R_10 has width W and height H.

The output layer consists of a single convolutional layer; its input receives all the feature maps in R_10, and its output produces 12 semantic segmentation prediction maps corresponding to the original input image, each of width W and height H.
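Assembling the residual_block above according to this schedule gives the following sketch; the 1×1 kernel and softmax activation of the output convolution are assumptions, since the patent only states that the output layer is a single convolutional layer producing 12 maps:

```python
from keras.layers import Input, Conv2D
from keras.models import Model

def build_network(height=480, width=352, num_classes=12):
    inputs = Input(shape=(height, width, 3))     # R, G and B channel components
    filters   = [32, 32, 64, 64, 128, 128, 64, 64, 32, 32]
    dilations = [1, 1, 2, 2, 4, 4, 2, 2, 1, 1]
    x = inputs
    for f, d in zip(filters, dilations):         # the 10 Residual blocks
        x = residual_block(x, f, d)
    # output layer: a single convolutional layer giving 12 W x H prediction maps
    outputs = Conv2D(num_classes, (1, 1), padding='same',
                     activation='softmax')(x)
    return Model(inputs, outputs)
```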
Step 1_3: Use each original road scene image in the training set as an original input image and feed it into the convolutional neural network for training, obtaining the 12 semantic segmentation prediction maps corresponding to each training image; denote the set of 12 prediction maps corresponding to {I_q(i,j)} as P_q.
Step 1_4: For each original road scene image in the training set, compute the loss function value between the set of 12 semantic segmentation prediction maps and the set of 12 one-hot encoded images obtained from the corresponding ground-truth segmentation image; the loss between P_q and J_q, denoted L_q, is obtained using categorical cross-entropy.
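For reference, categorical cross-entropy over the 12 maps takes the standard form below, where y_{q,c}(i,j) is the c-th one-hot image in J_q and p_{q,c}(i,j) the c-th prediction map in P_q; the patent names the loss but does not print the formula, so this notation is added for exposition:

```latex
L_q = -\frac{1}{W H} \sum_{i=1}^{W} \sum_{j=1}^{H} \sum_{c=1}^{12}
      y_{q,c}(i,j)\,\log p_{q,c}(i,j)
```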
Step 1_5: Repeat steps 1_3 and 1_4 a total of V times, obtaining a trained convolutional neural network classification model and Q×V loss function values in total. Then find the smallest of these Q×V loss values, and take the weight vector and bias term corresponding to that smallest loss as the optimal weight vector and optimal bias term of the model, denoted W_best and b_best; here V > 1, and V = 300 in this embodiment.
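A minimal training sketch under these settings follows. The optimizer, batch size, checkpoint file name and the load_camvid_training_set() loader are hypothetical; ModelCheckpoint with save_best_only retains the weights of the lowest-loss pass, i.e. W_best and b_best:

```python
from keras.callbacks import ModelCheckpoint

model = build_network()
model.compile(optimizer='adam', loss='categorical_crossentropy')

# X: (Q, 480, 352, 3) training images; Y: (Q, 480, 352, 12) one-hot label sets
X, Y = load_camvid_training_set()   # hypothetical loader for the 100 CamVid images

checkpoint = ModelCheckpoint('best_weights.h5', monitor='loss',
                             save_best_only=True, save_weights_only=True)
model.fit(X, Y, batch_size=4, epochs=300, callbacks=[checkpoint])   # V = 300
```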
The specific steps of the testing stage are as follows:
Step 2_1: Let {I'(i',j')} denote a road scene image to be semantically segmented, where 1 ≤ i' ≤ W' and 1 ≤ j' ≤ H', W' is the width and H' the height of {I'(i',j')}, and I'(i',j') is the pixel value at coordinate (i',j') in {I'(i',j')}.
Step 2_2: Input the R, G and B channel components of {I'(i',j')} into the trained convolutional neural network classification model and predict using W_best and b_best, obtaining the predicted semantic segmentation image corresponding to {I'(i',j')}, denoted {S'(i',j')}, where S'(i',j') is the pixel value at coordinate (i',j') in {S'(i',j')}.
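Continuing the sketch above, the test stage reduces to a forward pass followed by an argmax over the 12 prediction maps; test_image is a placeholder for one H'×W'×3 RGB image:

```python
import numpy as np

model.load_weights('best_weights.h5')                 # W_best and b_best
probs = model.predict(test_image[np.newaxis, ...])    # shape (1, H', W', 12)
pred_label_map = np.argmax(probs[0], axis=-1)         # predicted class per pixel
```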
To further verify the feasibility and effectiveness of the method of the present invention, experiments were carried out.
The architecture of the convolutional neural network was built with the Python-based deep learning library Keras 2.1.5. The test set of the CamVid road scene image database was used to analyze how well the method of the present invention segments road scene images. Three objective parameters commonly used to evaluate semantic segmentation methods served as evaluation indices: pixel accuracy (PA), mean pixel accuracy (MPA) and mean intersection-over-union (MIoU).
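The three indices can be computed from a class confusion matrix as in the sketch below; the patent cites them without spelling out the definitions, so these are the usual formulas:

```python
import numpy as np

def segmentation_scores(pred, gt, num_classes=12):
    """PA, MPA and MIoU of a predicted label map against the ground truth."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for g, p in zip(gt.ravel(), pred.ravel()):
        conf[g, p] += 1               # rows: ground truth, columns: prediction
    tp = np.diag(conf).astype(np.float64)
    pa = tp.sum() / conf.sum()                              # pixel accuracy
    mpa = np.mean(tp / np.maximum(conf.sum(axis=1), 1))     # mean pixel accuracy
    union = conf.sum(axis=1) + conf.sum(axis=0) - tp
    miou = np.mean(tp / np.maximum(union, 1))               # mean IoU
    return pa, mpa, miou
```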
The method of the present invention was used to predict each road scene image in the CamVid test set, yielding the predicted semantic segmentation image corresponding to each road scene image. The pixel accuracy PA, mean pixel accuracy MPA and mean intersection-over-union MIoU reflecting the segmentation quality of the method are listed in Table 1; the higher these values, the higher the effectiveness and prediction accuracy. The data in Table 1 show that the segmentation results obtained with the method of the present invention are good, indicating that using the method to obtain the predicted semantic segmentation images of road scene images is feasible and effective.
Table 1 Evaluation results of the method of the present invention on the test set
Fig. 3a shows a selected road scene image to be semantically segmented; Fig. 3b shows its ground-truth semantic segmentation image; Fig. 3c shows the predicted semantic segmentation image obtained by applying the method of the present invention to Fig. 3a. Fig. 4a shows another selected road scene image to be semantically segmented; Fig. 4b shows its ground-truth semantic segmentation image; Fig. 4c shows the predicted semantic segmentation image obtained by applying the method of the present invention to Fig. 4a. Comparing Fig. 3b with Fig. 3c, and Fig. 4b with Fig. 4c, it can be seen that the predicted semantic segmentation images obtained with the method of the present invention have high segmentation accuracy, close to the ground-truth segmentation images.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811293377.5A | 2018-11-01 | 2018-11-01 | A kind of road scene dividing method based on residual error network and expansion convolution |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109635642A | 2019-04-16 |
Family
ID=66067090
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811293377.5A Withdrawn CN109635642A (en) | 2018-11-01 | 2018-11-01 | A kind of road scene dividing method based on residual error network and expansion convolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635642A (en) |
- 2018-11-01: Application CN201811293377.5A filed; published as CN109635642A (en); status: not active (Withdrawn)
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232721A (en) * | 2019-05-16 | 2019-09-13 | 福建自贸试验区厦门片区Manteia数据科技有限公司 | A kind of crisis organ delineates the training method and device of model automatically |
CN110287798A (en) * | 2019-05-27 | 2019-09-27 | 魏运 | Vector network pedestrian detection method based on characteristic module and context fusion |
CN110276316B (en) * | 2019-06-26 | 2022-05-24 | 电子科技大学 | A human keypoint detection method based on deep learning |
CN110276316A (en) * | 2019-06-26 | 2019-09-24 | 电子科技大学 | A human key point detection method based on deep learning |
CN110287932A (en) * | 2019-07-02 | 2019-09-27 | 中国科学院遥感与数字地球研究所 | Road Blocking Information Extraction Based on Deep Learning Image Semantic Segmentation |
CN110490082A (en) * | 2019-07-23 | 2019-11-22 | 浙江科技学院 | A kind of road scene semantic segmentation method of effective integration neural network characteristics |
CN110490082B (en) * | 2019-07-23 | 2022-04-05 | 浙江科技学院 | Road scene semantic segmentation method capable of effectively fusing neural network features |
CN110728682A (en) * | 2019-09-09 | 2020-01-24 | 浙江科技学院 | A Semantic Segmentation Method Based on Residual Pyramid Pooling Neural Network |
CN110728682B (en) * | 2019-09-09 | 2022-03-29 | 浙江科技学院 | Semantic segmentation method based on residual pyramid pooling neural network |
CN110782458A (en) * | 2019-10-23 | 2020-02-11 | 浙江科技学院 | Object image 3D semantic prediction segmentation method of asymmetric coding network |
CN110782458B (en) * | 2019-10-23 | 2022-05-31 | 浙江科技学院 | Object image 3D semantic prediction segmentation method of asymmetric coding network |
CN110782462A (en) * | 2019-10-30 | 2020-02-11 | 浙江科技学院 | Semantic segmentation method based on double-flow feature fusion |
CN110782462B (en) * | 2019-10-30 | 2022-08-09 | 浙江科技学院 | Semantic segmentation method based on double-flow feature fusion |
CN110991415A (en) * | 2019-12-21 | 2020-04-10 | 武汉中海庭数据技术有限公司 | Structural target high-precision segmentation method, electronic equipment and storage medium |
CN111401436A (en) * | 2020-03-13 | 2020-07-10 | 北京工商大学 | A Street View Image Segmentation Method Fusion Network and Two-Channel Attention Mechanism |
CN111401436B (en) * | 2020-03-13 | 2023-04-18 | 中国科学院地理科学与资源研究所 | Streetscape image segmentation method fusing network and two-channel attention mechanism |
CN111507990B (en) * | 2020-04-20 | 2022-02-11 | 南京航空航天大学 | Tunnel surface defect segmentation method based on deep learning |
CN111507990A (en) * | 2020-04-20 | 2020-08-07 | 南京航空航天大学 | Tunnel surface defect segmentation method based on deep learning |
CN112529064A (en) * | 2020-12-03 | 2021-03-19 | 燕山大学 | Efficient real-time semantic segmentation method |
Similar Documents
Publication | Publication Date | Title
---|---|---
CN109635642A (en) | | A kind of road scene dividing method based on residual error network and expansion convolution
CN110782462B (en) | 2022-08-09 | Semantic segmentation method based on double-flow feature fusion
CN111062951B | | A Knowledge Distillation Method Based on Intra-Class Feature Difference for Semantic Segmentation
CN110781924B | | Side-scan sonar image feature extraction method based on full convolution neural network
CN112396607B | | A Deformable Convolution Fusion Enhanced Semantic Segmentation Method for Street View Images
CN111401436B | | Streetscape image segmentation method fusing network and two-channel attention mechanism
CN110728682B | | Semantic segmentation method based on residual pyramid pooling neural network
CN110929736B | | Multi-feature cascading RGB-D significance target detection method
CN114943963B | | Remote sensing image cloud and cloud shadow segmentation method based on double-branch fusion network
CN109635662B | | Road scene semantic segmentation method based on convolutional neural network
CN110490205B | | Road scene semantic segmentation method based on full-residual-error hole convolutional neural network
CN109146944B | | Visual depth estimation method based on depth separable convolutional neural network
CN110853057B | | Aerial image segmentation method based on global and multi-scale fully convolutional network
CN110490082A | | A kind of road scene semantic segmentation method of effective integration neural network characteristics
CN113269787A | | Remote sensing image semantic segmentation method based on gating fusion
CN112489164B | | Image coloring method based on improved depth separable convolutional neural network
CN117237559B | | Digital twin city-oriented three-dimensional model data intelligent analysis method and system
CN113192073A | | Clothing semantic segmentation method based on cross fusion network
CN109461177B | | Monocular image depth prediction method based on neural network
CN113269224A | | Scene image classification method, system and storage medium
CN109508639B | | Road scene semantic segmentation method based on multi-scale porous convolutional neural network
CN109446933B | | Road scene semantic segmentation method based on convolutional neural network
CN112365511A | | Point cloud segmentation method based on overlapped region retrieval and alignment
CN109448039B | | Monocular vision depth estimation method based on deep convolutional neural network
CN117789046A | | Remote sensing image change detection method, device, electronic equipment and storage medium
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20190416