CN113033570B - Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion - Google Patents

Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion

Info

Publication number
CN113033570B
CN113033570B CN202110344461.0A
Authority
CN
China
Prior art keywords
image
convolution
feature
output
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110344461.0A
Other languages
Chinese (zh)
Other versions
CN113033570A (en)
Inventor
高世伟
张长柱
张皓
王祝萍
黄超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202110344461.0A priority Critical patent/CN113033570B/en
Publication of CN113033570A publication Critical patent/CN113033570A/en
Application granted granted Critical
Publication of CN113033570B publication Critical patent/CN113033570B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features

Abstract

The invention relates to an image semantic segmentation method with improved dilated convolution and multi-level feature information fusion, comprising the following steps: extracting image features in a deep convolutional neural network by using an improved dilated convolution method; cascading and fusing the extracted deep feature images with shallow feature images to compensate for the loss of spatial information; learning boundary information of the multi-stage processed feature images through boundary refinement, fusing them, and restoring the original image resolution to generate a predicted segmentation map; and training the network with a cross-entropy loss function and evaluating model performance with mIoU. The invention improves on the existing way of using dilated convolution and designs a deformable spatial pyramid structure, thereby improving the image feature extraction of the model. Meanwhile, a multi-level feature information fusion structure is designed for image resolution recovery, the local and global information contained at different levels is fully exploited, and boundary refinement is introduced, effectively improving the accuracy of image semantic segmentation.

Description

Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion
Technical Field
The invention relates to the field of computer vision and pattern recognition intelligent systems, and in particular to an image semantic segmentation method with improved dilated convolution and multi-level feature information fusion.
Background
Automated scene understanding is an important goal in the field of modern computer vision. Image semantic segmentation is a fundamental scene understanding task in computer vision: it takes raw data (e.g., a flat image) as input and converts it into a mask with highlighted regions of interest, dividing the image into multiple regions carrying different semantic information. In recent years, owing to the excellent performance of deep convolutional neural networks on the semantic segmentation task, segmentation quality has improved markedly over traditional methods such as GrabCut and N-Cut. Good segmentation algorithms are crucial for many practical applications, such as autonomous driving, medical image processing, computational photography, image search engines, and augmented reality. These applications all require very accurate pixel predictions.
However, in current semantic segmentation methods based on deep convolutional neural networks, repeated pooling and downsampling reduce image resolution and discard global context information, so the segmentation result cannot reach high pixel-classification accuracy.
Disclosure of Invention
The invention aims to provide an image semantic segmentation method with improved dilated convolution and multi-level feature information fusion, which can effectively improve the information utilization and effectiveness of feature extraction, enrich shallow semantic information, learn the global context information of an image, and improve the accuracy of semantic segmentation on two-dimensional images.
The structure based on the improved dilated convolution method and multi-level feature information fusion improves the image segmentation effect without significantly increasing the computational load of the system. Compared with simply stacking convolutional networks, the method designs a more suitable structure for image feature extraction and spatial information compensation, reduces the loss of feature information during downsampling, effectively improves pixel prediction accuracy, and enhances the image semantic segmentation effect.
To achieve the above aim, the invention adopts the following technical scheme:
An image semantic segmentation method with improved dilated convolution and multi-level feature information fusion comprises the following steps:
S1: extracting image features in a deep convolutional neural network by using an improved dilated convolution method;
S2: cascading and fusing the extracted deep feature images with shallow feature images to compensate for the loss of spatial information;
S3: learning boundary information of the multi-stage processed feature images through boundary refinement, fusing them, and restoring the original image resolution to generate a predicted segmentation map;
S4: training the network with a cross-entropy loss function and evaluating model performance with mIoU.
The specific implementation of S1 comprises the following steps:
S1.1: ResNet-101 is used as the backbone network, and an improved skip-connected dilated convolution module is attached after the third downsampling module; the module comprises three dilated convolution layers whose dilation rates are chosen according to the resolution of the input image, and skip connections are established between different convolution layers in the forward direction, which further enlarges the receptive field without further shrinking the image and reduces information loss;
S1.2: the image processed by the skip-connected dilated convolution module is input into an improved deformable spatial pyramid pooling module, which combines the ability of deformable convolution to adapt the receptive field to target scale changes and flexibly aggregate information with the ability of multi-scale dilated convolution on a regular sampling grid to classify arbitrary regions of the image effectively, improving the model's capacity to learn target deformation at low cost in model complexity;
S1.3: the feature information of different levels contained in feature images of different resolutions at different stages of the downsampling process is retained.
The specific implementation of S2 comprises the following steps:
S2.1: the feature layer processed by the skip-connected dilated convolution module is passed through a 1 × 1 convolution and combined with the feature image extracted at the deepest layer to compensate for the semantic information of the shallow feature image; the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output;
S2.2: the feature map output in S2.1 is combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature map is passed through a 1 × 1 convolution and taken as this layer's output;
S2.3: the feature image output in S2.2 is upsampled by a factor of two via bilinear interpolation, combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output;
S2.4: the feature image output in S2.3 is upsampled by a factor of two via bilinear interpolation, combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output.
The specific implementation of S3 comprises the following steps:
S3.1: the deepest output feature image is upsampled by a factor of four via bilinear interpolation;
S3.2: the layer output feature image from S2.1 is upsampled by a factor of four via bilinear interpolation;
S3.3: the layer output feature image from S2.2 is upsampled by a factor of four via bilinear interpolation;
S3.4: the layer output feature image from S2.3 is upsampled by a factor of two via bilinear interpolation;
S3.5: the boundaries of the output feature images from S2.4, S3.1, S3.2, S3.3, and S3.4 are refined by a BR module; the refined images are fused and then processed by a 3 × 3 convolution followed by four-times bilinear upsampling, restoring the original image resolution and yielding the final predicted segmentation map.
The specific implementation of S4 comprises the following steps:
S4.1: the cross-entropy loss between the predicted segmentation map and the ground-truth segmentation map in the data set is computed, the parameter weights in the model are updated with the backpropagation algorithm, and the final semantic segmentation model is obtained by training on the training set;
S4.2: the prediction performance of the model is evaluated with the mIoU metric on the test set of the data set.
Owing to the above technical scheme, the invention has the following advantages and effects compared with the prior art:
The method fully considers the benefits and drawbacks of dilated convolution for semantic segmentation, improves on the existing way of using dilated convolution, and designs a deformable spatial pyramid structure, which improves the image feature extraction of the model. Meanwhile, compared with common upsampling methods, a multi-level feature information fusion structure is designed for recovering image resolution; the local and global information contained at different levels is fully exploited, boundary refinement is introduced, and the accuracy of image semantic segmentation is effectively improved.
Drawings
FIG. 1 is a flow chart of the overall semantic segmentation method proposed by the present invention;
FIG. 2 is a network model diagram of the overall semantic segmentation algorithm proposed by the present invention;
FIG. 3 is a structural diagram of the skip-connected dilated convolution module in the network architecture of the present invention;
FIG. 4 shows visualization results of the algorithm of the present invention on the Cityscapes data set.
Detailed Description
The invention is further described below with reference to the accompanying drawings and embodiments:
An image semantic segmentation method with improved dilated convolution and multi-level feature information fusion, as shown in FIG. 1, comprises the following steps:
S1: image features are extracted in a deep convolutional neural network using the improved dilated convolution method, as shown inside the dashed box "S1" in FIG. 2:
S1.1: first, ResNet-101 is used as the backbone network, and the improved skip-connected dilated convolution module is attached after the third downsampling module, where "Conv" stands for "Convolution" and denotes a convolution layer. FIG. 3 shows the specific structure of the module, which comprises three consecutive dilated convolution layers; the dilation rate of each convolution layer is chosen according to the resolution of the input image (the rates of the three layers in FIG. 3 are 2, 4, and 8 in turn), and skip connections are established between different convolution layers in the forward direction, which further enlarges the receptive field without further shrinking the image and reduces information loss. A sketch of this block follows;
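A minimal PyTorch sketch of the skip-connected dilated convolution block is given below (the patent's experiments use TensorFlow; PyTorch is used here only for brevity). The channel width, the batch normalization, and the element-wise additive merge of the skip connections are assumptions — the patent specifies only three consecutive dilated convolutions with rates 2, 4, and 8 joined by forward skip connections.

```python
import torch
import torch.nn as nn

class SkipDilatedBlock(nn.Module):
    """Three consecutive dilated 3x3 convolutions (rates 2, 4, 8) joined
    by forward skip connections, per S1.1. Channel width, BatchNorm, and
    the additive merge are assumptions not fixed by the patent."""
    def __init__(self, channels=1024, rates=(2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r,
                          bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])

    def forward(self, x):
        out = x
        for layer in self.layers:
            # forward skip: each layer's input is carried past it by addition
            out = layer(out) + out
        return out

# usage: a feature map such as ResNet-101 would emit after its third
# downsampling stage (the shape here is illustrative)
feats = torch.randn(1, 1024, 64, 64)
print(SkipDilatedBlock()(feats).shape)  # torch.Size([1, 1024, 64, 64])
```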
S1.2: the image processed by the skip-connected dilated convolution module is input into the improved deformable spatial pyramid pooling module, which in FIG. 2 consists of three dilated convolution layers, one deformable convolution layer, and a max pooling layer; the ability of deformable convolution to adapt the receptive field to target scale changes and flexibly aggregate information is combined with the ability of multi-scale dilated convolution on a regular sampling grid to classify arbitrary regions of the image effectively, improving the model's capacity to learn target deformation at low cost in model complexity. A sketch of this module follows;
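The deformable spatial pyramid pooling module might be sketched as follows, using torchvision's DeformConv2d. The dilation rates (6, 12, 18), the channel widths, and the use of global max pooling with bilinear re-expansion for the pooling branch are assumptions; the patent fixes only the branch types (three dilated convolutions, one deformable convolution, one max pooling layer).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class DeformableSPP(nn.Module):
    """Deformable spatial pyramid pooling per S1.2: three dilated 3x3
    branches, one deformable 3x3 branch, and a max-pooling branch,
    concatenated and projected by a 1x1 convolution. Rates, widths, and
    the global-pooling design are assumptions, not taken from the patent."""
    def __init__(self, in_ch=1024, out_ch=256, rates=(6, 12, 18)):
        super().__init__()
        self.dilated = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r)
            for r in rates
        ])
        # sampling offsets for the deformable branch are predicted from x
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, 3, padding=1)
        self.deform = DeformConv2d(in_ch, out_ch, 3, padding=1)
        self.pool_proj = nn.Conv2d(in_ch, out_ch, 1)
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        branches = [conv(x) for conv in self.dilated]
        branches.append(self.deform(x, self.offset(x)))
        # image-level max pooling, re-expanded to the feature map size
        pooled = self.pool_proj(F.adaptive_max_pool2d(x, 1))
        branches.append(F.interpolate(pooled, size=(h, w),
                                      mode='bilinear', align_corners=False))
        return self.project(torch.cat(branches, dim=1))
```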
S1.3: the feature information of different levels contained in feature images of different resolutions at different stages of the downsampling process is retained.
S2: the extracted deep feature images are cascaded and fused with the shallow feature images to compensate for the loss of spatial information, as shown inside the dashed box "S2" in FIG. 2;
S2.1: as shown in FIG. 2, the feature layer processed by the skip-connected dilated convolution module is passed through a 1 × 1 convolution and combined with the feature image extracted at the deepest layer of the network model, where "C" in the figure denotes concatenation, i.e., the fusion of feature images from different levels, used to compensate for the semantic information of the shallow feature image; the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output;
S2.2: the feature map output in S2.1 is combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature map is passed through a 1 × 1 convolution and taken as this layer's output;
S2.3: the feature image output in S2.2 is upsampled by a factor of two via bilinear interpolation ("upsample by 2" in FIG. 2), combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output;
S2.4: the feature image output in S2.3 is upsampled by a factor of two via bilinear interpolation, combined with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and the combined feature image is passed through a 1 × 1 convolution and taken as this layer's output. A sketch of one such fusion stage follows.
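Each of the four stages S2.1–S2.4 repeats one pattern: optionally upsample the deeper map by a factor of two, fuse it with the shallower map, and project with a 1 × 1 convolution. A hedged sketch of one such stage, with concatenation assumed as the fusion operator ("C" in FIG. 2) and placeholder channel counts:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionStage(nn.Module):
    """One S2 fusion stage: optionally upsample the deeper map by 2x
    (bilinear, as in S2.3/S2.4), concatenate it with the shallower map
    ("C" in FIG. 2), and project with a 1x1 convolution. Channel counts
    are placeholders; the patent specifies the operations, not widths."""
    def __init__(self, deep_ch, shallow_ch, out_ch, upsample=False):
        super().__init__()
        self.upsample = upsample
        self.proj = nn.Conv2d(deep_ch + shallow_ch, out_ch, kernel_size=1)

    def forward(self, deep, shallow):
        if self.upsample:
            deep = F.interpolate(deep, scale_factor=2, mode='bilinear',
                                 align_corners=False)
        return self.proj(torch.cat([deep, shallow], dim=1))

# usage: fuse a deep 1/8-resolution map into a shallow 1/4-resolution map
stage = FusionStage(deep_ch=256, shallow_ch=256, out_ch=256, upsample=True)
out = stage(torch.randn(1, 256, 32, 32), torch.randn(1, 256, 64, 64))
print(out.shape)  # torch.Size([1, 256, 64, 64])
```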
S3: boundary information of the multi-stage processed feature images is learned through boundary refinement; the images are fused and restored to the original image resolution to generate the predicted segmentation map, as shown inside the dashed box "S3" in FIG. 2;
S3.1: the deepest output feature image is upsampled by a factor of four via bilinear interpolation;
S3.2: the layer output feature image from S2.1 is upsampled by a factor of four via bilinear interpolation;
S3.3: the layer output feature image from S2.2 is upsampled by a factor of four via bilinear interpolation;
S3.4: the layer output feature image from S2.3 is upsampled by a factor of two via bilinear interpolation;
S3.5: the boundaries of the output feature images from S2.4, S3.1, S3.2, S3.3, and S3.4 are refined by a BR (Boundary Refinement) module; the refined images are fused and then processed by a 3 × 3 convolution followed by four-times bilinear upsampling, restoring the original image resolution and yielding the final predicted segmentation map. A sketch of the BR module and the fusion head follows.
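The patent names the BR module without disclosing its internals; the residual form below (x plus a conv–ReLU–conv branch), familiar from boundary-refinement blocks in the literature, is an assumption, as is summation as the fusion operator across levels:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryRefine(nn.Module):
    """BR block. The residual form x + conv(relu(conv(x))) is a common
    boundary-refinement design and is assumed here."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return x + self.conv2(F.relu(self.conv1(x)))

class FusionHead(nn.Module):
    """S3.5 sketch: refine each level (all brought to 1/4 of the input
    resolution by S3.1-S3.4), fuse by summation (an assumption), then a
    3x3 convolution and 4x bilinear upsampling back to full resolution."""
    def __init__(self, ch=256, num_classes=21):
        super().__init__()
        self.br = BoundaryRefine(ch)  # shared across levels for brevity
        self.head = nn.Conv2d(ch, num_classes, 3, padding=1)

    def forward(self, levels):
        fused = sum(self.br(f) for f in levels)
        return F.interpolate(self.head(fused), scale_factor=4,
                             mode='bilinear', align_corners=False)
```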
S4: the network is trained with a cross-entropy loss function, and model performance is evaluated with mIoU.
S4.1: the cross-entropy loss between the predicted segmentation map and the ground-truth segmentation map in the data set is computed, the parameter weights in the model are updated with the backpropagation algorithm, and the final semantic segmentation model is obtained by training on the training set of the data set;
S4.2: the prediction performance of the model is evaluated on the test set of the data set using pixel accuracy and the mIoU metric. A sketch of these two steps follows.
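A sketch of the training and evaluation steps in S4; the optimizer, the ignore label (255, the standard void label in PASCAL VOC), and the confusion-matrix bookkeeping are assumptions beyond what the patent states:

```python
import torch
import torch.nn as nn

def train_step(model, images, labels, optimizer):
    """S4.1 sketch: cross-entropy loss between the predicted segmentation
    map and the ground-truth map, with backpropagation. The optimizer and
    the ignore label (255, the PASCAL VOC void label) are assumptions."""
    criterion = nn.CrossEntropyLoss(ignore_index=255)
    logits = model(images)            # (N, C, H, W) class scores
    loss = criterion(logits, labels)  # labels: (N, H, W) class indices
    optimizer.zero_grad()
    loss.backward()                   # backpropagation updates the weights
    optimizer.step()
    return loss.item()

def mean_iou(confusion):
    """S4.2 sketch: mIoU from a (C, C) confusion matrix accumulated over
    the test set; per-class IoU is intersection over union, then averaged."""
    intersection = confusion.diag().float()
    union = (confusion.sum(0) + confusion.sum(1) - confusion.diag()).float()
    return (intersection / union.clamp(min=1)).mean().item()
```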
The following experiments were conducted in accordance with the method of the present invention to illustrate its prediction performance.
Test environment: Ubuntu 16.04; NVIDIA GTX 1080Ti GPU; Python 3.5; TensorFlow framework.
Test data set: the selected data set is PASCAL VOC 2012, an image data set for image segmentation in computer vision tasks, covering four categories (vehicles, household objects, animals, and persons), further subdivided into 20 sub-classes plus a background class. The data set contains 1464 training images, 1449 validation images, and 1456 test images.
Test metric: the invention uses mIoU as the performance evaluation metric. mIoU is the mean, taken over all classes, of the ratio of the intersection of the predicted and ground-truth regions to their union. Comparing this metric with values computed by prior-art algorithms demonstrates the improved results obtained in the field of image semantic segmentation.
The test results were as follows:
table 1 shows the performance comparison of the method under different cavity convolution cavity rates of the deformable space pyramid pooling module design, and the proper parameter setting can improve the network performance through comparison
Figure BDA0002996698810000051
Table 2. Performance comparison of the invention with the multi-level feature information fusion and boundary refinement modules added, demonstrating the effectiveness of the network design.
[Table 2 is rendered as an image in the original publication; its data are not reproduced here.]
Table 3. Performance comparison of the invention with other algorithms on the PASCAL VOC 2012 data set.
[Table 3 is rendered as an image in the original publication; its data are not reproduced here.]
As can be seen from the comparison data, the mIoU of the invention is significantly improved compared with existing algorithms.
It is emphasized that the examples described herein are illustrative and are intended to enable one of ordinary skill in the art to understand and practice the disclosed invention; the invention is not limited to the examples given in the detailed description. All equivalent changes or modifications made according to the spirit of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. An image semantic segmentation method with improved dilated convolution and multi-level feature information fusion, characterized in that the method comprises the following steps:
S1: extracting image features in a deep convolutional neural network by using an improved dilated convolution method;
S2: cascading and fusing the extracted deep feature images with shallow feature images to compensate for the loss of spatial information;
S3: learning boundary information of the multi-stage processed feature images through boundary refinement, fusing them, and restoring the original image resolution to generate a predicted segmentation map;
S4: training the network with a cross-entropy loss function and evaluating model performance with mIoU;
the specific implementation of S1 comprises the following steps:
S1.1, using ResNet-101 as the backbone network and attaching an improved skip-connected dilated convolution module after the third downsampling module, the module comprising three consecutive dilated convolution layers, choosing the dilation rate of the convolution layers according to the resolution of the input image, and establishing skip connections between different convolution layers in the forward direction, thereby further enlarging the receptive field without further shrinking the image and reducing information loss;
S1.2, inputting the image processed by the skip-connected dilated convolution module into an improved deformable spatial pyramid pooling module, combining the ability of deformable convolution to adapt the receptive field to target scale changes and flexibly aggregate information with the ability of multi-scale dilated convolution on a regular sampling grid to classify arbitrary regions of the image effectively, and improving the model's capacity to learn target deformation at low cost in model complexity;
S1.3, retaining the feature information of different levels contained in feature images of different resolutions at different stages of the downsampling process.
2. The image semantic segmentation method with improved dilated convolution and multi-level feature information fusion according to claim 1, characterized in that the specific implementation of S2 comprises the following steps:
S2.1, passing the feature layer processed by the skip-connected dilated convolution module through a 1 × 1 convolution, combining it with the feature image extracted at the deepest layer to compensate for the semantic information of the shallow feature image, and passing the combined feature image through a 1 × 1 convolution as this layer's output;
S2.2, combining the feature map output in S2.1 with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and passing the combined feature map through a 1 × 1 convolution as this layer's output;
S2.3, upsampling the feature image output in S2.2 by a factor of two via bilinear interpolation, combining it with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and passing the combined feature image through a 1 × 1 convolution as this layer's output;
S2.4, upsampling the feature image output in S2.3 by a factor of two via bilinear interpolation, combining it with the feature image output by the preceding module to compensate for the semantic information of the shallow feature image, and passing the combined feature image through a 1 × 1 convolution as this layer's output.
3. The image semantic segmentation method with improved dilated convolution and multi-level feature information fusion according to claim 1, characterized in that, in order to fuse feature images of different levels carrying different spatial and semantic information at the same resolution, the output feature images of multiple levels are uniformly brought to one quarter of the original image resolution by bilinear interpolation, and the specific implementation of S3 comprises the following steps:
S3.1, upsampling the deepest output feature image by a factor of four via bilinear interpolation;
S3.2, upsampling the layer output feature image from S2.1 by a factor of four via bilinear interpolation;
S3.3, upsampling the layer output feature image from S2.2 by a factor of four via bilinear interpolation;
S3.4, upsampling the layer output feature image from S2.3 by a factor of two via bilinear interpolation;
S3.5, refining the boundaries of the output feature images from S2.4, S3.1, S3.2, S3.3, and S3.4 with a BR module, fusing them, then applying a 3 × 3 convolution followed by four-times bilinear upsampling, and restoring the original image resolution to obtain the final predicted segmentation map.
4. The image semantic segmentation method with improved dilated convolution and multi-level feature information fusion according to claim 1, characterized in that the specific implementation of S4 comprises the following steps:
S4.1, computing the cross-entropy loss between the predicted segmentation map and the ground-truth segmentation map in the data set, updating the parameter weights in the model with the backpropagation algorithm, and training on the training set of the data set to obtain the final semantic segmentation model;
S4.2, evaluating the prediction performance of the model with the mIoU metric on the test set of the data set.
CN202110344461.0A 2021-03-29 2021-03-29 Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion Active CN113033570B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110344461.0A CN113033570B (en) 2021-03-29 2021-03-29 Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110344461.0A CN113033570B (en) 2021-03-29 2021-03-29 Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion

Publications (2)

Publication Number Publication Date
CN113033570A (en) 2021-06-25
CN113033570B true (en) 2022-11-11

Family

ID=76452856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110344461.0A Active CN113033570B (en) 2021-03-29 2021-03-29 Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion

Country Status (1)

Country Link
CN (1) CN113033570B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113506310B (en) * 2021-07-16 2022-03-01 首都医科大学附属北京天坛医院 Medical image processing method and device, electronic equipment and storage medium
CN113658200B (en) * 2021-07-29 2024-01-02 东北大学 Edge perception image semantic segmentation method based on self-adaptive feature fusion
CN113762476B (en) * 2021-09-08 2023-12-19 中科院成都信息技术股份有限公司 Neural network model for text detection and text detection method thereof
CN113762396A (en) * 2021-09-10 2021-12-07 西南科技大学 Two-dimensional image semantic segmentation method
CN113920099B (en) * 2021-10-15 2022-08-30 深圳大学 Polyp segmentation method based on non-local information extraction and related components
CN113936006A (en) * 2021-10-29 2022-01-14 天津大学 Segmentation method and device for processing high-noise low-quality medical image
CN114419449B (en) * 2022-03-28 2022-06-24 成都信息工程大学 Self-attention multi-scale feature fusion remote sensing image semantic segmentation method
CN115829980B (en) * 2022-12-13 2023-07-25 深圳核韬科技有限公司 Image recognition method, device and equipment for fundus photo and storage medium
CN117211758B (en) * 2023-11-07 2024-04-02 克拉玛依市远山石油科技有限公司 Intelligent drilling control system and method for shallow hole coring

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876793A (en) * 2018-04-13 2018-11-23 北京迈格威科技有限公司 Semantic segmentation methods, devices and systems and storage medium
CN109190752A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 The image, semantic dividing method of global characteristics and local feature based on deep learning
CN109190626A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 A kind of semantic segmentation method of the multipath Fusion Features based on deep learning
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN110232394A (en) * 2018-03-06 2019-09-13 华南理工大学 A kind of multi-scale image semantic segmentation method
CN110706239A (en) * 2019-09-26 2020-01-17 哈尔滨工程大学 Scene segmentation method fusing full convolution neural network and improved ASPP module
CN111242138A (en) * 2020-01-11 2020-06-05 杭州电子科技大学 RGBD significance detection method based on multi-scale feature fusion
CN112446890A (en) * 2020-10-14 2021-03-05 浙江工业大学 Melanoma segmentation method based on void convolution and multi-scale fusion

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711413B (en) * 2018-12-30 2023-04-07 陕西师范大学 Image semantic segmentation method based on deep learning
CN110188817B (en) * 2019-05-28 2021-02-26 厦门大学 Real-time high-performance street view image semantic segmentation method based on deep learning
CN111369563B (en) * 2020-02-21 2023-04-07 华南理工大学 Semantic segmentation method based on pyramid void convolutional network

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110232394A (en) * 2018-03-06 2019-09-13 华南理工大学 A kind of multi-scale image semantic segmentation method
CN108876793A (en) * 2018-04-13 2018-11-23 北京迈格威科技有限公司 Semantic segmentation methods, devices and systems and storage medium
CN109190752A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 The image, semantic dividing method of global characteristics and local feature based on deep learning
CN109190626A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 A kind of semantic segmentation method of the multipath Fusion Features based on deep learning
CN109325534A (en) * 2018-09-22 2019-02-12 天津大学 A kind of semantic segmentation method based on two-way multi-Scale Pyramid
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN110706239A (en) * 2019-09-26 2020-01-17 哈尔滨工程大学 Scene segmentation method fusing full convolution neural network and improved ASPP module
CN111242138A (en) * 2020-01-11 2020-06-05 杭州电子科技大学 RGBD significance detection method based on multi-scale feature fusion
CN112446890A (en) * 2020-10-14 2021-03-05 浙江工业大学 Melanoma segmentation method based on void convolution and multi-scale fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Peng Zhou et al.; "DeepStrip: High Resolution Boundary Refinement"; arXiv; 2020-03-25; entire document *
Sachin Mehta et al.; "ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation"; SpringerLink; 2018-12-31; entire document *
Liang-Chieh Chen et al.; "Rethinking Atrous Convolution for Semantic Image Segmentation"; arXiv; 2017-12-05; entire document *

Also Published As

Publication number Publication date
CN113033570A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN113033570B (en) Image semantic segmentation method with improved dilated convolution and multi-level feature information fusion
CN109461157B (en) Image semantic segmentation method based on multistage feature fusion and Gaussian conditional random field
CN112329658B (en) Detection algorithm improvement method for YOLOV3 network
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN113469094A (en) Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
CN112396607A (en) Streetscape image semantic segmentation method for deformable convolution fusion enhancement
CN112381097A (en) Scene semantic segmentation method based on deep learning
CN112435282A (en) Real-time binocular stereo matching method based on self-adaptive candidate parallax prediction network
CN112232134B (en) Human body posture estimation method based on hourglass network and attention mechanism
CN113870335A (en) Monocular depth estimation method based on multi-scale feature fusion
CN112329801B (en) Convolutional neural network non-local information construction method
CN113743269B (en) Method for recognizing human body gesture of video in lightweight manner
CN112131959A (en) 2D human body posture estimation method based on multi-scale feature reinforcement
CN113240683B (en) Attention mechanism-based lightweight semantic segmentation model construction method
CN111768415A (en) Image instance segmentation method without quantization pooling
CN113516133A (en) Multi-modal image classification method and system
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
CN115410030A (en) Target detection method, target detection device, computer equipment and storage medium
CN113066089A (en) Real-time image semantic segmentation network based on attention guide mechanism
CN114494284B (en) Scene analysis model and method based on explicit supervision area relation
CN117011655A (en) Adaptive region selection feature fusion based method, target tracking method and system
CN112232102A (en) Building target identification method and system based on deep neural network and multitask learning
CN113450364B (en) Tree-shaped structure center line extraction method based on three-dimensional flux model
CN116978057A (en) Human body posture migration method and device in image, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant