CN112508960A - Low-precision image semantic segmentation method based on improved attention mechanism

Low-precision image semantic segmentation method based on improved attention mechanism

Info

Publication number: CN112508960A
Application number: CN202011521916.3A
Authority: CN (China)
Prior art keywords: attention, feature, network, low, module
Legal status: Pending
Original language: Chinese (zh)
Inventors: 陈纯玉, 吴忻生, 陈安, 王博
Assignee: South China University of Technology (SCUT)
Filing/priority date: 2020-12-21
Publication date: 2021-03-16

Classifications

    • G06T 7/10 — Image analysis; Segmentation; Edge detection
    • G06N 3/045 — Neural networks; Architectures; Combinations of networks
    • G06T 5/70 — Image enhancement or restoration; Denoising; Smoothing

Abstract

The invention discloses a low-precision image semantic segmentation method based on an improved attention mechanism, which comprises the following steps: S1, collecting images under different scenes to form a data set, and dividing the data set into a training set, a verification set and a test set; S2, performing feature extraction on the preprocessed training set images with an improved MobileNet v2 network, and up-sampling or down-sampling the feature maps of different layers to adjust their resolution; S3, aggregating the multi-scale information of the up-sampled or down-sampled feature maps from S2 with a GASPP structure carrying a global attention feature module; S4, fusing the low-level detail features extracted by the MobileNet v2 backbone network with the multi-scale features aggregated in step S3, and processing the resulting fused features through a decoder module with a selective attention mechanism; and S5, decoding the feature map through bilinear interpolation up-sampling to obtain the final segmentation image.

Description

Low-precision image semantic segmentation method based on improved attention mechanism
Technical Field
The invention belongs to the field of deep learning and computer vision, and particularly relates to a low-precision image semantic segmentation method based on an improved attention mechanism.
Background
Since the beginning of the 21st century, how to realize intelligent driving has become an increasingly popular topic. In the common scenarios faced by intelligent vehicles, semantic segmentation is a key technology for identifying different objects in urban roads, such as obstacles, drivable areas and traffic lights. Semantic segmentation is classification at the pixel level: pixels belonging to the same class are grouped into one class, so semantic segmentation understands an image from the pixel level.
Before deep learning methods based on convolutional neural networks became mainstream, semantic segmentation often relied on methods such as TextonForest and random forest classifiers. These methods are simple in design and easy to implement, but their feature extraction is largely hand-crafted and their classification performance is poor.
Deep learning methods have achieved great success in semantic segmentation, and they can be summarized into several ideas for solving the problem.
In 2014, the Fully Convolutional Network (FCN) was introduced. The FCN replaces the fully connected layers of a classification network with convolutions, making input of arbitrary image size possible. First, an RGB image is fed into the convolutional neural network, and a series of feature maps is obtained through repeated convolution and pooling. A deconvolution (transposed convolution) layer then up-samples the feature map produced by the last convolution layer back to the original image size, so that the spatial position of each pixel value in the original image is preserved while a prediction is produced for each pixel of the feature map. Finally, pixel-by-pixel classification is performed on the up-sampled feature map, and the softmax classification loss is computed pixel by pixel.
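As a minimal illustration of this pipeline (a toy sketch with assumed layer sizes, not the original FCN nor the network of this patent), the following PyTorch snippet shows a fully convolutional head that up-samples coarse features back to the input resolution and computes a per-pixel softmax loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy FCN-style model: strided convolutional encoder, 1x1 pixel-wise classifier,
# bilinear up-sampling back to the input size, per-pixel cross-entropy loss.
class ToyFCN(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        self.features = nn.Sequential(  # stride-8 encoder (illustrative sizes)
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(128, num_classes, 1)  # per-pixel class scores

    def forward(self, x):
        h, w = x.shape[-2:]
        x = self.classifier(self.features(x))
        # up-sample the score map to the original image size
        return F.interpolate(x, size=(h, w), mode="bilinear", align_corners=False)

img = torch.randn(1, 3, 64, 64)
target = torch.randint(0, 19, (1, 64, 64))
loss = F.cross_entropy(ToyFCN()(img), target)  # softmax loss computed pixel by pixel
```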
The encoder-decoder is an FCN-based structure. The encoder gradually reduces the spatial dimensions through pooling, while the decoder gradually restores the spatial dimensions and detail information. There are usually also shortcut connections (i.e., connections across layers) from encoder to decoder.
The dilated/atrous convolution architecture replaces pooling: on the one hand it preserves spatial resolution, and on the other hand it integrates context information well because it enlarges the receptive field.
There are also methods that post-process the segmentation results, namely Conditional Random Fields (CRFs), to improve the segmentation. The DeepLab series of papers basically adopts this post-processing method, which can further improve the segmentation result.
Existing networks such as U-Net and VGG suffer from insufficient real-time performance, while lightweight networks such as the MobileNet series suffer from insufficient accuracy. How to improve accuracy while ensuring real-time image segmentation is the key problem addressed by this method.
Disclosure of Invention
The invention aims to provide a low-precision image semantic segmentation method based on an improved attention mechanism, which can improve the image segmentation accuracy in a low-precision network.
The object of the invention is achieved by at least one of the following solutions.
A low-precision image semantic segmentation method based on an improved attention mechanism comprises the following steps:
s1, collecting and preprocessing images in different scenes, labeling the images to form a data set, and dividing the data set into a training set, a verification set and a test set;
s2, performing feature extraction on the preprocessed training set images by using an improved MobileNet v2 network, and up-sampling or down-sampling the feature maps of different layers to adjust their resolution;
s3, aggregating the multi-scale information of the feature graph after up-sampling or down-sampling in the step S2 by using a GASPP structure network with a global attention feature module;
s4, fusing the low-level detail features extracted by the MobileNet v2 network with the multi-scale features aggregated in step S3, and processing the resulting fused features through a decoder module with a selective attention mechanism (SAM);
and S5, decoding the feature map through bilinear interpolation upsampling to obtain a final segmentation image.
Preferably, the improved MobileNet v2 network described in step S2 is a MobileNet v2 network with the last three layers deleted.
Preferably, the GASPP structure network with a global attention feature module in step S3 is built on the atrous spatial pyramid pooling (ASPP) module with atrous convolution of DeepLabv3+, in which the global average pooling operation of the original ASPP is replaced;
each branch of the GASPP structure network contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 × 3 convolution modules are added to the atrous convolution branches, and the original 1 × 1 convolution is retained.
Preferably, the improved MobileNet v2 network only retains one two-dimensional convolution layer and seven linear bottleneck layers. The GAM takes the last-layer feature map of the MobileNet v2 backbone network as input and reshapes it to C × HW, where the parameters C, W, H respectively denote the number of channels, the width and the height of the feature map. Two global attention masks of sizes C × HW and HW × C are extracted by mapping transformations, and the dot product between the two masks, which captures the correlation between features, is used as the input of the normalization function sparsemax, shown in formula (1):

sparsemax_i(z) = max(0, z_i − τ(z))   (1)

where the attention feature map vector is z = [z_1, z_2, …, z_K], z_k denotes the attention feature vector of the k-th channel, and z_(1) ≥ z_(2) ≥ … ≥ z_(K) are the vector values sorted in descending order. The threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)   (2)

where

f(z) = max{ k ∈ {1, …, K} : 1 + k·z_(k) > Σ_{j=1}^{k} z_(j) }   (3)

Here K denotes the total number of channels, j denotes the current channel index, z_(j) and z_(k) denote the sorted attention feature map values at positions j and k, and f(z) denotes the largest index k satisfying the condition in formula (3).
Preferably, GASPP is computed as follows:

Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X)   (4)

where Z denotes the output of GASPP, GAM(X) denotes the global attention operation, P_k(X) denotes a convolution with kernel size k × k, P_{k,r}(X) denotes an atrous convolution with kernel size k × k and dilation rate r, and ⊙ denotes merging by channel. After all feature maps are concatenated, the concatenated feature map is passed through a 1 × 1 convolution to reduce the number of channels.
Preferably, in step S4 the fusion of the low-level features and the multi-scale features is performed by a decoder module SAM that incorporates a squeeze-and-excitation network (SENet). After the selective attention calculation, the SAM performs an up-sampling operation that restores the output to the input resolution, from which a pixel distribution map is obtained.
Preferably, the selective attention module in the decoder module with a selective attention mechanism (SAM) is divided into two different branches: one branch comes from the multi-scale aggregated high-level feature information of the GASPP structure network with the global attention feature module; the other branch comes from the detail features of the MobileNet v2 network, with a 1 × 1 convolution layer used to reduce the number of channels.
Preferably, in step S4 the decoder-fused feature map is decoded by bilinear interpolation up-sampling, which performs linear calculations based on the values of known points.
Preferably, the linear calculation is as follows:

f(P) ≈ ((y_2 − y) / (y_2 − y_1)) · f(R_1) + ((y − y_1) / (y_2 − y_1)) · f(R_2)   (5)

where R_1 = (x, y_1) and R_2 = (x, y_2) are intermediate points obtained by first interpolating along the x-axis:

f(R_1) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_11) + ((x − x_1) / (x_2 − x_1)) · f(Q_21)   (6)

f(R_2) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_12) + ((x − x_1) / (x_2 − x_1)) · f(Q_22)   (7)

where the four corner points Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1), Q_22 = (x_2, y_2) are known points, P(x, y) is the point to be evaluated, x denotes the x-axis coordinate and y denotes the y-axis coordinate.
Preferably, the preprocessing process mainly comprises flipping, rotating, scaling and cropping.
Compared with the prior art, the invention has the following beneficial effects:
(1) Aiming at the insufficient semantic segmentation accuracy of low-precision networks, the method designs GASPP, an ASPP structure with global attention information, and a decoder module SAM, effectively improving the accuracy of the algorithm.
(2) The method can effectively segment roads in various scenes while suppressing noise; it performs semantic segmentation of lane pictures quickly and accurately, adapts well to blurred lane lines, rain, heavy fog, large occluded areas and similar environments, and has practical significance in traffic application scenarios.
Drawings
FIG. 1 is a schematic structural diagram of the low-precision image semantic segmentation method based on an improved attention mechanism according to this embodiment;
FIG. 2 is a diagram of the GASPP network model structure according to this embodiment;
FIG. 3 is a diagram of the structure of the GAM module according to this embodiment;
FIG. 4 is a flow chart of the decoder module with selective attention according to this embodiment;
FIG. 5 is a schematic overall flow chart of the low-precision image semantic segmentation method based on an improved attention mechanism according to this embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a low-precision image semantic segmentation method based on an improved attention mechanism. The neural network model structure is shown in FIG. 1 and mainly comprises a backbone network, a GASPP structure network with a global attention feature module, and a decoder module with a selective attention mechanism.
As shown in FIG. 5, the low-precision image semantic segmentation method based on an improved attention mechanism of this embodiment includes the following steps:
step 1, collecting images of lanes under different scenes, labeling vehicles, roads, obstacles and the like of the images respectively to form a data set, and processing the data set according to the following steps of 8: 1: the proportion of 1 is divided into a training set, a verification set and a test set, wherein the training set is used for training the deep convolutional network, the verification set is used for selecting an optimal training model, and the test set is used for testing the performance of the design model at the later stage.
The preprocessing process mainly includes flipping, rotation, scaling and cropping. These operations improve the accuracy of the model, enhance its stability, prevent over-fitting, and, through controlled scale transformations, enhance the robustness of the data set.
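A minimal sketch of such a joint image/label preprocessing pipeline is shown below, using torchvision; the specific parameter values (rotation range, output size, the 255 ignore label) are illustrative assumptions, not values taken from the patent:

```python
import random
import torchvision.transforms as T
import torchvision.transforms.functional as TF

def joint_transform(image, mask, out_size=(512, 1024)):
    """Apply the same random flip / rotation / scale / crop to image and label mask."""
    if random.random() < 0.5:                        # random horizontal flip
        image, mask = TF.hflip(image), TF.hflip(mask)
    angle = random.uniform(-10, 10)                  # small random rotation
    image = TF.rotate(image, angle)
    mask = TF.rotate(mask, angle, interpolation=TF.InterpolationMode.NEAREST)
    scale = random.uniform(0.5, 2.0)                 # random rescaling
    h, w = int(out_size[0] * scale), int(out_size[1] * scale)
    image = TF.resize(image, [h, w])
    mask = TF.resize(mask, [h, w], interpolation=TF.InterpolationMode.NEAREST)
    # pad if the rescaled image is smaller than the crop window, then random-crop
    pad_w, pad_h = max(out_size[1] - w, 0), max(out_size[0] - h, 0)
    image = TF.pad(image, [0, 0, pad_w, pad_h])
    mask = TF.pad(mask, [0, 0, pad_w, pad_h], fill=255)  # 255 = ignore label
    i, j, th, tw = T.RandomCrop.get_params(image, out_size)
    return TF.crop(image, i, j, th, tw), TF.crop(mask, i, j, th, tw)
```

Masks are resized and rotated with nearest-neighbor interpolation so class labels are never blended.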
Step 2, using the improved MobileNet v2 network to perform feature extraction on the preprocessed training set pictures, and up-sampling or down-sampling the feature maps of different layers to adjust their resolution.
the improved MobileNet v2 network is obtained by improvement and simplification based on MobileNet v2, the original MobileNet v2 network structure comprises a two-dimensional convolution layer, seven linear bottleneck layers, a 1 × 1 two-dimensional convolution layer, a 7 × 7 average pooling layer, a 1 × 1 two-dimensional convolution layer and three deleted layers, the obtained network model parameters are shown in table 2, only one two-dimensional convolution layer and seven linear bottleneck layers are included, the network calculation amount is greatly reduced, and the image segmentation and feature extraction speed is higher.
TABLE 2 network model parameters
[Table 2 is provided as an image in the original filing and is not reproduced here.]
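One plausible way to realize this truncation is sketched below with torchvision; the layer indices follow torchvision's MobileNetV2 layout and are an assumption, since the patent gives no implementation details:

```python
import torch
import torchvision

# Sketch: keep only the stem convolution and the seven inverted-residual
# (linear bottleneck) stages of MobileNetV2, dropping the final 1x1 conv,
# the 7x7 average pooling and the classifier. In torchvision's layout,
# features[0] is the stem conv and features[1:18] are the bottleneck blocks.
mobilenet = torchvision.models.mobilenet_v2(weights=None)
backbone = torch.nn.Sequential(*list(mobilenet.features.children())[:18])

x = torch.randn(1, 3, 512, 1024)
print(backbone(x).shape)  # (1, 320, 16, 32): last-layer feature map fed to GAM/GASPP
```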
The GASPP structure network with a global attention feature module is built on the existing atrous spatial pyramid pooling (ASPP) module with atrous convolution of DeepLabv3+, in which the global average pooling operation of the original ASPP is replaced. Each branch of the GASPP adopted by the invention contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 × 3 convolution modules are added to the atrous convolution branches, and the original 1 × 1 convolution is retained; the structure is shown in FIG. 2.
The calculation formula of GASPP is as follows:

Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X)   (1)

where Z denotes the output of GASPP, GAM(X) denotes the global attention operation, P_k(X) denotes a convolution with kernel size k × k, P_{k,r}(X) denotes an atrous convolution with kernel size k × k and dilation rate r, and ⊙ denotes merging by channel. After all feature maps are concatenated, the concatenated feature map is passed through a 1 × 1 convolution to reduce the number of channels to 128.
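A minimal PyTorch sketch of a GASPP-style module following equation (1) is given below. The branch composition, dilation rates and channel counts follow the text; the exact ordering of the cascaded convolutions and the 320 input channels are assumptions, and the GAM branch is stubbed with a plain 1 × 1 convolution here (the real GAM is sketched separately further down):

```python
import torch
import torch.nn as nn

def conv(in_c, out_c, k, dilation=1):
    # k x k convolution + BN + ReLU; padding chosen to keep the spatial size
    pad = dilation * (k // 2)
    return nn.Sequential(
        nn.Conv2d(in_c, out_c, k, padding=pad, dilation=dilation, bias=False),
        nn.BatchNorm2d(out_c), nn.ReLU(inplace=True))

class GASPP(nn.Module):
    """Sketch of Z = GAM(X) (+) P3,6(P3(X)) (+) P3,12(P5(X)) (+) P3,18(P7(X)) (+) P1(X)."""
    def __init__(self, in_c=320, branch_c=256, out_c=128, gam=None):
        super().__init__()
        self.b1 = conv(in_c, branch_c, 1)                                # P1
        self.b2 = nn.Sequential(conv(in_c, branch_c, 3),                 # P3 -> P3, rate 6
                                conv(branch_c, branch_c, 3, dilation=6))
        self.b3 = nn.Sequential(conv(in_c, branch_c, 5),                 # P5 -> P3, rate 12
                                conv(branch_c, branch_c, 3, dilation=12))
        self.b4 = nn.Sequential(conv(in_c, branch_c, 7),                 # P7 -> P3, rate 18
                                conv(branch_c, branch_c, 3, dilation=18))
        self.gam = gam or conv(in_c, branch_c, 1)      # placeholder for the GAM branch
        self.project = conv(5 * branch_c, out_c, 1)    # 1x1 conv reduces channels to 128

    def forward(self, x):
        z = torch.cat([self.gam(x), self.b2(x), self.b3(x),
                       self.b4(x), self.b1(x)], dim=1)  # merge branches by channel
        return self.project(z)

y = GASPP()(torch.randn(1, 320, 16, 32))
print(y.shape)  # (1, 128, 16, 32)
```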
As shown in FIG. 3, the GAM takes the last-layer feature map of the improved MobileNet v2 network as input and reshapes it to C × HW, where the parameters C, W, H respectively denote the number of channels, the width and the height of the feature map. Two global attention masks of sizes C × HW and HW × C are extracted by mapping transformations, and the dot product between the two masks, which captures the correlation between features, is used as the input of the normalization function sparsemax, shown in formula (2):

sparsemax_i(z) = max(0, z_i − τ(z))   (2)

where the attention feature map vector is z = [z_1, z_2, …, z_K], z_k denotes the attention feature vector of the k-th channel, and z_(1) ≥ z_(2) ≥ … ≥ z_(K) are the vector values sorted in descending order. The threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)   (3)

where

f(z) = max{ k ∈ {1, …, K} : 1 + k·z_(k) > Σ_{j=1}^{k} z_(j) }

Here K denotes the total number of channels, j denotes the current channel index, z_(j) and z_(k) denote the sorted attention feature map values at positions j and k, and f(z) denotes the largest index k satisfying the condition above.
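The sparsemax projection can be implemented directly from these formulas. The following is a generic PyTorch sketch of the published sparsemax function (Martins & Astudillo, 2016), not code from the patent:

```python
import torch

def sparsemax(z):
    """Sparsemax over the last dimension: max(0, z - tau(z)), with tau(z)
    chosen so that the nonzero outputs sum to 1 (cf. formulas (2)-(3))."""
    z_sorted, _ = torch.sort(z, dim=-1, descending=True)
    cumsum = z_sorted.cumsum(dim=-1)
    k = torch.arange(1, z.size(-1) + 1, device=z.device, dtype=z.dtype)
    support = 1 + k * z_sorted > cumsum       # condition 1 + k*z_(k) > sum_{j<=k} z_(j)
    f_z = support.sum(dim=-1, keepdim=True)   # f(z): size of the support
    tau = (cumsum.gather(-1, f_z - 1) - 1) / f_z.to(z.dtype)  # threshold tau(z)
    return torch.clamp(z - tau, min=0)

p = sparsemax(torch.tensor([0.5, 1.2, 0.1, 2.0]))
print(p, p.sum())  # tensor([0.0, 0.1, 0.0, 0.9]), summing to 1
```

Unlike softmax, sparsemax assigns exactly zero weight to channels below the threshold, which is what makes the resulting attention map sparse.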
The feature maps generated by the two branches (the backbone network and the GASPP) carry different levels of information: the backbone network provides rich low-level detail information, while the GASPP mainly provides sufficient high-level semantic information. A decoder module SAM is used to fuse the low-level features and the multi-scale features. The SAM is an improvement on SENet, and its structure is shown in FIG. 4. The selective attention module in the SAM can be divided into two different branches: one branch comes from the multi-scale aggregated high-level feature information of the GASPP module; the other branch comes from the detail features of the backbone network, with a 1 × 1 convolution layer used to reduce the number of channels. The fused features are merged by channel, compressed with a global average pooling layer, expanded through a fully connected layer and a ReLU layer, and recalibrated through a fully connected layer and a Sigmoid layer. After the selective attention calculation of the SAM, an up-sampling operation restores the output to the input resolution, from which a pixel distribution map is obtained.
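A minimal sketch of such an SENet-style selective-attention decoder is shown below; the channel sizes (24 low-level channels, 128 GASPP channels, reduction ratio 16) are illustrative assumptions, not values stated in the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAM(nn.Module):
    """Sketch: concatenate low-level backbone features with GASPP features,
    squeeze with global average pooling, excite with FC-ReLU-FC-Sigmoid,
    and recalibrate the channels of the fused map."""
    def __init__(self, low_c=24, high_c=128, mid_c=48, reduction=16):
        super().__init__()
        self.reduce_low = nn.Conv2d(low_c, mid_c, 1)  # 1x1 conv on the low-level branch
        c = mid_c + high_c
        self.fc = nn.Sequential(                       # squeeze-and-excitation weights
            nn.Linear(c, c // reduction), nn.ReLU(inplace=True),
            nn.Linear(c // reduction, c), nn.Sigmoid())

    def forward(self, low, high):
        low = self.reduce_low(low)
        high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                             align_corners=False)     # bring both branches to one size
        x = torch.cat([low, high], dim=1)             # merge by channel
        w = self.fc(x.mean(dim=(2, 3)))               # global average pool -> weights
        return x * w[:, :, None, None]                # channel recalibration

fused = SAM()(torch.randn(1, 24, 128, 256), torch.randn(1, 128, 16, 32))
print(fused.shape)  # (1, 176, 128, 256)
```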
The selected feature map is decoded through bilinear interpolation up-sampling, formula (4), to obtain the final segmentation image:

f(P) ≈ ((y_2 − y) / (y_2 − y_1)) · f(R_1) + ((y − y_1) / (y_2 − y_1)) · f(R_2)   (4)

where R_1 = (x, y_1) and R_2 = (x, y_2) are intermediate points obtained by first interpolating along the x-axis:

f(R_1) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_11) + ((x − x_1) / (x_2 − x_1)) · f(Q_21)   (5)

f(R_2) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_12) + ((x − x_1) / (x_2 − x_1)) · f(Q_22)   (6)

where the four corner points Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1), Q_22 = (x_2, y_2) are known points, P(x, y) is the point to be evaluated, x denotes the x-axis coordinate and y denotes the y-axis coordinate.
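As a worked example of formulas (4)-(6), the following small Python function evaluates one interpolated point; in a real network the same computation is applied to every output pixel, which is what `F.interpolate(..., mode="bilinear")` does:

```python
def bilinear(q11, q21, q12, q22, x1, x2, y1, y2, x, y):
    """Bilinear interpolation of P(x, y) from the four known corner values,
    following formulas (4)-(6)."""
    r1 = ((x2 - x) * q11 + (x - x1) * q21) / (x2 - x1)  # interpolate along x at y1
    r2 = ((x2 - x) * q12 + (x - x1) * q22) / (x2 - x1)  # interpolate along x at y2
    return ((y2 - y) * r1 + (y - y1) * r2) / (y2 - y1)  # interpolate along y

# corners 10, 20, 30, 40 on the unit square, queried at (0.25, 0.75)
print(bilinear(10.0, 20.0, 30.0, 40.0, 0, 1, 0, 1, 0.25, 0.75))  # 27.5
```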
Step 3, setting and modifying the parameters of the low-precision semantic segmentation network model based on the improved attention mechanism. The GPU used in this method is a GTX 2080Ti, and considering the resolution of the pictures, the input size is set to 512 × 1024.
To enlarge the data set when training the model, the RGB channels of the input original picture are first normalized by mean and variance, and augmentations such as random scaling in the range 0.5 to 2.0 and random horizontal flipping are applied during training. During testing, no random horizontal flipping, random cropping or similar operations are performed on the test images; they are fed into the network model after mean subtraction.
The network adopts the existing Poly learning rate policy. This policy does not use a fixed step size; instead, at every iteration the learning rate is decayed from the initial learning rate by an attenuation factor, according to the following formula:

lr = lr_base × (1 − epoch / max_epoch)^power

where epoch denotes the current iteration cycle in the training process and max_epoch denotes the maximum number of iteration cycles. The initial learning rate lr_base is set to 0.01 and the exponent power is set to 0.9.
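This schedule can be expressed in one line with PyTorch's `LambdaLR`; the sketch below uses a stand-in model purely for illustration:

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in model
opt = torch.optim.SGD(model.parameters(), lr=0.01)  # lr_base = 0.01
max_epoch, power = 100, 0.9

# Poly policy: lr = lr_base * (1 - epoch / max_epoch) ** power
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lambda epoch: (1 - epoch / max_epoch) ** power)

for epoch in range(3):
    opt.step()
    sched.step()
    print(epoch, opt.param_groups[0]["lr"])  # learning rate decays each epoch
```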
the embodiment adopts an ASPP structure GASPP with global attention information and a decoder module SAM, thereby effectively improving the algorithm precision. The method can effectively segment roads under various scenes and inhibit noise, consumes less time and has high accuracy for semantic segmentation of road pictures, has better adaptability in the environments of fuzzy roads, rainy days, heavy fog, large area rate and the like, and has practical significance in traffic application scenes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A low-precision image semantic segmentation method based on an improved attention mechanism is characterized by comprising the following steps:
s1, collecting and preprocessing images in different scenes, labeling the images to form a data set, and dividing the data set into a training set, a verification set and a test set;
s2, performing feature extraction on the preprocessed training set images by using an improved MobileNet v2 network, and up-sampling or down-sampling the feature maps of different layers to adjust their resolution;
s3, aggregating the multi-scale information of the feature graph after up-sampling or down-sampling in the step S2 by using a GASPP structure network with a global attention feature module;
s4, fusing the low-level detail features extracted by the MobileNet v2 network with the multi-scale features aggregated in step S3, and processing the resulting fused features through a decoder module with a selective attention mechanism (SAM);
and S5, decoding the feature map through bilinear interpolation upsampling to obtain a final segmentation image.
2. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 1, wherein the improved MobileNet v2 network in step S2 is a MobileNet v2 network with the last three layers deleted.
3. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 2, wherein the GASPP structure network with a global attention feature module in step S3 is built on the atrous spatial pyramid pooling (ASPP) module with atrous convolution of DeepLabv3+, in which the global average pooling operation of the original ASPP is replaced;
each branch of the GASPP structure network contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 × 3 convolution modules are added to the atrous convolution branches, and the original 1 × 1 convolution is retained.
4. The method as claimed in claim 3, wherein the improved MobileNet v2 network only retains one two-dimensional convolution layer and seven linear bottleneck layers; the GAM takes the last-layer feature map of the MobileNet v2 backbone network as input and reshapes it to C × HW, where the parameters C, W, H respectively denote the number of channels, the width and the height of the feature map; two global attention masks of sizes C × HW and HW × C are extracted by mapping transformations, and the dot product between the two masks, which captures the correlation between features, is used as the input of the normalization function sparsemax, shown in formula (1):

sparsemax_i(z) = max(0, z_i − τ(z))   (1)

where the attention feature map vector is z = [z_1, z_2, …, z_K], z_k denotes the attention feature vector of the k-th channel, and z_(1) ≥ z_(2) ≥ … ≥ z_(K) are the vector values sorted in descending order; the threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)   (2)

where

f(z) = max{ k ∈ {1, …, K} : 1 + k·z_(k) > Σ_{j=1}^{k} z_(j) }   (3)

and where K denotes the total number of channels, j denotes the current channel index, z_(j) and z_(k) denote the sorted attention feature map values at positions j and k, and f(z) denotes the largest index k satisfying the condition in formula (3).
5. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 4, wherein GASPP is computed as follows:

Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X)   (4)

where Z denotes the output of GASPP, GAM(X) denotes the global attention operation, P_k(X) denotes a convolution with kernel size k × k, P_{k,r}(X) denotes an atrous convolution with kernel size k × k and dilation rate r, and ⊙ denotes merging by channel; after all feature maps are concatenated, the concatenated feature map is passed through a 1 × 1 convolution to reduce the number of channels.
6. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 5, wherein in step S4 the low-level features and the multi-scale features are fused by a decoder module SAM incorporating a squeeze-and-excitation network (SENet); after the selective attention calculation, the SAM performs an up-sampling operation that restores the output to the input resolution, from which a pixel distribution map is obtained.
7. The method according to claim 6, wherein the selective attention module in the decoder module with a selective attention mechanism (SAM) is divided into two different branches: one branch comes from the multi-scale aggregated high-level feature information of the GASPP structure network with the global attention feature module; the other branch comes from the detail features of the MobileNet v2 network, with a 1 × 1 convolution layer used to reduce the number of channels.
8. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 7, wherein in step S4 the decoder-fused feature map is decoded by bilinear interpolation up-sampling, which performs linear calculations based on the values of known points.
9. The low-precision image semantic segmentation method based on an improved attention mechanism according to claim 8, wherein the linear calculation is as follows:

f(P) ≈ ((y_2 − y) / (y_2 − y_1)) · f(R_1) + ((y − y_1) / (y_2 − y_1)) · f(R_2)   (5)

where R_1 = (x, y_1) and R_2 = (x, y_2) are intermediate points obtained by first interpolating along the x-axis:

f(R_1) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_11) + ((x − x_1) / (x_2 − x_1)) · f(Q_21)   (6)

f(R_2) ≈ ((x_2 − x) / (x_2 − x_1)) · f(Q_12) + ((x − x_1) / (x_2 − x_1)) · f(Q_22)   (7)

where the four corner points Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1), Q_22 = (x_2, y_2) are known points, P(x, y) is the point to be evaluated, x denotes the x-axis coordinate and y denotes the y-axis coordinate.
10. The method of claim 9, wherein the preprocessing process mainly comprises flipping, rotating, scaling, and cropping.
CN202011521916.3A 2020-12-21 2020-12-21 Low-precision image semantic segmentation method based on improved attention mechanism Pending CN112508960A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011521916.3A CN112508960A (en) 2020-12-21 2020-12-21 Low-precision image semantic segmentation method based on improved attention mechanism

Publications (1)

Publication Number Publication Date
CN112508960A (en) 2021-03-16

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188817A (en) * 2019-05-28 2019-08-30 厦门大学 A kind of real-time high-performance street view image semantic segmentation method based on deep learning
CN110197182A (en) * 2019-06-11 2019-09-03 中国电子科技集团公司第五十四研究所 Remote sensing image semantic segmentation method based on contextual information and attention mechanism
CN111127493A (en) * 2019-11-12 2020-05-08 中国矿业大学 Remote sensing image semantic segmentation method based on attention multi-scale feature fusion
CN111797779A (en) * 2020-07-08 2020-10-20 兰州交通大学 Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095185A (en) * 2021-03-31 2021-07-09 新疆爱华盈通信息技术有限公司 Facial expression recognition method, device, equipment and storage medium
CN113076920A (en) * 2021-04-20 2021-07-06 同济大学 Intelligent fault diagnosis method based on asymmetric domain confrontation self-adaptive model
CN113076920B (en) * 2021-04-20 2022-06-03 同济大学 Intelligent fault diagnosis method based on asymmetric domain confrontation self-adaptive model
CN113469050A (en) * 2021-07-01 2021-10-01 安徽大学 Flame detection method based on image subdivision classification
CN113361537B (en) * 2021-07-23 2022-05-10 人民网股份有限公司 Image semantic segmentation method and device based on channel attention
CN113361537A (en) * 2021-07-23 2021-09-07 人民网股份有限公司 Image semantic segmentation method and device based on channel attention
CN113592026A (en) * 2021-08-13 2021-11-02 大连大学 Binocular vision stereo matching method based on void volume and cascade cost volume
CN113592026B (en) * 2021-08-13 2023-10-03 大连大学 Binocular vision stereo matching method based on cavity volume and cascade cost volume
CN113920411A (en) * 2021-10-09 2022-01-11 成都信息工程大学 Improved SOLOV 2-based campus scene image segmentation method
CN114092815A (en) * 2021-11-29 2022-02-25 自然资源部国土卫星遥感应用中心 Remote sensing intelligent extraction method for large-range photovoltaic power generation facility
CN114092815B (en) * 2021-11-29 2022-04-15 自然资源部国土卫星遥感应用中心 Remote sensing intelligent extraction method for large-range photovoltaic power generation facility
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115205300B (en) * 2022-09-19 2022-12-09 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination