CN112508960A - Low-precision image semantic segmentation method based on improved attention mechanism - Google Patents
- Publication number
- CN112508960A (application number CN202011521916.3A)
- Authority
- CN
- China
- Prior art keywords
- attention
- feature
- network
- low
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
Abstract
The invention discloses a low-precision image semantic segmentation method based on an improved attention mechanism, which comprises the following steps: S1, collecting images under different scenes to compose a data set, and dividing the data set into a training set, a verification set and a test set; S2, performing feature extraction on the preprocessed training set images by using an improved MobileNet v2 network, and performing up-sampling or down-sampling on the resolution of feature maps of different layers; S3, aggregating the multi-scale information of the feature maps after up-sampling or down-sampling in S2 by using a GASPP structure with a global attention feature module; S4, fusing the low-level detail features extracted by the MobileNet v2 backbone network with the multi-scale features obtained by the aggregation in step S3, and fusing the obtained fused features through a decoder module with a selective attention mechanism; and S5, decoding the feature map through bilinear interpolation upsampling to obtain the final segmentation image.
Description
Technical Field
The invention belongs to the field of deep learning and computer vision, and particularly relates to a low-precision image semantic segmentation method based on an improved attention mechanism.
Background
Since the beginning of the 21st century, how to realize intelligent driving has become an increasingly popular topic. Among the common scenarios faced by intelligent vehicles, semantic segmentation is a key technology for identifying different objects in urban roads, such as obstacles, drivable areas and traffic lights. Semantic segmentation is classification at the pixel level: pixels belonging to the same class are grouped into one class, so semantic segmentation understands an image from the pixel level.
Before deep learning methods using convolutional neural networks became mainstream, semantic segmentation methods such as TextonForest and random forest classifiers were widely used. These methods are simple in design and easy to implement, but the feature extraction step is mainly performed manually, and the classification performance is poor.
Deep learning methods have achieved great success in semantic segmentation, and they can be summarized into several approaches to the problem.
In 2014, the Fully Convolutional Network (FCN) emerged. The FCN replaces the fully connected layers of a network with convolutions, making input of any image size possible. First, an RGB image is input into a convolutional neural network, and a series of feature maps is obtained through multiple convolution and pooling operations. Then, a deconvolution layer is used to up-sample the feature map produced by the last convolutional layer so that it matches the original image in size; in this way, a prediction is produced for each pixel while the spatial position information of each pixel value in the original image is preserved. Finally, pixel-by-pixel classification is performed on the up-sampled feature map, and the softmax classification loss is calculated pixel by pixel.
The encoder-decoder is an FCN-based structure. The encoder gradually reduces the spatial dimensions through pooling, while the decoder gradually restores the spatial dimensions and detail information. There is usually also a shortcut connection (i.e., a connection across layers) from the encoder to the decoder.
The dilated/atrous convolution architecture replaces pooling: on the one hand it preserves spatial resolution, and on the other hand it integrates context information well because it enlarges the receptive field.
There is also a method of post-processing the segmentation results, namely Conditional Random Fields (CRFs), to improve the segmentation. The DeepLab series articles basically adopt the post-processing method, and can better improve the segmentation result.
Existing networks such as U-Net and VGG suffer from insufficient real-time performance, while lightweight networks such as the MobileNet series suffer from insufficient accuracy. How to improve accuracy while ensuring the real-time performance of image segmentation is the key problem this method aims to solve.
Disclosure of Invention
The invention aims to provide a low-precision image semantic segmentation method based on an improved attention mechanism, which can improve the image segmentation accuracy in a low-precision network.
The object of the invention is achieved by at least one of the following solutions.
A low-precision image semantic segmentation method based on an improved attention mechanism comprises the following steps:
s1, collecting and preprocessing images in different scenes, labeling the images to form a data set, and dividing the data set into a training set, a verification set and a test set;
s2, performing feature extraction on the preprocessed training set images by using an improved MobileNet v2 network, and performing up-sampling or down-sampling on the resolution of feature images of different layers;
s3, aggregating the multi-scale information of the feature graph after up-sampling or down-sampling in the step S2 by using a GASPP structure network with a global attention feature module;
s4, fusing the low-level detail features extracted by the MobileNet v2 network and the multi-scale features obtained by aggregation in the step S3, and fusing the obtained fused features through a decoder module (SAM) with a selective attention mechanism;
and S5, decoding the feature map through bilinear interpolation upsampling to obtain a final segmentation image.
Preferably, the improved MobileNet v2 network described in step S2 is a MobileNet v2 network with the last three layers deleted.
Preferably, the GASPP structure network with the global attention feature module in step S3 includes the atrous spatial pyramid pooling (ASPP) module with atrous convolution based on DeepLab v3+, and the ASPP module adopts a global average pooling operation;
each branch of the GASPP structure network contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 convolution modules of 3 × 3 are added after each atrous convolution branch, and the original 1 × 1 convolution is retained.
Preferably, the improved MobileNet v2 network retains only one two-dimensional convolution layer and seven linear bottleneck layers. The GAM takes the last-layer feature map of the MobileNet v2 backbone network as input and expands the feature map to size C×HW, where the parameters C, W and H respectively represent the number of channels, the width and the height of the feature map. Two global attention masks, of sizes C×HW and HW×C, are extracted by transformation mapping, and the correlation between features, computed as the dot product of the two global attention masks, is taken as the input of the normalization function Sparsemax, shown in formula (1):

sparsemax_i(z) = max(0, z_i − τ(z)) (1)

where the attention feature map vector is z = [z_1, z_2, …, z_k], z_k represents the attention feature vector of the k-th channel, and the vector values are sorted so that z_(1) ≥ z_(2) ≥ … ≥ z_(k). The threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)

where

f(z) = max{ k′ ∈ {1, …, k} : 1 + k′·z_(k′) > Σ_{j=1}^{k′} z_(j) }

where k represents the total number of channels, j represents the current channel index, z_(j) and z_(k′) respectively represent the sorted attention feature map vector values of the j-th and k′-th channels, and f(z) is the largest index k′ satisfying the above condition, i.e. the number of channels that retain non-zero attention.
Preferably, the GASPP is calculated as follows:
Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X) (2)

where Z represents the output of GASPP, GAM(X) represents the global attention operation, P_k(X) represents a convolution operation with kernel size k × k, P_{3,r}(·) represents a 3 × 3 atrous convolution with dilation rate r, and ⊙ represents channel-wise concatenation. After all feature maps are concatenated, the concatenated feature maps are passed through a 1 × 1 convolution to reduce the number of channels.
Preferably, in step S4, the fusion of the low-level features and the multi-scale features is performed using the decoder module SAM, which includes a squeeze-and-excitation network (SENet); after the selective attention calculation, the SAM performs an up-sampling operation, the output size is restored to the input size, and a pixel distribution map is obtained accordingly.
Preferably, the selective attention module in the decoder module with selective attention mechanism SAM is divided into two different branches, wherein one branch is from the multi-scale aggregation high-level feature information of the GASPP structure network with global attention feature module; the other branch is from the detail feature of the MobileNet v2 network, using a 1 × 1 convolutional layer to reduce the number of channels.
Preferably, in step S4, the decoder-fused feature map is decoded by bilinear interpolation upsampling, where bilinear interpolation performs linear calculations based on the values of known points.
Preferably, the linear calculation is as follows:

f(R1) = ((x2 − x)/(x2 − x1))·f(Q11) + ((x − x1)/(x2 − x1))·f(Q21)
f(R2) = ((x2 − x)/(x2 − x1))·f(Q12) + ((x − x1)/(x2 − x1))·f(Q22)
f(P) = ((y2 − y)/(y2 − y1))·f(R1) + ((y − y1)/(y2 − y1))·f(R2)

where the intermediate points A and B are R1 = (x, y1) and R2 = (x, y2) respectively, with values f(R1) and f(R2) as given above; the coordinate points of the four corners, Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), Q22 = (x2, y2), are known points, P(x, y) is the point to be evaluated, x represents the x-axis coordinate and y represents the y-axis coordinate.
Preferably, the preprocessing process mainly comprises flipping, rotating, scaling and cropping.
Compared with the prior art, the invention has the following beneficial effects:
(1) Aiming at the problem of insufficient semantic segmentation accuracy in low-precision networks, the method designs GASPP, an ASPP structure with global attention information, together with the decoder module SAM, which effectively improves the algorithm's precision.
(2) The method can effectively segment roads in various scenes and suppress noise; it consumes little time and achieves high accuracy in the semantic segmentation of lane pictures, adapts well to environments with blurred lane lines, rain, heavy fog, large area rate and the like, and has practical significance in traffic application scenarios.
Drawings
FIG. 1 is a schematic structural diagram of a low-precision image semantic segmentation method based on an improved attention mechanism according to this embodiment;
fig. 2 is a diagram of a GASPP network model structure according to the embodiment;
fig. 3 is a diagram illustrating the structure of the GAM module according to the present embodiment;
FIG. 4 is a flow chart of a decoder module with selective attention according to the present embodiment;
fig. 5 is a schematic overall flow chart of the low-precision image semantic segmentation method based on the improved attention mechanism according to the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a low-precision image semantic segmentation method based on an improved attention mechanism, and a neural network model structure chart is shown in fig. 1 and mainly comprises a backbone network, a GASPP structure network with a global attention feature module and a decoder module with a selective attention mechanism.
As shown in fig. 5, the low-precision image semantic segmentation method based on the improved attention mechanism of the embodiment includes the following steps:
step 1, collecting images of lanes under different scenes, labeling vehicles, roads, obstacles and the like of the images respectively to form a data set, and processing the data set according to the following steps of 8: 1: the proportion of 1 is divided into a training set, a verification set and a test set, wherein the training set is used for training the deep convolutional network, the verification set is used for selecting an optimal training model, and the test set is used for testing the performance of the design model at the later stage.
The preprocessing process mainly comprises flipping, rotation, scaling, cropping and the like. These operations improve the accuracy of the model, enhance its stability, prevent over-fitting, and increase the fault tolerance of the data set through controlled scale transformations.
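Two of the preprocessing operations above (flipping and cropping) can be illustrated with NumPy; `random_flip_crop` is a hypothetical helper, and rotation and scaling would follow the same pattern:

```python
import numpy as np

def random_flip_crop(img, crop_h, crop_w, rng=None):
    """Randomly flip an image horizontally with probability 0.5,
    then take a random crop of size crop_h x crop_w."""
    if rng is None:
        rng = np.random.default_rng(0)
    if rng.random() < 0.5:
        img = img[:, ::-1]                      # horizontal flip
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)       # random crop origin
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]
```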
Step 2, using an improved MobileNet v2 network to perform feature extraction on the picture preprocessed by the training set, and performing up-sampling or down-sampling on the resolution of feature maps of different layers;
the improved MobileNet v2 network is obtained by improvement and simplification based on MobileNet v2, the original MobileNet v2 network structure comprises a two-dimensional convolution layer, seven linear bottleneck layers, a 1 × 1 two-dimensional convolution layer, a 7 × 7 average pooling layer, a 1 × 1 two-dimensional convolution layer and three deleted layers, the obtained network model parameters are shown in table 2, only one two-dimensional convolution layer and seven linear bottleneck layers are included, the network calculation amount is greatly reduced, and the image segmentation and feature extraction speed is higher.
TABLE 2 network model parameters
The GASPP structure network with the global attention feature module is based on the existing atrous spatial pyramid pooling (ASPP) module with atrous convolution in DeepLab v3+, which adopts a global average pooling operation. Each branch of the GASPP adopted by the invention contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 convolution modules of 3 × 3 are added after each atrous convolution branch, and the original 1 × 1 convolution is retained; the structure is shown in fig. 2.
The calculation formula of GASPP is as follows:
Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X) (1)

where Z represents the output of GASPP, GAM(X) represents the global attention operation, P_k(X) represents a convolution operation with kernel size k × k, P_{3,r}(·) represents a 3 × 3 atrous convolution with dilation rate r, and ⊙ represents channel-wise concatenation. After all feature maps are concatenated, the concatenated feature maps are passed through a 1 × 1 convolution to reduce the number of channels to 128.
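The channel-wise concatenation and 1 × 1 channel-reduction step of formula (1) can be illustrated at the shape level with NumPy (the random projection stands in for learned 1 × 1 convolution weights; `gaspp_merge` is an illustrative name, not from the patent):

```python
import numpy as np

def gaspp_merge(branches, out_channels=128, seed=0):
    """Concatenate branch feature maps along the channel axis, then
    apply a 1x1 convolution (a per-pixel linear map over channels,
    here with random stand-in weights) to reduce the channel count."""
    z = np.concatenate(branches, axis=0)          # (sum of C_i, H, W)
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((out_channels, z.shape[0]))
    # a 1x1 convolution mixes channels independently at each pixel
    return np.einsum('oc,chw->ohw', w, z)

# five branches of 256 channels each, as in the GASPP structure
branches = [np.zeros((256, 8, 8)) for _ in range(5)]
out = gaspp_merge(branches)
```

The concatenated 5 × 256 = 1280 channels are reduced to 128, matching the description above.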
As shown in fig. 3, the GAM takes the last-layer feature map of the improved MobileNet v2 network as input and expands the feature map to size C×HW, where the parameters C, W and H respectively represent the number of channels, the width and the height of the feature map. Global attention masks of sizes C×HW and HW×C are extracted by transformation mapping, and the correlation between features, computed as the dot product between the two global attention masks, is taken as the input of the normalization function Sparsemax, shown in formula (2):

sparsemax_i(z) = max(0, z_i − τ(z)) (2)

where the attention feature map vector is z = [z_1, z_2, …, z_k], z_k represents the attention feature vector of the k-th channel, and the vector values are sorted so that z_(1) ≥ z_(2) ≥ … ≥ z_(k). The threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)

where

f(z) = max{ k′ ∈ {1, …, k} : 1 + k′·z_(k′) > Σ_{j=1}^{k′} z_(j) }

where k represents the total number of channels, j represents the current channel index, z_(j) and z_(k′) respectively represent the sorted attention feature map vector values of the j-th and k′-th channels, and f(z) is the largest index k′ satisfying the above condition, i.e. the number of channels that retain non-zero attention.
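The Sparsemax normalization of formula (2) can be sketched as follows (a NumPy illustration of the standard Sparsemax computation; the patent applies it to flattened attention maps rather than the small vectors used here):

```python
import numpy as np

def sparsemax(z):
    """Sparsemax normalization: sparsemax_i(z) = max(0, z_i - tau(z)).
    Unlike softmax, it projects onto the simplex and can return
    exact zeros, yielding a sparse attention distribution."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]               # sort in descending order
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    # support size: largest k with 1 + k*z_(k) > cumulative sum
    support = ks[1 + ks * z_sorted > cumsum]
    k_z = support[-1]
    tau = (cumsum[k_z - 1] - 1) / k_z         # threshold tau(z)
    return np.maximum(z - tau, 0.0)
```

The output always sums to 1, and dominated entries are driven exactly to zero rather than merely made small.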
The feature maps generated by the two branches (the backbone network and the GASPP) carry different levels of information: the backbone network branch provides rich low-level detail information, while the GASPP branch mainly provides high-level semantic information. The decoder module SAM is used to fuse the low-level features and the multi-scale features. The SAM is an improvement on SENet, and its structure is shown in FIG. 4. The selective attention module in the SAM can be divided into two different branches: one branch comes from the multi-scale aggregated high-level feature information of the GASPP module; the other branch comes from the detail features of the backbone network, where a 1 × 1 convolutional layer is used to reduce the number of channels. The fused features are merged by channel, connected through a global average pooling layer, expanded through a fully connected layer and a ReLU layer, and recalibrated through a fully connected layer and a Sigmoid layer. After the selective attention calculation of the SAM, an up-sampling operation restores the output size to the input size, from which a pixel distribution map is obtained.
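The squeeze-and-excitation style recalibration inside the SAM can be sketched as follows (a minimal NumPy illustration; the weight matrices `w1` and `w2` stand in for learned fully connected layers and are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_recalibrate(feat, w1, w2):
    """SENet-style channel recalibration: global average pooling
    (squeeze), FC + ReLU then FC + Sigmoid (excitation), and
    channel-wise rescaling of the (C, H, W) input feature map."""
    squeeze = feat.mean(axis=(1, 2))          # (C,) global average pool
    hidden = np.maximum(w1 @ squeeze, 0.0)    # FC + ReLU
    scale = sigmoid(w2 @ hidden)              # FC + Sigmoid, in (0, 1)
    return feat * scale[:, None, None]        # recalibrate each channel
```

Each channel of the input is multiplied by a learned importance weight between 0 and 1, so informative channels are emphasized and the rest suppressed.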
The selected feature map is decoded through bilinear interpolation upsampling, formula (4), to obtain the final segmentation image.
The intermediate points A and B are R1 = (x, y1) and R2 = (x, y2) respectively, and the interpolation is calculated as:

f(R1) = ((x2 − x)/(x2 − x1))·f(Q11) + ((x − x1)/(x2 − x1))·f(Q21)
f(R2) = ((x2 − x)/(x2 − x1))·f(Q12) + ((x − x1)/(x2 − x1))·f(Q22)
f(P) = ((y2 − y)/(y2 − y1))·f(R1) + ((y − y1)/(y2 − y1))·f(R2) (4)

where the coordinate points of the four corners, Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), Q22 = (x2, y2), are known points, P(x, y) is the point to be evaluated, x represents the x-axis coordinate and y represents the y-axis coordinate.
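The bilinear interpolation used for upsampling can be written directly in Python (a minimal sketch; the function name and argument order are illustrative):

```python
def bilinear_interp(q11, q12, q21, q22, x1, x2, y1, y2, x, y):
    """Interpolate the value at P(x, y) from the four known corner
    values Q11=(x1,y1), Q12=(x1,y2), Q21=(x2,y1), Q22=(x2,y2):
    two linear interpolations along x (points R1, R2), then one
    along y between them."""
    # interpolate along x at y1 and y2 (intermediate points R1 and R2)
    r1 = q11 * (x2 - x) / (x2 - x1) + q21 * (x - x1) / (x2 - x1)
    r2 = q12 * (x2 - x) / (x2 - x1) + q22 * (x - x1) / (x2 - x1)
    # interpolate along y between R1 and R2
    return r1 * (y2 - y) / (y2 - y1) + r2 * (y - y1) / (y2 - y1)
```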
Step 3, setting and adjusting the parameters of the low-precision semantic segmentation network model based on the improved attention mechanism. The GPU used in the method is a GTX 2080Ti, and in consideration of picture resolution, the input picture size is set to 512 × 1024.
To enlarge the data set when training the model, the RGB channels of the input original picture are first normalized by mean and variance, and augmentation such as random scaling in the range of 0.5 to 2.0 and random horizontal flipping is adopted during training. During testing, no random horizontal flipping, random cropping or similar operations are applied to the test image; the image is fed into the network model after mean subtraction.
The network adopts the existing Poly learning rate strategy, which does not use a fixed step-size parameter; instead, the learning rate decays from the initial learning rate by an attenuation factor at each iteration, and the calculation formula is as follows:

lr = lr_base × (1 − epoch / max_epoch)^power

where epoch represents the current iteration cycle in the training process, max_epoch represents the maximum number of iteration cycles, the initial learning rate lr_base is set to 0.01, and the exponent coefficient power is set to 0.9.
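The Poly schedule above can be sketched as a one-line Python function with the stated defaults lr_base = 0.01 and power = 0.9:

```python
def poly_lr(epoch, max_epoch, lr_base=0.01, power=0.9):
    """Poly learning-rate schedule: lr decays from lr_base to 0
    as epoch approaches max_epoch, following (1 - t)^power."""
    return lr_base * (1 - epoch / max_epoch) ** power
```

The rate starts at lr_base, decreases monotonically, and reaches 0 at the final epoch.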
the embodiment adopts an ASPP structure GASPP with global attention information and a decoder module SAM, thereby effectively improving the algorithm precision. The method can effectively segment roads under various scenes and inhibit noise, consumes less time and has high accuracy for semantic segmentation of road pictures, has better adaptability in the environments of fuzzy roads, rainy days, heavy fog, large area rate and the like, and has practical significance in traffic application scenes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and these modifications and improvements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A low-precision image semantic segmentation method based on an improved attention mechanism is characterized by comprising the following steps:
s1, collecting and preprocessing images in different scenes, labeling the images to form a data set, and dividing the data set into a training set, a verification set and a test set;
s2, performing feature extraction on the preprocessed training set images by using an improved MobileNet v2 network, and performing up-sampling or down-sampling on the resolution of feature images of different layers;
s3, aggregating the multi-scale information of the feature graph after up-sampling or down-sampling in the step S2 by using a GASPP structure network with a global attention feature module;
s4, fusing the low-level detail features extracted by the MobileNet v2 network and the multi-scale features obtained by aggregation in the step S3, and fusing the obtained fused features through a decoder module (SAM) with a selective attention mechanism;
and S5, decoding the feature map through bilinear interpolation upsampling to obtain a final segmentation image.
2. The method for semantic segmentation of low-precision images based on the improved attention mechanism as claimed in claim 1, wherein the improved MobileNet v2 network in step S2 is a MobileNet v2 network with the last three layers deleted.
3. The improved attention mechanism-based low-precision image semantic segmentation method according to claim 2, wherein the GASPP structure network with the global attention feature module in step S3 includes the atrous spatial pyramid pooling (ASPP) module with atrous convolution based on DeepLab v3+, and the ASPP module adopts a global average pooling operation;
each branch of the GASPP structure network contains 256 channels, a global attention mechanism module (GAM) is introduced, 3 convolution modules of 3 × 3 are added after each atrous convolution branch, and the original 1 × 1 convolution is retained.
4. The method as claimed in claim 3, wherein the improved MobileNet v2 network retains only one two-dimensional convolution layer and seven linear bottleneck layers; the GAM takes the last-layer feature map of the MobileNet v2 backbone network as input and expands the feature map to size C×HW, where the parameters C, W and H respectively represent the number of channels, the width and the height of the feature map; two global attention masks, of sizes C×HW and HW×C, are extracted by transformation mapping, and the correlation between features, computed as the dot product between the two global attention masks, is taken as the input of the normalization function Sparsemax, shown in formula (1):

sparsemax_i(z) = max(0, z_i − τ(z)) (1)

where the attention feature map vector is z = [z_1, z_2, …, z_k], z_k represents the attention feature vector of the k-th channel, and the vector values are sorted so that z_(1) ≥ z_(2) ≥ … ≥ z_(k); the threshold τ(z) is:

τ(z) = (Σ_{j=1}^{f(z)} z_(j) − 1) / f(z)

where

f(z) = max{ k′ ∈ {1, …, k} : 1 + k′·z_(k′) > Σ_{j=1}^{k′} z_(j) }

where k represents the total number of channels, j represents the current channel index, z_(j) and z_(k′) respectively represent the sorted attention feature map vector values of the j-th and k′-th channels, and f(z) is the largest index k′ satisfying the above condition.
5. The method for semantic segmentation of low-precision images based on the improved attention mechanism as claimed in claim 4, wherein the GASPP is calculated as follows:

Z = GAM(X) ⊙ P_{3,6}(P_3(X)) ⊙ P_{3,12}(P_5(X)) ⊙ P_{3,18}(P_7(X)) ⊙ P_1(X) (2)

where Z represents the output of GASPP, GAM(X) represents the global attention operation, P_k(X) represents a convolution operation with kernel size k × k, P_{3,r}(·) represents a 3 × 3 atrous convolution with dilation rate r, and ⊙ represents channel-wise concatenation; after all feature maps are concatenated, the concatenated feature maps are passed through a 1 × 1 convolution to reduce the number of channels.
6. The method for semantic segmentation of low-precision images based on the improved attention mechanism as claimed in claim 5, wherein in step S4 the low-level features and the multi-scale features are fused using the decoder module SAM, which includes a squeeze-and-excitation network (SENet); after the selective attention calculation is completed, the SAM performs an up-sampling operation, the output size is restored to the input size, and a pixel distribution map is obtained accordingly.
7. The method according to claim 6, wherein the selective attention module in the decoder module with selective attention mechanism SAM is divided into two different branches, one branch being from the multi-scale aggregation high-level feature information of the GASPP structure network with global attention feature module; the other branch is from the detail feature of the MobileNet v2 network, using a 1 × 1 convolutional layer to reduce the number of channels.
8. The method for semantic segmentation of low-precision images based on the improved attention mechanism as claimed in claim 7, wherein in step S4 the decoder-fused feature map is decoded by bilinear interpolation upsampling, and bilinear interpolation performs linear calculations based on the values of known points.
9. The method for semantically segmenting the low-precision image based on the improved attention mechanism as claimed in claim 8, wherein the linear calculation is as follows:
f(R1) = ((x2 − x)/(x2 − x1))·f(Q11) + ((x − x1)/(x2 − x1))·f(Q21)
f(R2) = ((x2 − x)/(x2 − x1))·f(Q12) + ((x − x1)/(x2 − x1))·f(Q22)
f(P) = ((y2 − y)/(y2 − y1))·f(R1) + ((y − y1)/(y2 − y1))·f(R2)

where the intermediate points A and B are R1 = (x, y1) and R2 = (x, y2) respectively, with values f(R1) and f(R2) as given above; the coordinate points of the four corners, Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), Q22 = (x2, y2), are known points, P(x, y) is the point to be evaluated, x represents the x-axis coordinate and y represents the y-axis coordinate.
10. The method of claim 9, wherein the preprocessing process mainly comprises flipping, rotating, scaling, and cropping.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011521916.3A CN112508960A (en) | 2020-12-21 | 2020-12-21 | Low-precision image semantic segmentation method based on improved attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011521916.3A CN112508960A (en) | 2020-12-21 | 2020-12-21 | Low-precision image semantic segmentation method based on improved attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112508960A true CN112508960A (en) | 2021-03-16 |
Family
ID=74922878
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011521916.3A Pending CN112508960A (en) | 2020-12-21 | 2020-12-21 | Low-precision image semantic segmentation method based on improved attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112508960A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188817A (en) * | 2019-05-28 | 2019-08-30 | 厦门大学 | A kind of real-time high-performance street view image semantic segmentation method based on deep learning |
CN110197182A (en) * | 2019-06-11 | 2019-09-03 | 中国电子科技集团公司第五十四研究所 | Remote sensing image semantic segmentation method based on contextual information and attention mechanism |
CN111127493A (en) * | 2019-11-12 | 2020-05-08 | 中国矿业大学 | Remote sensing image semantic segmentation method based on attention multi-scale feature fusion |
CN111797779A (en) * | 2020-07-08 | 2020-10-20 | 兰州交通大学 | Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion |
Worldwide applications
2020-12-21 | CN | CN202011521916.3A (CN112508960A) | Pending
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113095185A (en) * | 2021-03-31 | 2021-07-09 | 新疆爱华盈通信息技术有限公司 | Facial expression recognition method, device, equipment and storage medium |
CN113076920A (en) * | 2021-04-20 | 2021-07-06 | 同济大学 | Intelligent fault diagnosis method based on asymmetric domain confrontation self-adaptive model |
CN113076920B (en) * | 2021-04-20 | 2022-06-03 | 同济大学 | Intelligent fault diagnosis method based on asymmetric domain confrontation self-adaptive model |
CN113469050A (en) * | 2021-07-01 | 2021-10-01 | 安徽大学 | Flame detection method based on image subdivision classification |
CN113361537B (en) * | 2021-07-23 | 2022-05-10 | 人民网股份有限公司 | Image semantic segmentation method and device based on channel attention |
CN113361537A (en) * | 2021-07-23 | 2021-09-07 | 人民网股份有限公司 | Image semantic segmentation method and device based on channel attention |
CN113592026A (en) * | 2021-08-13 | 2021-11-02 | 大连大学 | Binocular vision stereo matching method based on void volume and cascade cost volume |
CN113592026B (en) * | 2021-08-13 | 2023-10-03 | 大连大学 | Binocular vision stereo matching method based on cavity volume and cascade cost volume |
CN113920411A (en) * | 2021-10-09 | 2022-01-11 | 成都信息工程大学 | Improved SOLOv2-based campus scene image segmentation method |
CN114092815A (en) * | 2021-11-29 | 2022-02-25 | 自然资源部国土卫星遥感应用中心 | Remote sensing intelligent extraction method for large-range photovoltaic power generation facility |
CN114092815B (en) * | 2021-11-29 | 2022-04-15 | 自然资源部国土卫星遥感应用中心 | Remote sensing intelligent extraction method for large-range photovoltaic power generation facility |
CN115205300A (en) * | 2022-09-19 | 2022-10-18 | 华东交通大学 | Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion |
CN115205300B (en) * | 2022-09-19 | 2022-12-09 | 华东交通大学 | Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112508960A (en) | Low-precision image semantic segmentation method based on improved attention mechanism | |
CN111259905B (en) | Feature fusion remote sensing image semantic segmentation method based on downsampling | |
CN110276354B (en) | High-resolution streetscape picture semantic segmentation training and real-time segmentation method | |
CN112396607B (en) | Deformable convolution fusion enhanced street view image semantic segmentation method | |
CN111539887B (en) | Channel attention mechanism and layered learning neural network image defogging method based on mixed convolution | |
CN111563909B (en) | Semantic segmentation method for complex street view image | |
CN111462013B (en) | Single-image rain removing method based on structured residual learning | |
CN111126359A (en) | High-definition image small target detection method based on self-encoder and YOLO algorithm | |
CN111899169B (en) | Method for segmenting network of face image based on semantic segmentation | |
CN110717921B (en) | Full convolution neural network semantic segmentation method of improved coding and decoding structure | |
CN111583384A (en) | Hair reconstruction method based on adaptive octree hair convolutional neural network | |
CN110706239A (en) | Scene segmentation method fusing full convolution neural network and improved ASPP module | |
CN113066089B (en) | Real-time image semantic segmentation method based on attention guide mechanism | |
CN111401379A (en) | DeepLabv3plus-IRCNet image semantic segmentation algorithm based on coding and decoding structure | |
CN113160062A (en) | Infrared image target detection method, device, equipment and storage medium | |
CN112819000A (en) | Streetscape image semantic segmentation system, streetscape image semantic segmentation method, electronic equipment and computer readable medium | |
CN111652081A (en) | Video semantic segmentation method based on optical flow feature fusion | |
CN111832453A (en) | Unmanned scene real-time semantic segmentation method based on double-path deep neural network | |
CN116486074A (en) | Medical image segmentation method based on local and global context information coding | |
CN112884893A (en) | Cross-view-angle image generation method based on asymmetric convolutional network and attention mechanism | |
CN114782949B (en) | Traffic scene semantic segmentation method for boundary guide context aggregation | |
CN113052776A (en) | Unsupervised image defogging method based on multi-scale depth image prior | |
CN115527096A (en) | Small target detection method based on improved YOLOv5 | |
CN116503709A (en) | Vehicle detection method based on improved YOLOv5 in haze weather | |
CN116486080A (en) | Lightweight image semantic segmentation method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||