CN112241959A - Attention mechanism generation semantic segmentation method based on superpixels - Google Patents

Attention mechanism generation semantic segmentation method based on superpixels

Info

Publication number: CN112241959A
Authority: CN (China)
Prior art keywords: pixel, attention mechanism, superpixel, channel, pooling
Legal status: Pending (an assumed status, not a legal conclusion)
Application number: CN202011011881.9A
Original language: Chinese (zh)
Inventors: 李亮 (Li Liang), 李亚军 (Li Yajun), 王凯 (Wang Kai), 彭俊杰 (Peng Junjie)
Current assignee: Tianjin University
Original assignee: Tianjin University
Filing date / priority date: 2020-09-23
Publication date: 2021-01-19
Application filed by Tianjin University

Classifications

    • G06T 7/11 Region-based segmentation (image analysis: segmentation; edge detection)
    • G06F 18/22 Matching criteria, e.g. proximity measures (pattern recognition)
    • G06F 18/23 Clustering techniques (pattern recognition)
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (pattern recognition)
    • G06F 18/25 Fusion techniques (pattern recognition)
    • G06N 3/045 Combinations of networks (neural network architectures)
    • G06N 3/08 Learning methods (neural networks)
    • G06T 2207/10004 Still image; photographic image (indexing scheme for image analysis: image acquisition modality)


Abstract

The invention relates to deep learning technology and semantic segmentation, and provides a semantic segmentation generation method with low computational cost. In the adopted technical scheme, semantic segmentation is generated by a superpixel-based attention mechanism: the similarity originally computed between each pixel and all other pixels is instead computed between each pixel and all superpixels; the results of spatial attention encoding and channel attention encoding are fused to produce the final semantic segmentation. The method is mainly applied to semantic segmentation scenarios.

Description

Attention mechanism generation semantic segmentation method based on superpixels
Technical Field
The invention relates to deep learning technology, in particular to superpixels and the attention mechanism in deep learning, and completes semantic segmentation by combining the characteristics of the two.
Background
Semantic segmentation is a fundamental task in computer vision whose purpose is to classify the pixels in an image, assigning a class label to each pixel. The image segmentation problem has attracted growing interest in computer vision in recent years, and more and more application scenarios, such as autonomous driving, virtual reality, and intelligent robotics, require accurate and efficient segmentation techniques.
The earliest successful deep learning technique applied to semantic segmentation was the fully convolutional network (FCN). It uses a convolutional neural network as the basic feature-extraction framework and converts a classification network model (such as VGG-16) into a fully convolutional model: the fully connected layers are converted into convolutional layers to generate dense pixel-level features, and high-level and low-level semantic features are then combined to generate pixel-level labels. This work is seen as a landmark, showing how a CNN (convolutional neural network) can be trained end-to-end for semantic segmentation. In subsequent work, dilated (atrous) convolution and multi-scale methods were adopted to obtain contextual semantic information, greatly improving segmentation accuracy.
The 2017 paper Non-local Neural Networks (Wang et al.) proposed a self-attention method to obtain global context information. The self-attention mechanism computes the similarity between each pixel vector and all other pixels, so that context information over the whole image is introduced at each local position. This greatly improves the accuracy of semantic segmentation, but it also introduces a new problem: computing semantic information between every position and all other positions to generate the attention map greatly increases the computation of the network, and this is the problem the present invention sets out to solve.
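To make the saving concrete, compare the size of the attention map in the two cases (the feature-map and superpixel counts below are illustrative assumptions, not figures from the patent):

```latex
% Pixel-to-pixel attention vs. pixel-to-superpixel attention (illustrative):
\mathcal{O}(N^{2}C) \;\longrightarrow\; \mathcal{O}(NKC),
\qquad N = H \times W,\quad K \ll N
% Example: a 96 x 96 feature map gives N = 9216; with K = 256 superpixels,
% the attention map shrinks from N^2 = 84{,}934{,}656 entries to
% NK = 2{,}359{,}296, a 36x reduction.
```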
Disclosure of Invention
In order to overcome the defects of the prior art and solve the problem of excessive network computation, the invention aims to provide a semantic segmentation generation method with low computational cost. The technical scheme adopted is a superpixel-based attention mechanism for generating semantic segmentation: the similarity originally computed between each pixel and all other pixels is instead computed between each pixel and all superpixels; the results of spatial attention encoding and channel attention encoding are then fused to finally generate the semantic segmentation.
The method comprises the following specific steps:
step 1, feature extraction: features are extracted with the residual network ResNet-101; the network has 101 layers in total, with a stride-2 convolution or pooling structure in layers 1, 2, and 7, so that the final feature map is 1/8 the size of the original image;
step 2, superpixel embedding: superpixels are generated with the simple linear iterative clustering (SLIC) method, a superpixel layer is embedded into the residual network ResNet structure, the feature map is pooled through the superpixel layer to obtain superpixel features, and the pooled superpixel features are embedded into the attention network;
step 3, attention mechanism: the attention mechanism is divided into a spatial attention mechanism and a channel attention mechanism; the spatial attention mechanism obtains global context information by computing the similarity between each pixel vector and the feature vectors at all other positions; the channel attention mechanism obtains semantic information among channels by computing the similarity between channels; the results of the spatial and channel attention mechanisms are then fused to finally obtain the semantic segmentation result;
in step 2, superpixel embedding: superpixels are generated with the simple linear iterative clustering algorithm SLIC, a superpixel layer is embedded into the ResNet structure, and the feature map is pooled through the superpixel layer to obtain a feature vector $g_i$ corresponding to each superpixel; the feature vector is the average pooling of the region corresponding to the superpixel:

$$g_i = \frac{1}{S_i} \sum_{k=1}^{S_i} x_k^i \tag{1}$$

where $x_k^i$ denotes the kth feature vector in the ith superpixel region and $S_i$ denotes the number of pixels in the ith superpixel region; this pooling operation is called superpixel pooling;
in step 3:

the spatial attention mechanism is as follows:

first, a feature map $A \in \mathbb{R}^{C\times H\times W}$ is acquired through the ResNet-101 network; A is then fed into three 1×1 convolutional layers to obtain three new feature maps B, C, D, where $\{B, C, D\} \in \mathbb{R}^{C\times H\times W}$; these are reshaped to $\mathbb{R}^{C\times N}$, where $N = H \times W$; B and D are input to the superpixel pooling layer to obtain $v$ and $\theta$ respectively, where $\{v, \theta\} \in \mathbb{R}^{K\times C}$ and K is the number of superpixels on each map; a spatial attention matrix $S \in \mathbb{R}^{N\times K}$ is then computed by applying a normalized softmax layer:

$$s_{ij} = \frac{\exp(v_i \cdot C_j)}{\sum_{i=1}^{K} \exp(v_i \cdot C_j)} \tag{2}$$

where $v_i$ is the pooled feature of the ith superpixel from equation (1) and $C_j$ is the jth pixel in the feature map; $s_{ij}$ is the similarity between the jth pixel and the ith superpixel. Equation (2) yields an attention map S of size $\mathbb{R}^{N\times K}$, where N is the number of pixels and K the number of superpixels in the feature map, so S encodes the similarity between each pixel and the pooled feature of each superpixel. The feature vectors are then weighted by the computed similarities and added to the corresponding pixel positions, so that each pixel vector obtains, through this weighting, semantic information from the whole space:

$$E_j = \alpha \sum_{i=1}^{K} s_{ij}\,\theta_i + A_j \tag{3}$$

where α is a learnable weight parameter initialized to 0; the final output E aggregates global semantic information;
the channel attention mechanism is as follows:
first, the initial feature map $A \in \mathbb{R}^{C\times H\times W}$ is taken as the input of the channel attention module, and $v \in \mathbb{R}^{C\times K}$ is obtained through the superpixel pooling layer; a channel attention map $X \in \mathbb{R}^{C\times C}$ is then computed by matrix multiplication:

$$x_{ij} = \frac{\exp(v_i \cdot v_j)}{\sum_{j=1}^{C} \exp(v_i \cdot v_j)} \tag{4}$$

where $x_{ij}$ is the similarity between the ith and jth channels of $v$; the computed inter-channel similarities are used to weight the corresponding channels, which are then accumulated onto each local channel, so that every channel obtains information from all other channels, giving the final output $D \in \mathbb{R}^{C\times H\times W}$:

$$D_j = \beta \sum_{i=1}^{C} x_{ij} A_i + A_j \tag{5}$$

where β is a learnable weight parameter analogous to α. The spatial attention feature map obtained by equation (3) and the channel attention feature map obtained by equation (5) are fused to finally obtain the semantic segmentation map.
The characteristics and beneficial effects of the invention are as follows:
The invention proposes an attention network based on superpixel pooling to generate semantic segmentation; the proposed network reduces the computation of semantic segmentation and improves its speed.
description of the drawings:
FIG. 1 is a schematic diagram of a network architecture according to the present invention.
FIG. 2 shows the attention modules of the present invention: (a) the spatial attention mechanism; (b) the channel attention mechanism.
FIG. 3 shows example results of the present invention: (a) original image; (b) semantic segmentation result; (c) original image; (d) semantic segmentation result.
Detailed Description
In order to solve the problem of excessive network computation, the invention provides an attention module based on superpixel pooling. Superpixels are regions of spatially adjacent pixels with similar color, texture, and brightness. Since the pixels within each superpixel region are similar, we use a superpixel pooling layer to average the features within each superpixel region, pooling the number of features per superpixel from n down to 1. When the attention map is then computed, the similarity between each pixel and all other pixels becomes the similarity between each pixel and all superpixels. This greatly reduces the complexity of the network without affecting its accuracy.
The invention provides a deeply embedded network for end-to-end semantic segmentation. Its contributions are: first, a superpixel-based attention architecture that yields an end-to-end trainable model; second, an attention computation split into two steps: 1. superpixel pooling to obtain the semantic information within each superpixel; 2. computing the similarity between each pixel and the superpixels, thereby obtaining global information. The method comprises the following steps:
step 1:
the basic network architecture of the present invention adopts a structure of ResNet-101. An identity connection structure is adopted in the residual error network to relieve the problem of gradient disappearance in the deep neural network. ResNet-101 is a model adopting a 101-layer residual error network;
ResNet-101 network structure:
1. 7×7 conv, 64 channels, stride 2
2. 3×3 max pool, stride 2
3. [1×1 conv, 64; 3×3 conv, 64; 1×1 conv, 256] × 3
4. [1×1 conv, 128; 3×3 conv, 128; 1×1 conv, 512] × 4
5. [1×1 conv, 256; 3×3 conv, 256; 1×1 conv, 1024] × 23
6. [1×1 conv, 512; 3×3 conv, 512; 1×1 conv, 2048] × 3
7. average pool, stride 2
This gives 101 layers in total; a stride-2 convolution or pooling structure is used in layers 1, 2, and 7, and the resulting feature map is 1/8 the size of the original image.
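For illustration only, a minimal PyTorch sketch of such a stride-8 feature extractor (the patent names no framework; `replace_stride_with_dilation` is torchvision's standard way to keep the last two stages at 1/8 resolution, and the final average-pooling layer is omitted here because dense per-pixel features are needed):

```python
# Sketch of a stride-8 ResNet-101 feature extractor (assumed PyTorch/torchvision
# setting; not the patent's reference implementation).
import torch
import torchvision

backbone = torchvision.models.resnet101(
    weights=None,  # plug in ImageNet weights in practice
    replace_stride_with_dilation=[False, True, True],
)

def extract_features(x: torch.Tensor) -> torch.Tensor:
    """Return the 2048-channel feature map at 1/8 of the input resolution."""
    x = backbone.conv1(x)    # 7x7 conv, stride 2
    x = backbone.bn1(x)
    x = backbone.relu(x)
    x = backbone.maxpool(x)  # 3x3 max pool, stride 2
    x = backbone.layer1(x)
    x = backbone.layer2(x)   # stride 2: overall stride is now 8
    x = backbone.layer3(x)   # dilated, stride kept at 8
    x = backbone.layer4(x)   # dilated, stride kept at 8
    return x

feats = extract_features(torch.randn(1, 3, 512, 512))
print(feats.shape)  # torch.Size([1, 2048, 64, 64])
```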
Step 2:
embedding the super-pixels: the method uses a slic (simple linear clustering algorithm) method to generate the super pixels, then embeds a super pixel layer into a ResNet network structure, performs pooling on a feature map through the super pixel layer, and then obtains a feature vector corresponding to each super pixel
Figure BDA0002697801160000045
The feature vector is an average pooling performed by the region corresponding to the superpixel:
Figure BDA0002697801160000046
wherein
Figure BDA0002697801160000047
Representing the kth feature vector, S, in the ith super-pixel regioniIndicating the number of pixels in the ith super-pixel region; this pooling operation is referred to as superpixel pooling.
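As a minimal sketch of equation (1), assuming scikit-image for SLIC and PyTorch for the pooling (the function and variable names below are ours, not the patent's):

```python
# Sketch of SLIC superpixel generation plus the superpixel average pooling of
# equation (1).
import numpy as np
import torch
from skimage.segmentation import slic

def superpixel_pool(feats: torch.Tensor, labels: torch.Tensor, K: int) -> torch.Tensor:
    """Average a (C, H, W) feature map inside each of K superpixel regions,
    returning the (K, C) pooled vectors g_i of equation (1)."""
    C = feats.shape[0]
    flat = feats.reshape(C, -1).t()                    # (N, C), N = H*W
    idx = labels.reshape(-1)                           # (N,) region index per pixel
    sums = torch.zeros(K, C).index_add_(0, idx, flat)  # feature sum per region
    counts = torch.zeros(K).index_add_(0, idx, torch.ones_like(idx, dtype=torch.float))
    return sums / counts.clamp(min=1).unsqueeze(1)     # divide by S_i

# Usage: run SLIC on the image, bring the label map to the 1/8-resolution
# feature map, then pool. Here both are 64x64 for simplicity.
image = np.random.rand(64, 64, 3)
labels = torch.from_numpy(slic(image, n_segments=256, compactness=10, start_label=0))
feats = torch.randn(512, 64, 64)
g = superpixel_pool(feats, labels, K=int(labels.max()) + 1)
print(g.shape)  # torch.Size([K, 512])
```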
Step 3:
calculating a self-attention mechanism: in the conventional method, global context information is obtained by calculating the similarity between each pixel vector and the pixel vectors of all other positions. However, although the context information obtained by this method can obtain accurate results, it takes a lot of computation time and consumes much GPU memory. Our proposed superpixel-based computation method improves computational efficiency without sacrificing final accuracy.
The spatial attention mechanism is as follows:
the spatial attention mechanism encodes global context semantic information to the pixels of each location, which enhances the representation capability of the semantic information. Firstly, a characteristic diagram A e R acquired through a ResNet-101 networkC×H×WThen, a is input into three 1 × 1 convolutional layers to obtain three new feature maps B, C, D. Wherein { B, C, D }. belongs to RC×H×W. Then convert them to RC×NWherein N ═ hxw. We input B and D to the superpixel pooling layer to get v and θ, respectively, where { v, θ }. epsilon.RK×CAnd K represents the number of superpixels on each map. Then we compute a spatial attention moment map S e R by applying a softmax (normalized) layerN×K
Figure BDA0002697801160000051
Wherein viRepresents the feature obtained by pooling the ith super pixel in equation 1, and CjRepresenting the jth pixel in the feature map;
Sijrepresenting the similarity between the jth pixel and the ith super pixel. Therefore, the effect of obtaining the global context information can be achieved by calculating each pixel vector and all the super pixel vectors on the feature map, and meanwhile, the time complexity of calculation can be greatly reduced.
By the formula 2, we finally obtain an attention diagram S, wherein S is RN×KWhere N represents the number of pixels in the feature map and K represents the number of superpixels in the feature map, so S represents the pooled feature of each pixel and each superpixelThe similarity of (2); meanwhile, the feature vectors are weighted according to the calculated similarity and added to the corresponding pixel positions, so that each pixel vector can obtain semantic information of all spaces through weighting:
Figure BDA0002697801160000052
where α is a weight parameter initialized to 0 and used for learning; finally, the output E obtained by us gathers global semantic information.
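A sketch of how equations (2) and (3) could be realized, reusing the superpixel_pool function above (assumed PyTorch implementation with our own module names; batch size 1 for clarity):

```python
# Sketch of the superpixel spatial attention branch, equations (2) and (3).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuperpixelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduced: int):
        super().__init__()
        self.proj_b = nn.Conv2d(channels, reduced, 1)   # B: pooled into v
        self.proj_c = nn.Conv2d(channels, reduced, 1)   # C: per-pixel queries
        self.proj_d = nn.Conv2d(channels, channels, 1)  # D: pooled into theta
        self.alpha = nn.Parameter(torch.zeros(1))       # weight initialized to 0

    def forward(self, a: torch.Tensor, labels: torch.Tensor, K: int) -> torch.Tensor:
        _, c, h, w = a.shape
        v = superpixel_pool(self.proj_b(a)[0], labels, K)      # (K, reduced)
        theta = superpixel_pool(self.proj_d(a)[0], labels, K)  # (K, C)
        q = self.proj_c(a)[0].reshape(-1, h * w).t()           # (N, reduced)
        s = F.softmax(q @ v.t(), dim=1)                        # (N, K), eq. (2)
        e = (s @ theta).t().reshape(1, c, h, w)                # weighted theta per pixel
        return self.alpha * e + a                              # eq. (3)
```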
The channel attention mechanism is as follows:
in semantic segmentation, channels can be regarded as responses to each class feature, and by exploring interdependencies among the channels, the characteristics of the interdependencies can be highlighted to improve the expression of semantic features. Therefore, a channel attention mechanism is constructed to acquire semantic information between channels.
First, we take the initial feature map $A \in \mathbb{R}^{C\times H\times W}$ as the input of the channel attention module and obtain $v \in \mathbb{R}^{C\times K}$ through the superpixel pooling layer; a channel attention map $X \in \mathbb{R}^{C\times C}$ is then computed by matrix multiplication:

$$x_{ij} = \frac{\exp(v_i \cdot v_j)}{\sum_{j=1}^{C} \exp(v_i \cdot v_j)} \tag{4}$$

where $x_{ij}$ is the similarity between the ith and jth channels of $v$. The computed inter-channel similarities are used to weight the corresponding channels, which are then accumulated onto each local channel, so that every channel obtains information from all other channels. The final output is $D \in \mathbb{R}^{C\times H\times W}$:

$$D_j = \beta \sum_{i=1}^{C} x_{ij} A_i + A_j \tag{5}$$

where β is a learnable weight parameter analogous to α.
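A matching sketch of equations (4) and (5), under the same assumptions and continuing the imports above:

```python
# Sketch of the superpixel channel attention branch, equations (4) and (5);
# beta plays the same role as alpha in the spatial branch.
class SuperpixelChannelAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))  # learnable weight, initialized to 0

    def forward(self, a: torch.Tensor, labels: torch.Tensor, K: int) -> torch.Tensor:
        _, c, h, w = a.shape
        v = superpixel_pool(a[0], labels, K).t()  # (C, K) pooled map
        x = F.softmax(v @ v.t(), dim=1)           # (C, C) channel similarities, eq. (4)
        out = (x @ a[0].reshape(c, -1)).reshape(1, c, h, w)  # weight channels by similarity
        return self.beta * out + a                # eq. (5)
```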
The spatial attention feature map obtained by equation (3) and the channel attention feature map obtained by equation (5) are fused to finally obtain the semantic segmentation map.
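Putting the two branches together; the patent does not specify the fusion operator, so this sketch assumes a simple element-wise sum followed by a 1×1 classifier:

```python
# Sketch of the fusion of the two attention outputs (assumed design, for
# illustration only).
class SuperpixelAttentionHead(nn.Module):
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.spatial = SuperpixelSpatialAttention(channels, reduced=channels // 8)
        self.channel = SuperpixelChannelAttention()
        self.classify = nn.Conv2d(channels, num_classes, 1)

    def forward(self, a: torch.Tensor, labels: torch.Tensor, K: int) -> torch.Tensor:
        fused = self.spatial(a, labels, K) + self.channel(a, labels, K)
        return self.classify(fused)  # per-pixel scores; upsample 8x for the final map
```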
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (3)

1. A superpixel-based attention mechanism semantic segmentation generation method, characterized in that the similarity originally computed between each pixel and all other pixels is computed between each pixel and all superpixels; and the results of spatial attention encoding and channel attention encoding are fused to finally generate the semantic segmentation.
2. The superpixel-based attention mechanism semantic segmentation generation method as claimed in claim 1, characterized by comprising the following steps:
step 1, feature extraction: features are extracted with the residual network ResNet-101; the network has 101 layers in total, with a stride-2 convolution or pooling structure in layers 1, 2, and 7, so that the final feature map is 1/8 the size of the original image;
step 2, superpixel embedding: superpixels are generated with the simple linear iterative clustering (SLIC) method, a superpixel layer is embedded into the residual network ResNet structure, the feature map is pooled through the superpixel layer to obtain superpixel features, and the pooled superpixel features are embedded into the attention network;
step 3, attention mechanism: the attention mechanism is divided into a spatial attention mechanism and a channel attention mechanism; the spatial attention mechanism obtains global context information by computing the similarity between each pixel vector and the feature vectors at all other positions; the channel attention mechanism obtains semantic information among channels by computing the similarity between channels; the results of the spatial and channel attention mechanisms are then fused to finally obtain the semantic segmentation result.
3. The superpixel-based attention mechanism semantic segmentation generation method as claimed in claim 1, wherein in step 2, superpixel embedding: superpixels are generated with the simple linear iterative clustering algorithm SLIC, a superpixel layer is embedded into the ResNet structure, and the feature map is pooled through the superpixel layer to obtain a feature vector $g_i$ corresponding to each superpixel; the feature vector is the average pooling of the region corresponding to the superpixel:

$$g_i = \frac{1}{S_i} \sum_{k=1}^{S_i} x_k^i \tag{1}$$

where $x_k^i$ denotes the kth feature vector in the ith superpixel region and $S_i$ denotes the number of pixels in the ith superpixel region; this pooling operation is called superpixel pooling;
in step 3:

the spatial attention mechanism is as follows:

first, a feature map $A \in \mathbb{R}^{C\times H\times W}$ is acquired through the ResNet-101 network; A is then fed into three 1×1 convolutional layers to obtain three new feature maps B, C, D, where $\{B, C, D\} \in \mathbb{R}^{C\times H\times W}$; these are reshaped to $\mathbb{R}^{C\times N}$, where $N = H \times W$; B and D are input to the superpixel pooling layer to obtain $v$ and $\theta$ respectively, where $\{v, \theta\} \in \mathbb{R}^{K\times C}$ and K denotes the number of superpixels on each map; a spatial attention matrix $S \in \mathbb{R}^{N\times K}$ is then computed by applying a normalized softmax layer:

$$s_{ij} = \frac{\exp(v_i \cdot C_j)}{\sum_{i=1}^{K} \exp(v_i \cdot C_j)} \tag{2}$$

where $v_i$ denotes the pooled feature of the ith superpixel from equation (1) and $C_j$ denotes the jth pixel in the feature map; $s_{ij}$ denotes the similarity between the jth pixel and the ith superpixel; equation (2) yields an attention map S of size $\mathbb{R}^{N\times K}$, where N denotes the number of pixels and K the number of superpixels in the feature map, so S denotes the similarity between each pixel and the pooled feature of each superpixel; meanwhile, the feature vectors are weighted according to the computed similarities and added to the corresponding pixel positions, so that each pixel vector obtains, through this weighting, semantic information from the whole space:

$$E_j = \alpha \sum_{i=1}^{K} s_{ij}\,\theta_i + A_j \tag{3}$$

where α is a learnable weight parameter initialized to 0; the final output E aggregates global semantic information;
the channel attention mechanism is as follows:
first, the initial feature map $A \in \mathbb{R}^{C\times H\times W}$ is taken as the input feature of the channel attention module, and $v \in \mathbb{R}^{C\times K}$ is obtained through the superpixel pooling layer; a channel attention map $X \in \mathbb{R}^{C\times C}$ is then computed by matrix multiplication:

$$x_{ij} = \frac{\exp(v_i \cdot v_j)}{\sum_{j=1}^{C} \exp(v_i \cdot v_j)} \tag{4}$$

where $x_{ij}$ denotes the similarity between the ith channel and the jth channel of $v$; the computed inter-channel similarities are used to weight the corresponding channels, which are then accumulated onto each local channel, so that each channel obtains the information of all other channels, giving the final output $D \in \mathbb{R}^{C\times H\times W}$:

$$D_j = \beta \sum_{i=1}^{C} x_{ij} A_i + A_j \tag{5}$$

where β is a learnable weight parameter analogous to α; the spatial attention feature map obtained by equation (3) and the channel attention feature map obtained by equation (5) are fused to finally obtain the semantic segmentation map.
CN202011011881.9A 2020-09-23 2020-09-23 Attention mechanism generation semantic segmentation method based on superpixels Pending CN112241959A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011011881.9A CN112241959A (en) 2020-09-23 2020-09-23 Attention mechanism generation semantic segmentation method based on superpixels

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011011881.9A CN112241959A (en) 2020-09-23 2020-09-23 Attention mechanism generation semantic segmentation method based on superpixels

Publications (1)

Publication Number Publication Date
CN112241959A true CN112241959A (en) 2021-01-19

Family

ID=74171258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011011881.9A Pending CN112241959A (en) 2020-09-23 2020-09-23 Attention mechanism generation semantic segmentation method based on superpixels

Country Status (1)

Country Link
CN (1) CN112241959A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140050391A1 (en) * 2012-08-17 2014-02-20 Nec Laboratories America, Inc. Image segmentation for large-scale fine-grained recognition
CN110414377A (en) * 2019-07-09 2019-11-05 武汉科技大学 A kind of remote sensing images scene classification method based on scale attention network
CN110533045A (en) * 2019-07-31 2019-12-03 中国民航大学 A kind of luggage X-ray contraband image, semantic dividing method of combination attention mechanism
CN111160311A (en) * 2020-01-02 2020-05-15 西北工业大学 Yellow river ice semantic segmentation method based on multi-attention machine system double-flow fusion network
CN111259936A (en) * 2020-01-09 2020-06-09 北京科技大学 Image semantic segmentation method and system based on single pixel annotation
CN111626300A (en) * 2020-05-07 2020-09-04 南京邮电大学 Image semantic segmentation model and modeling method based on context perception

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
J. FU ET AL.: "Dual Attention Network for Scene Segmentation", 《2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
KAI WANG ET AL.: "End-to-end trainable network for superpixel and image segmentation", 《PATTERN RECOGNITION LETTERS》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114119627A (en) * 2021-10-19 2022-03-01 北京科技大学 High-temperature alloy microstructure image segmentation method and device based on deep learning
CN114119627B (en) * 2021-10-19 2022-05-17 北京科技大学 High-temperature alloy microstructure image segmentation method and device based on deep learning
CN116630820A (en) * 2023-05-11 2023-08-22 北京卫星信息工程研究所 Optical remote sensing data on-satellite parallel processing method and device
CN116630820B (en) * 2023-05-11 2024-02-06 北京卫星信息工程研究所 Optical remote sensing data on-satellite parallel processing method and device
CN118053051A (en) * 2024-04-16 2024-05-17 南京信息工程大学 Hyperspectral remote sensing image classification method based on superpixel self-attention mechanism


Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination
RJ01  Rejection of invention patent application after publication (application publication date: 2021-01-19)