CN113298815A - Semi-supervised remote sensing image semantic segmentation method and device and computer equipment - Google Patents
- Publication number
- CN113298815A (application CN202110686544.8A)
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- semantic segmentation
- sensing image
- attention
- semi
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention discloses a semi-supervised remote sensing image semantic segmentation method, a semi-supervised remote sensing image semantic segmentation device and computer equipment. The method comprises the following steps: acquiring an original remote sensing image; scaling the original remote sensing image into 3 scaled images of different sizes; inputting the 3 scaled images into 3 criss-cross attention modules respectively to obtain 3 attention feature maps; fusing the 3 attention feature maps to obtain a multi-scale attention feature map; and inputting the multi-scale attention feature map into a deep semantic segmentation network to obtain a semantic segmentation prediction map. The semi-supervised remote sensing image semantic segmentation model based on multi-scale attention can train the whole model with unlabeled data and make full use of the global context relationship between feature maps, thereby effectively improving the edge segmentation precision between remote sensing image targets and the overall accuracy.
Description
Technical Field
The invention relates to the field of semantic segmentation of remote sensing images, in particular to a semi-supervised remote sensing image semantic segmentation method and device based on multi-scale attention and computer equipment.
Background
In remote sensing research, semantic segmentation, which classifies every pixel of a remote sensing image, has long been an important research direction. Traditional methods for semantic segmentation of remote sensing images often use machine learning algorithms, but their classification accuracy leaves room for improvement. In recent years, with the development of deep learning, Convolutional Neural Networks (CNNs), which have excellent feature extraction capability, have been widely used in many areas of image processing, such as scene classification. Long et al. proposed the Fully Convolutional Network (FCN), which replaces the fully connected layers of a CNN with convolutional layers. Unlike conventional image classification methods, an FCN can segment images of any size. SegNet proposes a deconvolution structure and exploits intermediate-layer features through skip connections. Gangfu et al. propose a multi-scale network structure that replaces traditional convolution with dilated convolution, which enlarges the receptive field without reducing spatial resolution. Atrous spatial pyramid pooling (ASPP) provides several atrous (dilated) convolution branches with different dilation rates to extract multi-scale features, and markedly improves the segmentation accuracy of targets in an image. The Deeplabv3 network has been improved many times and is at present the most successful network model in deep-learning-based semantic segmentation. Its latest version, Deeplabv3+, achieves the highest accuracy on multiple public datasets. Multi-scale fusion can effectively address the segmentation of targets of different sizes: a single neural network model has receptive fields of several sizes to accommodate targets of several sizes.
Because fully convolutional networks outperform traditional machine learning, many scholars apply CNNs to the semantic segmentation of remote sensing images, and deep convolutional networks play an increasingly important role in many areas of remote sensing. Another proposal uses two independent fully convolutional network branches, taking the segmented images and the height information from optical remote sensing as the inputs of the two branches. After a series of convolution operations, the predicted segmentation results of the two branches are fused. These methods achieve good results when labeled data are sufficient.
Semantic segmentation of remote sensing images can be used for geographic detection and plays an important role in obtaining landmark and landform information. In recent years, as remote sensing images have become easier to obtain and their quality has improved, research on them has been increasing. Semantic segmentation of remote sensing images classifies every pixel of the feature map, so every pixel of a labeled image must also be annotated. As the resolution of remote sensing image acquisition improves, semantic annotation becomes more difficult, and target edges are hard to segment accurately. At present, most mainstream research on remote sensing image semantic segmentation is based on deep convolutional neural networks. Li Yu proposes an image semantic segmentation method based on a deep convolutional network fused with a conditional random field: shallow detail information and high-level semantic information are fused in the network model, and the parameter inference of the conditional random field is embedded in the network framework in the form of iterative layers. The resulting model makes comprehensive use of the rich detail and context information of the remote sensing image during the forward and backward propagation of training, and achieves end-to-end remote sensing image semantic segmentation. Another work proposes a method based on connecting encoder and decoder features, improving the DeconvNet network model. During encoding, the model records the positions of the pooling indices and applies them in the pooling process, which effectively preserves spatial structure information; during decoding, the model extracts features effectively by connecting the corresponding encoder and decoder feature layers.
Xiong proposes a remote sensing image semantic segmentation method based on an improved Deeplabv3, which preserves the semantic integrity of the image across resolutions by replacing the single upsampling layer with multi-layer upsampling that uses the residuals obtained in the backbone network. However, existing remote sensing image semantic segmentation methods cannot make good use of unlabeled data, so segmentation quality is poor when labeled data are scarce. How to improve segmentation quality and target extraction when labeled remote sensing data are insufficient remains to be solved. In addition, current semi-supervised segmentation methods segment target edges in remote sensing images inaccurately because they do not attend to long-distance correlation.
Disclosure of Invention
Based on the above, it is necessary to provide a semi-supervised remote sensing image semantic segmentation method, device and computer equipment for solving the above technical problems.
The embodiment of the invention provides a semi-supervised remote sensing image semantic segmentation method, which comprises the following steps:
acquiring an original remote sensing image;
scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; performing fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map;
and inputting the multi-scale attention feature map into a deep semantic segmentation network to obtain a semantic segmentation prediction map.
In one embodiment, the obtaining of the multi-scale attention feature map specifically includes:
inputting the original remote sensing image into a deep convolutional neural network to obtain feature maps $X_1$, $X_2$ and $X_3$ of different sizes;
inputting the feature maps $X_1$, $X_2$ and $X_3$ into the 3 criss-cross attention modules respectively to obtain attention feature maps $C_1$, $C_2$ and $C_3$;
sequentially upsampling and fusing the attention feature maps $C_1$, $C_2$ and $C_3$ to obtain the multi-scale attention feature map.
In one embodiment, the obtaining of the attention feature map specifically includes:
for a feature $M \in \mathbb{R}^{C \times W \times H}$ of the original remote sensing image, using two 1 × 1 convolutional layers to generate two feature maps, named Q and K respectively, with $Q, K \in \mathbb{R}^{C' \times W \times H}$;
sequentially performing the Affinity operation, the SoftMax operation and the Aggregation operation on the feature maps Q and K to obtain an attention feature map $A \in \mathbb{R}^{(H+W-1) \times W \times H}$;
wherein $C'$ is the number of channels of the feature maps, $C$ is the number of channels of the original remote sensing image, and $C'$ is smaller than $C$; H and W are the height and the width of the original remote sensing image, respectively.
In one embodiment, the Affinity operation, SoftMax operation, and Aggregation operation specifically include:
for each position u of the feature map Q, obtaining a vector $Q_u \in \mathbb{R}^{C'}$; meanwhile, obtaining the set $\Omega_u$ by extracting from K the feature vectors in the same row or column as the position u, with:
$d_{i,u} = Q_u \Omega_{i,u}^{\top}$
wherein $\Omega_u \in \mathbb{R}^{(H+W-1) \times C'}$, and $\Omega_{i,u} \in \mathbb{R}^{C'}$ represents the i-th element of $\Omega_u$; $d_{i,u} \in D$ represents the degree of correlation between the features $Q_u$ and $\Omega_{i,u}$, $i = [1, \ldots, |\Omega_u|]$, $D \in \mathbb{R}^{(H+W-1) \times W \times H}$;
applying the SoftMax operation on D along the channel dimension to obtain the attention map A; applying a convolutional layer with 1 × 1 filters on M to generate $V \in \mathbb{R}^{C \times W \times H}$ for feature adaptation; and, at each position u in the spatial dimension of V, obtaining a vector $V_u \in \mathbb{R}^{C}$ and a set $\Phi_u \in \mathbb{R}^{(H+W-1) \times C}$, with:
$M'_u = \sum_{i=1}^{|\Phi_u|} A_{i,u} \Phi_{i,u} + M_u$
wherein the set $\Phi_u$ represents the set of feature vectors in the feature map V that are in the same column or the same row as the position u; $A_{i,u}$ is the scalar value at channel i and position u in A;
acquiring non-local information of the image through the Aggregation operation; wherein $M'_u$ represents the feature vector at position u in the output feature map $M' \in \mathbb{R}^{C \times W \times H}$.
In one embodiment, a semi-supervised remote sensing image semantic segmentation method further includes:
inputting the semantic segmentation prediction map and the one-hot encoded vector of the annotated image into a discriminator network to obtain a semantic segmentation confidence map; wherein the original remote sensing images include annotated images.
In one embodiment, the discriminator network comprises:
5 convolution layers, the size of the convolution kernel is 4 multiplied by 4, the number of channels is [64, 128, 256, 512, 1] respectively, and the step length is 2; replacing the ReLU after the convolution layer with Leaky-ReLU; an upsampling layer is added to the last layer.
In one embodiment, a semi-supervised remote sensing image semantic segmentation method further includes:
training the deep semantic segmentation network and the discriminator network based on the spatial multi-class cross entropy $L_{ce}$, the adversarial loss function $L_A$ and the semi-supervised loss function $L_S$.
In one embodiment, the training deep semantic segmentation network and the discriminator network specifically include:
when labeled data are used, the multi-class loss function $L_{ce}$ is obtained as follows:
$L_{ce} = -\sum_{h,w} \sum_{c \in C} Y_n^{(h,w,c)} \log G(X_n)^{(h,w,c)}$
the discriminator network is trained through $L_D$:
$L_D = -\sum_{h,w} \left[ x_n \log\left(1 - D(G(X_n))^{(h,w)}\right) + y_n \log D(Y_n)^{(h,w)} \right]$
when $x_n = 1$, the sample is generated by the generator; when $y_n = 1$, the sample comes from the annotated image; $D(G(X_n))^{(h,w)}$ is the value of the confidence map of $X_n$ at position (h, w); $D(Y_n)^{(h,w)}$ is the value of the confidence map of $Y_n$ at position (h, w); if the pixel at position (h, w) belongs to class c, then $Y_n^{(h,w,c)}$ is 1, otherwise it is 0;
the adversarial learning process is carried out through the loss $L_A$, given the trained discriminator:
$L_A = -\sum_{h,w} \log D(G(X_n))^{(h,w)}$
when training with unlabeled data, only $L_A$ is applied, and for unlabeled data the trained discriminator network generates the confidence map $D(G(X_n))^{(h,w)}$;
analogously to the one-hot encoding $Y_n$ of an annotated image, $\hat{Y}_n$ is set element by element: if $c^* = \arg\max_c G(X_n)^{(h,w,c)}$, then $\hat{Y}_n^{(h,w,c^*)} = 1$; a threshold $U_S$ is set to highlight the confident regions in the confidence map; $L_S$ is defined as follows:
$L_S = -\sum_{h,w} \sum_{c \in C} I\left(D(G(X_n))^{(h,w)} > U_S\right) \hat{Y}_n^{(h,w,c)} \log G(X_n)^{(h,w,c)}$
$I(\cdot)$ is the indicator function, whose sensitivity is controlled by setting the value of $U_S$, thereby adjusting the training process of the network.
A semi-supervised remote sensing image semantic segmentation device comprises:
the image acquisition module is used for acquiring an original remote sensing image;
the multi-scale attention feature map determining module is used for scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; performing fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map;
and the semantic segmentation prediction map determining module is used for inputting the multi-scale attention feature map into the deep semantic segmentation network to obtain a semantic segmentation prediction map.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring an original remote sensing image;
scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; performing fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map;
and inputting the multi-scale attention feature map into a deep semantic segmentation network to obtain a semantic segmentation prediction map.
Compared with the prior art, the semi-supervised remote sensing image semantic segmentation method, the semi-supervised remote sensing image semantic segmentation device and the computer equipment provided by the embodiment of the invention have the following beneficial effects:
the invention provides a semi-supervised remote sensing image semantic segmentation model based on multi-scale attention, which can train the whole model with unlabeled data and make full use of the global context relationship between feature maps, thereby effectively improving the edge segmentation precision between remote sensing image targets and the overall accuracy. Specifically, a criss-cross attention network is introduced to make full use of the global context: the long-distance correlation between pixels improves the segmentation precision at target edges. Through multi-path input, image features of different sizes are extracted and receptive fields of different sizes are obtained, so that features of the training data at different scales are fully utilized. Meanwhile, to address the difficulty of semantic annotation, a semi-supervised semantic segmentation method is applied to remote sensing images: the segmentation network acts as the generator, and with the auxiliary training of a discriminator its output is brought as close as possible to the annotated image. Because the FCN has been very successful in the semantic segmentation of natural-scene images, and many scholars apply it to remote sensing images, a fully convolutional discriminator is used to distinguish annotated images from predicted images. The semi-supervised framework can use both labeled and unlabeled data, so the data are fully utilized and the segmentation quality is improved even when the amount of annotated data is small.
Drawings
FIG. 1 is a schematic illustration of an attention mechanism provided in one embodiment;
FIG. 2 is a multi-scale attention diagram provided in one embodiment;
FIG. 3 is a schematic diagram of a network of discriminators provided in one embodiment;
FIG. 4 is a diagram of a semi-supervised semantic segmentation based on multi-scale attention as provided in one embodiment;
FIG. 5 is a segmentation result visualization of a CCF2015 data set based on multi-scale attention provided in an embodiment;
FIG. 6 is a visualization of the segmentation results on the US2D dataset based on multi-scale attention, provided in an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, the provided semi-supervised remote sensing image semantic segmentation method specifically comprises the following steps:
Attention mechanism
To give a neural network the ability to focus on the features that matter, an attention mechanism needs to be introduced; it is also very important for studying the context between feature maps in semantic segmentation. Attention places no requirement on the shape of the input data, so it can be applied widely. When computing resources are limited, the attention mechanism is an effective means of dealing with information overload: it removes redundant information, ignores most unimportant information and allocates computing resources where they are most needed. The attention mechanism is shown in FIG. 1.
The attention mechanism computes the degree of correlation between features. It maps a Query and a set of Key-Value pairs $\{K_i, V_i \mid i = 1, 2, \ldots, m\}$ to an output, where the query, keys and values are all vectors; the output is a weighted sum of all values in V, with the weights computed from the query and the keys. The attention mechanism is calculated as follows:
First, the similarity between Q and each key $K_i$ is calculated, denoted by f:
$f(Q, K_i), \quad i = 1, 2, \ldots, m$
The similarity between Q and K can be calculated in four ways: dot product, weighted product, concatenation with weights, and perceptron. They are given by the following formulas, respectively:
$f(Q, K_i) = Q^{\top} K_i$
$f(Q, K_i) = Q^{\top} W K_i$
$f(Q, K_i) = W [Q; K_i]$
$f(Q, K_i) = V^{\top} \tanh(W Q + U K_i)$
Then the scores are normalized with SoftMax, which converts them into a probability distribution whose weights sum to one:
$\alpha_i = \frac{\exp(f(Q, K_i))}{\sum_{j=1}^{m} \exp(f(Q, K_j))}$
Finally, all values in V are weighted by $\alpha_i$ and summed to obtain the attention vector H:
$H = \sum_{i=1}^{m} \alpha_i V_i$
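The three steps above (similarity, SoftMax normalization, weighted sum) can be sketched in a few lines of Python. This is only an illustrative sketch using the dot-product similarity; the function names and the toy vectors are assumptions made for the example and do not come from the patent.

```python
import math

def dot_score(q, k):
    """Dot-product similarity f(Q, K_i) = Q^T K_i."""
    return sum(a * b for a, b in zip(q, k))

def softmax(scores):
    """Normalize scores into weights that sum to one."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, keys, values):
    """Return (alphas, H), where H is the alpha-weighted sum of the values."""
    alphas = softmax([dot_score(q, k) for k in keys])
    dim = len(values[0])
    h = [sum(a * v[d] for a, v in zip(alphas, values)) for d in range(dim)]
    return alphas, h

# Toy example (hypothetical data): one query, two key-value pairs.
alphas, h = attention([1.0, 0.0],
                      [[1.0, 0.0], [0.0, 1.0]],
                      [[2.0, 0.0], [0.0, 4.0]])
```

The query matches the first key more closely, so the first value receives the larger weight in H.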
Multi-scale attention semi-supervised semantic segmentation model
To address the generator's neglect of long-distance correlation, a criss-cross attention module is introduced. It captures useful context information through long-distance dependencies, which benefits visual understanding problems. In the present invention, the criss-cross attention module collects long-distance context information in both the horizontal and the vertical direction to enhance the per-pixel representation. Through multi-scale fusion, each pixel combines context information at three scales, which makes its classification more accurate.
Generator
As shown in FIG. 2, the input image is scaled to 3 different sizes, each scaled image is input to an attention module, and the resulting feature maps are fused to obtain a multi-scale attention feature map. Specifically, the input images are passed through a deep convolutional neural network based on ResNet101, which generates the feature maps $X_1$, $X_2$, $X_3$. The spatial size of a feature map X is H × W. The feature maps $X_1$, $X_2$, $X_3$ pass through the attention modules to give $C_1$, $C_2$, $C_3$, in which each pixel is contextually related to all pixels in its column and its row. The attention feature maps are then upsampled to the original size and fused to obtain the final multi-scale attention feature map.
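The upsample-then-fuse step can be illustrated with a small sketch. The text does not state the interpolation method or the fusion operator, so nearest-neighbour upsampling and element-wise summation are assumptions made here purely for illustration; the single-channel toy maps `c1`, `c2`, `c3` are hypothetical.

```python
def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling of a 2-D map (assumed interpolation)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def fuse(maps, out_h, out_w):
    """Resize every map to (out_h, out_w) and sum them element-wise
    (summation is an assumed fusion operator)."""
    resized = [upsample_nearest(m, out_h, out_w) for m in maps]
    return [[sum(m[r][c] for m in resized) for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical single-channel attention maps at three scales.
c1 = [[1.0] * 4 for _ in range(4)]   # 4 x 4
c2 = [[1.0, 2.0], [3.0, 4.0]]        # 2 x 2
c3 = [[10.0]]                        # 1 x 1
fused = fuse([c1, c2, c3], 4, 4)
```

Each output pixel of `fused` combines one value from every scale, which mirrors how the multi-scale feature map gives each pixel context at three resolutions.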
Discriminator
The discriminator network has 5 convolutional layers with 4 × 4 kernels, [64, 128, 256, 512, 1] channels and a stride of 2; the ReLU after each convolutional layer is replaced with Leaky-ReLU. To restore the output to the size of the input image, an upsampling layer is added after the last layer. Since training a generative adversarial network requires a large amount of memory, a more complex discriminator structure is not employed.
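To see why the final upsampling layer is needed, the spatial size after each stride-2 convolution can be traced with the usual convolution output-size formula. The padding of 1 and the 256 × 256 input size are assumptions for this sketch; the text specifies only the kernel size, the stride and the channel counts.

```python
CHANNELS = [64, 128, 256, 512, 1]  # channel counts given in the text

def conv_out_size(n, kernel=4, stride=2, padding=1):
    """Output length of a convolution along one axis (padding is assumed)."""
    return (n + 2 * padding - kernel) // stride + 1

def discriminator_shapes(height, width, padding=1):
    """Return the (channels, height, width) shape after each conv layer."""
    shapes = []
    h, w = height, width
    for ch in CHANNELS:
        h = conv_out_size(h, padding=padding)
        w = conv_out_size(w, padding=padding)
        shapes.append((ch, h, w))
    return shapes

shapes = discriminator_shapes(256, 256)  # hypothetical input size
```

Under these assumptions each layer halves the resolution, so the single-channel output is 32 times smaller than the input along each axis and must be upsampled by that factor to give a per-pixel confidence map.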
Semi-supervised remote sensing image semantic segmentation algorithm based on multi-scale attention
As shown in FIG. 2, for a given feature $M \in \mathbb{R}^{C \times W \times H}$, two 1 × 1 convolutional layers are first applied to M to generate two feature maps, named Q and K respectively, with $Q, K \in \mathbb{R}^{C' \times W \times H}$. $C'$ is the number of channels of the feature maps; it is smaller than the number of channels C of the image, which reduces the dimensionality of the feature maps. After obtaining the feature maps Q and K, the Affinity operation computes $d_{i,u}$, from which the attention feature map $A \in \mathbb{R}^{(H+W-1) \times W \times H}$ is obtained.
The Affinity operation is as follows: for each position u of the feature map Q, a vector $Q_u \in \mathbb{R}^{C'}$ is obtained. Meanwhile, the set $\Omega_u$ is obtained by extracting from K the feature vectors in the same row or column as the position u:
$d_{i,u} = Q_u \Omega_{i,u}^{\top}$
Here $\Omega_u \in \mathbb{R}^{(H+W-1) \times C'}$, and $\Omega_{i,u} \in \mathbb{R}^{C'}$ represents the i-th element of $\Omega_u$. $d_{i,u} \in D$ represents the degree of correlation between the features $Q_u$ and $\Omega_{i,u}$, $i = [1, \ldots, |\Omega_u|]$, $D \in \mathbb{R}^{(H+W-1) \times W \times H}$. Then the SoftMax operation is applied on D along the channel dimension to compute the attention map A. Finally, a convolutional layer with 1 × 1 filters is applied to M to generate $V \in \mathbb{R}^{C \times W \times H}$ for feature adaptation. At each position u in the spatial dimension of V, a vector $V_u \in \mathbb{R}^{C}$ and a set $\Phi_u \in \mathbb{R}^{(H+W-1) \times C}$ are obtained:
$M'_u = \sum_{i=1}^{|\Phi_u|} A_{i,u} \Phi_{i,u} + M_u$
The set $\Phi_u$ is the set of feature vectors in the feature map V that are in the same column or the same row as the position u. The Aggregation operation acquires non-local information of the image, where $M'_u$ represents the feature vector at position u in the output feature map $M' \in \mathbb{R}^{C \times W \times H}$, and $A_{i,u}$ is the scalar value at channel i and position u in A. Since context information is added to the local feature M, the local features are enhanced and the pixel-level representation is improved. Since the attention map captures long-distance correlations, the feature map has a relatively broad contextual view, and context information can be selectively aggregated based on the attention map. As shown in FIG. 4, the attention feature maps obtained by passing input images of different sizes through the criss-cross attention module are upsampled to the size of the original input image and then fused to obtain the final multi-scale attention feature map.
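The Affinity, SoftMax and Aggregation operations can be sketched in plain Python for a tiny feature map. This is a minimal illustration under stated assumptions, not the patented implementation: features are indexed as [channel][row][column], the 1 × 1 convolutions are written as per-position matrix products without bias, the affinity is taken as a dot product, and the weight matrices `wq`, `wk`, `wv` are hypothetical.

```python
import math

def conv1x1(feat, weight):
    """1x1 convolution without bias: weight is [out_c][in_c], feat is [in_c][H][W]."""
    in_c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[[sum(weight[o][c] * feat[c][y][x] for c in range(in_c))
              for x in range(w)] for y in range(h)] for o in range(len(weight))]

def criss_cross_positions(y, x, h, w):
    """The H + W - 1 positions that share a row or a column with (y, x)."""
    row = [(y, j) for j in range(w)]
    col = [(i, x) for i in range(h) if i != y]
    return row + col

def criss_cross_attention(m, wq, wk, wv):
    q, k, v = conv1x1(m, wq), conv1x1(m, wk), conv1x1(m, wv)
    c, h, w = len(m), len(m[0]), len(m[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for y in range(h):
        for x in range(w):
            pos = criss_cross_positions(y, x, h, w)
            # Affinity: correlation between Q_u and every element of Omega_u.
            d = [sum(q[ch][y][x] * k[ch][py][px] for ch in range(len(q)))
                 for py, px in pos]
            # SoftMax over the H + W - 1 criss-cross positions.
            mx = max(d)
            e = [math.exp(s - mx) for s in d]
            total = sum(e)
            a = [s / total for s in e]
            # Aggregation: attention-weighted sum of Phi_u plus the input feature.
            for ch in range(c):
                ctx = sum(ai * v[ch][py][px] for ai, (py, px) in zip(a, pos))
                out[ch][y][x] = ctx + m[ch][y][x]
    return out

# Toy input (hypothetical): C = 2 channels, H = 2, W = 3.
m = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
     [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]]
wq = [[0.0, 0.0]]               # C' = 1; zero weights give uniform attention
wk = [[1.0, 0.0]]
wv = [[1.0, 0.0], [0.0, 1.0]]   # identity value projection
out = criss_cross_attention(m, wq, wk, wv)
```

With the zero query weights chosen here, every position attends uniformly to its row and column, so each output is the criss-cross mean of V plus the input feature, which makes the result easy to check by hand.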
Prediction graph G (X) of generatorn)(h,w)And a vector Y obtained by the one-hot coding of the annotation imagenAfter being input into the discriminator, the confidence coefficient map with the size of H multiplied by W multiplied by 1 is output after training. Three losses in LLoss function:
L=Lce+λALA+λSLS
Lce,LAand LSRespectively, a multi-class cross entropy loss function, a countermeasure loss function and a semi-supervised loss function. Lambda [ alpha ]A,λSAre two weights for minimizing the loss function L. When using the label data, the multi-class penalty function LceCan be obtained by the following method:
through LDTraining the discriminator network:
when x isnWhen the value is 1, the representative pixel point is generated by the generator. If y isnIf 1, then the sample is from the label image. D (G (X)n))(h,w)) Is a pixel XnA feature at the position of (h, w). D (Y)n)(h,w)Are defined similarly. In order to convert the discrete label map into the C-channel probability map, the labeling image is transformed by the one-hot coding. If it is notA classification rule Y of a certain categoryn (h,w)Is 1, otherwise is 0. The antagonistic learning process is through loss of LATo the trained discriminator:
the generator network is trained to fool the discriminator by increasing the probability of generating a prediction from the true distribution of the annotated images. When training with unlabeled data, only LASince it only requires a discriminatorA network. At this time, since no labeled image exists, the multi-class cross entropy loss function cannot be utilized. In addition, for unlabeled data, confidence maps D (G (X) can be generated by training the discriminatorsn))(h,w)) It can be used to infer which regions are sufficiently close to the true distribution of the annotated image. Can be used to predict the distribution of regions similar to actual values and used as a marker picture. The marked image is subjected to one-hot coding to obtain YnIs arranged element by element to obtainIf c is*=argmaxcG(Xn)(h,w,c)ThenLsAnd LceSimilarly. Then setting a threshold valueTo highlight areas of confidence in the confidence map. L isSThe definition is as follows:
I(·) is an indicator function whose sensitivity is controlled by the value of U_S, thereby adjusting the training process of the network.
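The way the loss terms interact can be illustrated with a minimal pure-Python sketch. It assumes that predictions are stored as nested lists `probs[h][w][c]` and confidence maps as `conf[h][w]`; all function names are hypothetical, and the masked form of L_S is an assumption based on the description above, not the patent's own implementation.

```python
import math

def one_hot(label_map, num_classes):
    """Turn an H x W map of class indices into an H x W x C one-hot map."""
    return [[[1.0 if c == cls else 0.0 for c in range(num_classes)]
             for cls in row] for row in label_map]

def cross_entropy(probs, target):
    """L_ce: minus the sum over pixels and classes of target * log(prob)."""
    return -sum(t * math.log(p)
                for prow, trow in zip(probs, target)
                for pvec, tvec in zip(prow, trow)
                for p, t in zip(pvec, tvec) if t > 0)

def semi_supervised_loss(probs, conf, threshold):
    """L_S (assumed form): cross entropy against argmax pseudo-labels,
    counted only where the discriminator confidence exceeds the threshold."""
    loss = 0.0
    for prow, crow in zip(probs, conf):
        for pvec, d in zip(prow, crow):
            if d > threshold:                # indicator I(D(G(X)) > U_S)
                loss -= math.log(max(pvec))  # pseudo-label is the argmax class
    return loss

def total_loss(l_ce, l_a, l_s, lambda_a, lambda_s):
    """L = L_ce + lambda_A * L_A + lambda_S * L_S."""
    return l_ce + lambda_a * l_a + lambda_s * l_s

# Toy 1 x 2 image with two classes.
probs = [[[0.9, 0.1], [0.4, 0.6]]]
target = one_hot([[0, 1]], num_classes=2)
conf = [[0.8, 0.1]]
l_ce = cross_entropy(probs, target)
l_s = semi_supervised_loss(probs, conf, threshold=0.2)
```

With a threshold of 0.2, only the first pixel of the toy example (confidence 0.8) contributes to L_S; raising the threshold keeps fewer pixels, lowering it keeps more.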
Example analysis
All experiments are carried out on the Ubuntu 18.04 operating system, and the multi-scale attention-based semi-supervised remote sensing image semantic segmentation network is trained on an RTX 2080 Ti. The code for all experiments is based on PyTorch 0.4.0 and CUDA 9. Image features are extracted with a ResNet101 network model pre-trained on the PASCAL VOC 2012 dataset, and a generative adversarial network is used to assist training. For the generator, i.e. the segmentation network, the optimizer used during training is Adam, and the initial learning rate and the coefficient of the L2 regularization term are both set to 0.0001. For the discriminator network, the Adam optimizer is used with an initial learning rate of 10^-4 and an L2 regularization coefficient of 0.9. λ_A is set to 0.01 when annotation data are used and to 0.001 when they are not. λ_S is set to 0.1, U_S to 0.2 and I to 0.1. The number of training epochs is set to 20000, and because the generative adversarial network requires a large amount of memory, the batch size is set to 2. In the experiments, the Mean Intersection Over Union (MIOU) is used as the evaluation criterion for the quality of the generated segmentation images. The higher the MIOU, the closer the generated segmentation is to the ground-truth annotation, i.e. the higher the segmentation quality. The intersection over union is the ratio of the intersection to the union of the set of true values and the set of predicted values. It can be expressed as TP (the intersection) divided by the sum of TP, FP and FN (the union), that is, IoU = TP / (TP + FP + FN), and MIOU is the mean of this ratio over the classes.
p_ij denotes the number of pixels whose true class is i and whose predicted class is j, and k + 1 is the number of classes (including the empty class). p_ii is the number of true positives. p_ij and p_ji denote false positives and false negatives, respectively.
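With these counts, the standard definition is MIOU = 1/(k+1) · Σ_i p_ii / (Σ_j p_ij + Σ_j p_ji − p_ii). The following sketch computes it from a confusion matrix. The function name is hypothetical, and skipping classes that appear in neither the truth nor the prediction is a design choice made here, not something the text specifies.

```python
def mean_iou(confusion):
    """confusion[i][j] = number of pixels with true class i predicted as class j."""
    ious = []
    for i in range(len(confusion)):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                 # true class i, predicted otherwise
        fp = sum(row[i] for row in confusion) - tp  # predicted i, true class differs
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Two-class example: per-class IoU is 3/5 and 5/7.
score = mean_iou([[3, 1],
                  [1, 5]])
```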
Example analysis on CCF2015 dataset
The CCF2015 dataset contains five classes. It comprises five images in which four kinds of objects are annotated: vegetation, buildings, water, and roads. The original images range in size from 3000 × 3000 to 6000 × 6000. Because this resolution is too large, the original images need to be processed, and 13000 images of size 256 × 256 are obtained from them. The specific processing is as follows: both the original image and the label image are rotated by 90, 180 and 270 degrees, and patches of 256 × 256 are then cropped at randomly generated x and y coordinates. A standard validation set of 1000 images is used to evaluate the model.
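The rotate-then-crop preparation can be sketched as follows, using plain nested lists in place of image arrays. The function names and the seed handling are hypothetical, and including the unrotated image alongside the three rotations is an assumption.

```python
import random

def rotate90(grid):
    """Rotate a 2-D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def random_crops(image, label, size, count, seed=0):
    """Cut `count` aligned size x size patches at random (x, y) offsets."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    crops = []
    for _ in range(count):
        y = rng.randint(0, height - size)
        x = rng.randint(0, width - size)
        img_patch = [row[x:x + size] for row in image[y:y + size]]
        lab_patch = [row[x:x + size] for row in label[y:y + size]]
        crops.append((img_patch, lab_patch))
    return crops

def augment(image, label, size, crops_per_rotation, seed=0):
    """Crop patches from the image at 0, 90, 180 and 270 degrees."""
    samples = []
    for k in range(4):
        samples += random_crops(image, label, size, crops_per_rotation, seed + k)
        image, label = rotate90(image), rotate90(label)
    return samples

# Toy 6 x 8 "image" whose label map is identical to the image.
image = [[r * 8 + c for c in range(8)] for r in range(6)]
samples = augment(image, image, size=4, crops_per_rotation=2)
```

Because the image and its label are always rotated and cropped with the same parameters, each patch stays pixel-aligned with its annotation.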
The results of the attention-based multi-scale semi-supervised experiments on CCF2015 are shown in tables 1 and 2. To demonstrate that the multi-scale generative adversarial network method for remote sensing image semantic segmentation performs better than existing methods, the multi-scale attention-based semi-supervised method is further compared with the fully supervised DeepLab method and the semi-supervised method of Hung. As shown in tables 1 to 5, the multi-scale attention-based method achieves a clearly higher MIOU than the earlier networks that do not model long-distance correlation. This indicates that introducing multi-scale attention to strengthen the contextual correlation between pixels is important, which is supported by the experiments on the CCF2015 and US2D datasets. On CCF2015 the method was tested with 1/8 and 1/2 of the data labeled. To further verify feasibility, fig. 5 shows that the multi-scale attention method improves the segmentation of roads (blue) and vegetation (red) as well as the accuracy of building (green) segmentation, and that the edges of the targets are closer to the original labels. This demonstrates the effectiveness of combining attention: by attending to the contextual relations between each pixel and the other pixels in its row and column, the method captures long-distance correlation better, which improves the feature extraction capability of the generator and hence the performance of the whole network.
Table 1. Experimental results of multi-scale attention on the CCF2015 dataset with 1/8 labeled data
Table 3. Experimental results of multi-scale attention on the CCF2015 dataset with 1/2 labeled data
Example analysis on US2D data set
The Urban Semantic 2D (US2D) dataset of the IGARSS 2019 contest is a large public dataset containing RGB images and semantic labels. It covers Jacksonville, Florida and Omaha, Nebraska. The IEEE International Geoscience and Remote Sensing Symposium (IGARSS) is an influential conference in the field of remote sensing. For the US2D dataset, the remote sensing images are cropped to a resolution of 512 × 512; fig. 5 shows the cropped images together with the semantic labels of the corresponding original images. The Ground Sampling Distance (GSD) is about 30 cm. For the experiments here, cropping yielded 13732 training images and 1720 test images.
The results on the US2D dataset are shown in tables 3 and 4. Experiments were performed on the US2D dataset with 1/8 and 1/2 of the data labeled, respectively, and the remainder used as unlabeled data. The method improves greatly on the existing fully supervised method and also improves to some extent on the existing strong semi-supervised method. The visualization results on US2D are shown in fig. 6: by combining the attention mechanism with the semi-supervised method, the resulting semantic segmentation images reflect the semantic features well.
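The 1/8 and 1/2 splits can be sketched as below. The function name, the shuffling and the fixed seed are assumptions; the text does not say how the labeled subset is chosen.

```python
import random

def split_labeled_unlabeled(sample_ids, labeled_fraction, seed=0):
    """Shuffle the sample ids and keep the given fraction as labeled data."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_labeled = int(len(ids) * labeled_fraction)
    return ids[:n_labeled], ids[n_labeled:]

# 13732 training images, as reported for the cropped US2D dataset.
labeled, unlabeled = split_labeled_unlabeled(range(13732), 1 / 8)
```

The annotations of the images in `unlabeled` are simply not used during training, so those images only contribute through the adversarial and semi-supervised terms.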
Table 4. Experimental results of multi-scale attention on the US2D dataset with 1/8 labeled data
Table 5. Experimental results of multi-scale attention on the US2D dataset with 1/2 labeled data
In summary, deep convolutional networks have been applied ever more widely to remote sensing images in recent years. Existing semi-supervised remote sensing image semantic segmentation methods do not attend to the long-distance correlation between pixels and therefore cannot use the global context effectively. To address this problem, a criss-cross attention mechanism is introduced, a multi-scale attention module is designed and combined with a generative adversarial network, and the whole network is trained under a semi-supervised framework, so that the unlabeled data in the dataset improve the training and raise the semantic segmentation accuracy for remote sensing images. The effectiveness of the proposed method is verified by experiments on two public remote sensing datasets.
In addition, semantic segmentation is an important research area of computer vision. It differs from image classification: classification assigns one category to each picture, whereas semantic segmentation determines the category of every pixel in the image so as to segment it accurately. Semantic segmentation therefore requires far more annotation than image classification, which makes it worthwhile to study semi-supervised semantic segmentation with a small amount of labeled data. Because semantic segmentation works at the pixel level, it represents the specific contour of an object well and identifies the target to which each pixel belongs, achieving accurate segmentation; this is also very helpful for research on remote sensing images. Remote sensing images currently have great research value in many fields, and their semantic segmentation is important for making better use of them to obtain the spatial geographic information that people need. Training a model on remote sensing images requires a large amount of labeled data, deep convolutional models do not attend to long-distance correlation, and remote sensing images are difficult to annotate, so labeled data are scarce. To address these problems, the invention provides a semi-supervised remote sensing image semantic segmentation method based on a deep convolutional network and adversarial learning, and further addresses the fact that the semi-supervised multi-scale generative adversarial network segmentation method does not attend to long-distance correlation.
By using input pictures at different scales, multi-view features of the images are gathered, further improving the semantic segmentation of remote sensing images. For the remote sensing image semantic segmentation task, remote sensing data are difficult to acquire under certain conditions, the amount of labeled data is small, and the manpower and material resources spent on annotation should be reduced as far as possible; the invention therefore provides a semi-supervised remote sensing image semantic segmentation method combining a deep convolutional neural network with a generative adversarial network. Because the features extracted by that method do not capture contextual correlation, the invention further provides semi-supervised remote sensing image semantic segmentation based on multi-scale attention, which achieves the current best performance.
In one embodiment, a semi-supervised remote sensing image semantic segmentation device is provided, which comprises:
An image acquisition module, configured to acquire an original remote sensing image.
A multi-scale attention feature map determining module, configured to scale the original remote sensing image into 3 scaled images of different sizes, input the 3 scaled images into 3 criss-cross attention modules respectively to obtain 3 attention feature maps, and fuse the 3 attention feature maps to obtain a multi-scale attention feature map.
A semantic segmentation prediction map determining module, configured to input the multi-scale attention feature map into the deep semantic segmentation network to obtain a semantic segmentation prediction map.
A semantic segmentation confidence image determining module, configured to input the semantic segmentation prediction map and the one-hot encoded vector of the annotation image into a discriminator network to obtain a semantic segmentation confidence image, wherein the original remote sensing images include annotated images.
A network training module, configured to train the deep semantic segmentation network and the discriminator network based on the spatial multi-class cross entropy L_ce, the adversarial loss function L_A and the semi-supervised loss function L_S.
For the specific definition of the semi-supervised remote sensing image semantic segmentation device, refer to the definition of the semi-supervised remote sensing image semantic segmentation method above, which is not repeated here. Each module in the device can be implemented wholly or partly in software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, or be stored in a memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.
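The data flow of the multi-scale attention feature map determining module (scale, attend, upsample, fuse) can be sketched on a single-channel grid. The scale factors, the nearest-neighbour resizing, the averaging fusion and all names below are assumptions for illustration; the attention step is passed in as a placeholder function.

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grid to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    return [[grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def multi_scale_features(image, attention, scales=(1.0, 0.75, 0.5)):
    """Run `attention` on three rescaled copies and fuse the results."""
    h, w = len(image), len(image[0])
    upsampled = []
    for s in scales:
        sh, sw = max(1, int(h * s)), max(1, int(w * s))
        attended = attention(resize_nearest(image, sh, sw))  # one module per scale
        upsampled.append(resize_nearest(attended, h, w))     # back to input size
    # Fuse by element-wise averaging (assumed fusion rule).
    return [[sum(m[r][c] for m in upsampled) / len(upsampled) for c in range(w)]
            for r in range(h)]

# Identity "attention" on a constant 4 x 4 image leaves the values unchanged.
image = [[2.0] * 4 for _ in range(4)]
fused = multi_scale_features(image, attention=lambda grid: grid)
```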
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
Acquiring an original remote sensing image.
Scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; and carrying out fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map.
Inputting the multi-scale attention feature map into a deep semantic segmentation network to obtain a semantic segmentation prediction map.
Inputting the semantic segmentation prediction map and the one-hot encoded vector of the annotation image into a discriminator network to obtain a semantic segmentation confidence image, wherein the original remote sensing images include annotated images.
Training the deep semantic segmentation network and the discriminator network based on the spatial multi-class cross entropy L_ce, the adversarial loss function L_A and the semi-supervised loss function L_S.
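For the discriminator step, the claims specify five 4 × 4 convolutions with stride 2 and channel counts [64, 128, 256, 512, 1], followed by an upsampling layer. The sketch below traces the feature-map shapes through such a stack. The padding of 1 and the function names are assumptions; the text does not state the padding.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output length of one convolution along a single spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def discriminator_shapes(height, width, num_classes,
                         channels=(64, 128, 256, 512, 1)):
    """List (channels, height, width) after each layer of the discriminator."""
    shapes = [(num_classes, height, width)]  # C-channel probability map input
    h, w = height, width
    for ch in channels:
        h, w = conv_out(h), conv_out(w)
        shapes.append((ch, h, w))
    shapes.append((channels[-1], height, width))  # upsampled back to H x W x 1
    return shapes

# A 256 x 256 input with five classes.
shapes = discriminator_shapes(256, 256, num_classes=5)
```

Under these assumptions each convolution halves the spatial size, so the final upsampling layer is what restores the H × W × 1 confidence map.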
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features. Furthermore, the above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A semi-supervised remote sensing image semantic segmentation method is characterized by comprising the following steps:
acquiring an original remote sensing image;
scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; performing fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map;
and inputting the multi-scale attention feature map into a deep semantic segmentation network to obtain a semantic segmentation prediction map.
2. The semi-supervised remote sensing image semantic segmentation method according to claim 1, wherein the obtaining of the multi-scale attention feature map specifically comprises:
inputting the original remote sensing image into a deep convolutional neural network to obtain feature maps X_1, X_2 and X_3 of different sizes;
inputting the feature maps X_1, X_2 and X_3 into the 3 criss-cross attention modules respectively to obtain attention feature maps C_1, C_2 and C_3;
sequentially performing up-sampling and fusion processing on the attention feature maps C_1, C_2 and C_3 to obtain the multi-scale attention feature map.
3. The semi-supervised remote sensing image semantic segmentation method according to claim 1, wherein the obtaining of the attention feature map specifically comprises:
for the feature M ∈ R^(C×W×H) of the original remote sensing image, using two 1 × 1 convolutional layers to generate two feature maps, named Q and K respectively, with {Q, K} ∈ R^(C′×W×H);
for the feature maps Q and K, sequentially performing the Affinity operation, the SoftMax operation and the Aggregation operation to obtain the attention feature map A ∈ R^((H+W-1)×W×H);
wherein C′ is the number of channels of the feature maps Q and K, C is the number of channels of the original remote sensing image, and C′ is smaller than C; H and W are the height and the width of the original remote sensing image, respectively.
4. The semi-supervised remote sensing image semantic segmentation method according to claim 3, wherein the Affinity operation, SoftMax operation and Aggregation operation specifically comprise:
for each position u of the feature map Q, obtaining a vector Q_u ∈ R^C′; at the same time, extracting from K the feature vectors located in the same row or column as u to obtain the set Ω_u, and having:
wherein Ω_u ∈ R^((H+W-1)×C′), and Ω_{i,u} ∈ R^C′ denotes the i-th element of Ω_u; d_{i,u} ∈ D denotes the degree of correlation between the feature Q_u and Ω_{i,u}, i = [1, ..., |Ω_u|], D ∈ R^((H+W-1)×W×H);
applying the SoftMax operation to D along the channel dimension, and applying a convolutional layer with a 1 × 1 filter to M to generate V ∈ R^(C×W×H) for feature adaptation; for each position u in the spatial dimension of V, obtaining a vector V_u ∈ R^C and a set Φ_u ∈ R^((H+W-1)×C), and having:
the set Φ_u denotes the set of feature vectors in the feature map V located in the same column or the same row as position u; A_{i,u} is the scalar value at channel i and position u in A;
acquiring the non-local information of the image through the Aggregation operation; wherein M′_u denotes the feature vector at position u in the output feature map M′ ∈ R^(C×W×H).
5. The semi-supervised remote sensing image semantic segmentation method of claim 1, further comprising:
inputting the semantic segmentation prediction map and the one-hot encoded vector of the annotation image into a discriminator network to obtain a semantic segmentation confidence image; wherein the original remote sensing images include annotated images.
6. The semi-supervised remote sensing image semantic segmentation method of claim 5, wherein the discriminator network comprises:
5 convolutional layers with 4 × 4 convolution kernels, channel numbers of [64, 128, 256, 512, 1] respectively, and a stride of 2; the ReLU after each convolutional layer is replaced with Leaky-ReLU; an upsampling layer is added after the last layer.
7. The semi-supervised remote sensing image semantic segmentation method of claim 1, further comprising:
training the deep semantic segmentation network and the discriminator network based on the spatial multi-class cross entropy L_ce, the adversarial loss function L_A and the semi-supervised loss function L_S.
8. The semi-supervised remote sensing image semantic segmentation method according to claim 7, wherein the training of the deep semantic segmentation network and the discriminator network specifically comprises:
when labeled data are used, the multi-class cross-entropy loss L_ce is obtained as follows:
the discriminator network is trained through L_D:
when x_n = 1, the pixel is generated by the generator; if y_n = 1, the sample comes from the annotation image; D(G(X_n))^(h,w) is the feature of X_n at position (h, w); D(Y_n)^(h,w) is the feature of Y_n at position (h, w); if the pixel belongs to a given class, the corresponding entry of Y_n^(h,w) is 1, otherwise it is 0;
adversarial learning with the trained discriminator is carried out through the loss L_A:
when training with unlabeled data, only L_A is applicable, and for unlabeled data the confidence map D(G(X_n))^(h,w) is generated by the trained discriminator network;
the one-hot target is set element by element: if c* = argmax_c G(X_n)^(h,w,c), the entry for class c* is set to 1; a threshold U_S is set to highlight the confident regions of the confidence map; L_S is defined as follows:
I(·) is an indicator function, and its sensitivity is controlled by the value of U_S so as to adjust the training process of the network.
9. A semi-supervised remote sensing image semantic segmentation device is characterized by comprising:
the image acquisition module is used for acquiring an original remote sensing image;
the multi-scale attention feature map determining module is used for scaling the original remote sensing image into 3 scaled images with different sizes; respectively inputting the 3 scaled images into 3 criss-cross attention modules to obtain 3 attention feature maps; performing fusion processing on the 3 attention feature maps to obtain a multi-scale attention feature map;
and the semantic segmentation prediction map determining module is used for inputting the multi-scale attention feature map into the deep semantic segmentation network to obtain a semantic segmentation prediction map.
10. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the method of any of claims 1-8.
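The Affinity, SoftMax and Aggregation operations of claims 3 and 4 can be sketched in pure Python. The sketch assumes the standard criss-cross attention forms d_{i,u} = Q_u · Ω_{i,u} and M′_u = Σ_i A_{i,u} Φ_{i,u} + M_u, which are not reproduced in this text. It takes Q, K, V and M as already-computed nested lists indexed `[row][column][channel]` and omits the 1 × 1 convolutions; all names are hypothetical.

```python
import math

def criss_cross_positions(r, c, height, width):
    """Positions sharing a row or a column with (r, c): H + W - 1 in total."""
    same_col = [(i, c) for i in range(height)]
    same_row = [(r, j) for j in range(width) if j != c]
    return same_col + same_row

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def criss_cross_attention(Q, K, V, M):
    height, width = len(Q), len(Q[0])
    out = []
    for r in range(height):
        out_row = []
        for c in range(width):
            pos = criss_cross_positions(r, c, height, width)
            # Affinity: correlation between Q_u and every vector in Omega_u.
            d = [sum(q * k for q, k in zip(Q[r][c], K[i][j])) for i, j in pos]
            a = softmax(d)  # SoftMax over the H + W - 1 positions
            # Aggregation: weighted sum over Phi_u plus the input feature M_u.
            out_row.append([
                sum(wt * V[i][j][ch] for wt, (i, j) in zip(a, pos)) + M[r][c][ch]
                for ch in range(len(M[r][c]))
            ])
        out.append(out_row)
    return out

# Toy 2 x 3 feature maps: zero queries give uniform attention weights.
H, W = 2, 3
Q = [[[0.0, 0.0] for _ in range(W)] for _ in range(H)]
K = [[[1.0, -1.0] for _ in range(W)] for _ in range(H)]
V = [[[1.0, 2.0] for _ in range(W)] for _ in range(H)]
M = [[[0.5, 0.5] for _ in range(W)] for _ in range(H)]
out = criss_cross_attention(Q, K, V, M)
```

Each position attends to only H + W − 1 others instead of all H × W, which is what keeps this form of attention cheaper than full non-local attention.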
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110686544.8A CN113298815A (en) | 2021-06-21 | 2021-06-21 | Semi-supervised remote sensing image semantic segmentation method and device and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110686544.8A CN113298815A (en) | 2021-06-21 | 2021-06-21 | Semi-supervised remote sensing image semantic segmentation method and device and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113298815A (en) | 2021-08-24 |
Family
ID=77329003
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110686544.8A Withdrawn CN113298815A (en) | 2021-06-21 | 2021-06-21 | Semi-supervised remote sensing image semantic segmentation method and device and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113298815A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113780296A (en) * | 2021-09-13 | 2021-12-10 | 山东大学 | Remote sensing image semantic segmentation method and system based on multi-scale information fusion |
CN113989585A (en) * | 2021-10-13 | 2022-01-28 | 北京科技大学 | Medium-thickness plate surface defect detection method based on multi-feature fusion semantic segmentation |
CN113989662A (en) * | 2021-10-18 | 2022-01-28 | 中国电子科技集团公司第五十二研究所 | Remote sensing image fine-grained target identification method based on self-supervision mechanism |
CN114022762A (en) * | 2021-10-26 | 2022-02-08 | 三峡大学 | Unsupervised domain self-adaption method for extracting area of crop planting area |
CN114972293A (en) * | 2022-06-14 | 2022-08-30 | 深圳市大数据研究院 | Video polyp segmentation method and device based on semi-supervised spatio-temporal attention network |
CN115222629A (en) * | 2022-08-08 | 2022-10-21 | 西南交通大学 | Single remote sensing image cloud removing method based on cloud thickness estimation and deep learning |
CN115375677A (en) * | 2022-10-24 | 2022-11-22 | 山东省计算中心(国家超级计算济南中心) | Wine bottle defect detection method and system based on multi-path and multi-scale feature fusion |
CN115496732A (en) * | 2022-09-26 | 2022-12-20 | 电子科技大学 | Semi-supervised heart semantic segmentation algorithm |
CN116129117A (en) * | 2023-02-03 | 2023-05-16 | 中国人民解放军海军工程大学 | Sonar small target semi-supervised semantic segmentation method and system based on multi-head attention |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113780296A (en) * | 2021-09-13 | 2021-12-10 | 山东大学 | Remote sensing image semantic segmentation method and system based on multi-scale information fusion |
CN113780296B (en) * | 2021-09-13 | 2024-02-02 | 山东大学 | Remote sensing image semantic segmentation method and system based on multi-scale information fusion |
CN113989585A (en) * | 2021-10-13 | 2022-01-28 | 北京科技大学 | Medium-thickness plate surface defect detection method based on multi-feature fusion semantic segmentation |
CN113989585B (en) * | 2021-10-13 | 2022-08-26 | 北京科技大学 | Medium-thickness plate surface defect detection method based on multi-feature fusion semantic segmentation |
CN113989662A (en) * | 2021-10-18 | 2022-01-28 | 中国电子科技集团公司第五十二研究所 | Remote sensing image fine-grained target identification method based on self-supervision mechanism |
CN114022762A (en) * | 2021-10-26 | 2022-02-08 | 三峡大学 | Unsupervised domain self-adaption method for extracting area of crop planting area |
CN114972293A (en) * | 2022-06-14 | 2022-08-30 | 深圳市大数据研究院 | Video polyp segmentation method and device based on semi-supervised spatio-temporal attention network |
CN115222629A (en) * | 2022-08-08 | 2022-10-21 | 西南交通大学 | Single remote sensing image cloud removing method based on cloud thickness estimation and deep learning |
CN115496732A (en) * | 2022-09-26 | 2022-12-20 | 电子科技大学 | Semi-supervised heart semantic segmentation algorithm |
CN115496732B (en) * | 2022-09-26 | 2024-03-15 | 电子科技大学 | Semi-supervised heart semantic segmentation algorithm |
CN115375677A (en) * | 2022-10-24 | 2022-11-22 | 山东省计算中心(国家超级计算济南中心) | Wine bottle defect detection method and system based on multi-path and multi-scale feature fusion |
CN116129117A (en) * | 2023-02-03 | 2023-05-16 | 中国人民解放军海军工程大学 | Sonar small target semi-supervised semantic segmentation method and system based on multi-head attention |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113298815A (en) | Semi-supervised remote sensing image semantic segmentation method and device and computer equipment | |
US11200424B2 (en) | Space-time memory network for locating target object in video content | |
CN111931684B (en) | Weak and small target detection method based on video satellite data identification features | |
CN111259786B (en) | Pedestrian re-identification method based on synchronous enhancement of appearance and motion information of video | |
CN111612008B (en) | Image segmentation method based on convolution network | |
CN111860235B (en) | Method and system for generating high-low-level feature fused attention remote sensing image description | |
CN112597941B (en) | Face recognition method and device and electronic equipment | |
CN113780149B (en) | Remote sensing image building target efficient extraction method based on attention mechanism | |
CN111369581A (en) | Image processing method, device, equipment and storage medium | |
CN113591968A (en) | Infrared weak and small target detection method based on asymmetric attention feature fusion | |
CN111738055B (en) | Multi-category text detection system and bill form detection method based on same | |
CN113609896A (en) | Object-level remote sensing change detection method and system based on dual-correlation attention | |
Chen et al. | Corse-to-fine road extraction based on local Dirichlet mixture models and multiscale-high-order deep learning | |
CN112150493A (en) | Semantic guidance-based screen area detection method in natural scene | |
CN113111716B (en) | Remote sensing image semiautomatic labeling method and device based on deep learning | |
CN111723660A (en) | Detection method for long ground target detection network | |
CN113505670A (en) | Remote sensing image weak supervision building extraction method based on multi-scale CAM and super-pixels | |
CN115410081A (en) | Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium | |
CN116596966A (en) | Segmentation and tracking method based on attention and feature fusion | |
Malav et al. | DHSGAN: An end to end dehazing network for fog and smoke | |
CN115797929A (en) | Small farmland image segmentation method and device based on double-attention machine system | |
CN115035599A (en) | Armed personnel identification method and armed personnel identification system integrating equipment and behavior characteristics | |
CN114550014A (en) | Road segmentation method and computer device | |
CN113066089B (en) | Real-time image semantic segmentation method based on attention guide mechanism | |
CN114494699A (en) | Image semantic segmentation method and system based on semantic propagation and foreground and background perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20210824 |