CN113160234B - Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation - Google Patents
Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation
Info
- Publication number: CN113160234B (application number CN202110530385.2A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/10 — Segmentation; Edge detection
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T2207/10032 — Satellite or aerial image; Remote sensing
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20112 — Image segmentation details
- G06T2207/20132 — Image cropping
Abstract
The invention relates to an unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation, belonging to the technical field of remote sensing image semantic segmentation methods. The technical problem to be solved is to provide an improvement of the unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation. The technical scheme comprises the following steps: acquiring a source-domain low-resolution remote sensing image data set and a target-domain high-resolution remote sensing image data set, and dividing the acquired target-domain image data set into training images and test images according to a set proportion; building a remote sensing image semantic segmentation network and a super-resolution network; pre-training the built super-resolution network and optimizing its parameters; training the remote sensing image semantic segmentation network; inputting the preprocessed test-set data into the trained remote sensing image semantic segmentation network and outputting an accurate segmentation result. The invention is applied to remote sensing image processing.
Description
Technical Field
The invention relates to an improved unsupervised high-resolution remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation, and belongs to the technical field of remote sensing image semantic segmentation methods.
Background
In recent years, with the continuous progress and wide application of high-resolution earth observation technology, the spatial resolution of remote sensing data has steadily improved and the volume of such data has accumulated geometrically, so automatically, quickly and accurately extracting high-value geographic information from high-resolution remote sensing images has become one of the important problems urgently needing to be solved. Semantic segmentation, which labels each pixel in an image as a specific ground-feature type and is also called ground-feature extraction or land classification, is one of the main means of information extraction from high-resolution remote sensing images, and is widely applied in land planning, environmental monitoring, disaster assessment and other fields.
Deep neural networks can automatically extract semantic information at every level from an image and have strong feature-expression capability, so they have achieved great success in image semantic segmentation. However, the excellent performance of these deep-learning-based semantic segmentation methods relies on remote sensing labels annotated at the level of millions of pixels. Because manual annotation of high-resolution remote sensing images is time-consuming, labor-intensive and requires considerable professional knowledge, current semantic segmentation models in this field rely only on small-scale training sets acquired in a specific period, from a few limited regions and with specific remote sensing sensors. This limits the generalization performance of the models: segmentation accuracy drops sharply when a model is applied to a different region or sensor, a phenomenon known as domain shift. To overcome domain shift and fully utilize existing data sets, unsupervised domain adaptation methods accomplish the semantic segmentation task on an unlabeled target-domain data set by transferring knowledge learned on a source-domain data set; in particular, domain adaptation methods based on generative adversarial networks learn domain-invariant features through the competition between a generator and a discriminator, which can effectively reduce inter-domain differences.
Unlike most current high-resolution remote sensing image domain adaptation methods, which only consider the style difference, i.e. the spectral difference, between different sensors, the unsupervised semantic segmentation method based on Super-Resolution Domain Adaptation (SRDA) notes that remote sensing images obtained by different sensors also differ in resolution, and that different types of ground objects appear at different sizes in remote sensing images; it realizes unsupervised domain-adaptive semantic segmentation through a multi-task, multi-scale generative adversarial network that jointly performs super-resolution and semantic segmentation and thereby learns spectral and scale invariance simultaneously. That model uses an Atrous Spatial Pyramid Pooling (ASPP) module to extract multi-scale, scale-invariant features, but atrous (dilated) convolution is a sparse computation and may cause grid artifacts, while the spatial pyramid pooling module may lose pixel-level positioning information. In addition, semantic segmentation of high-resolution remote sensing images often faces a severe class imbalance problem.
Disclosure of Invention
In order to overcome the defects in the prior art, the technical problem the invention aims to solve is to provide an improved unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation.
In order to solve the technical problems, the invention adopts the technical scheme that: the unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation comprises the following steps:
the method comprises the following steps: acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set, and dividing the acquired target domain image data set into a training image and a test image according to a set proportion;
forming a training set of image semantic segmentation by using the source domain image data set and the target domain training image, and forming a test set of image semantic segmentation by using the test image of the target domain;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: building a remote sensing image semantic segmentation network, wherein the super-resolution remote sensing image semantic segmentation network comprises a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module, and the feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image;
step three: performing network pre-training and parameter optimization on the super-resolution network built in the step two;
step four: training a semantic segmentation network of the remote sensing image;
step five: and inputting the preprocessed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image.
And acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set in the first step through a remote sensing satellite, wherein the source domain low-resolution remote sensing image data set comprises a low-resolution original image and a label data image which is artificially marked, and the target domain high-resolution remote sensing image data set comprises a high-resolution original image.
Preprocessing the source domain image data set and the target domain training set image remote sensing image data in the first step specifically comprises image cutting, image sampling and data enhancement of original images in a training set;
the image clipping specifically comprises the following steps: cutting the original image of the source domain and the label data image into images with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel; cutting the target domain training set and the test set image into images with 512 pixels multiplied by 512 pixels and 0.5 meter per pixel resolution;
the image sampling specifically comprises: up-sampling the original image of the source domain and the label data image to obtain a high-resolution image with 512 pixels multiplied by 512 pixels and the resolution of 0.5 meter per pixel; down-sampling the target domain image to obtain a low-resolution image with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel;
the data enhancement comprises: and carrying out image rotation, image vertical and horizontal overturning and image size adjustment on the images in the semantic segmentation training set of the remote sensing images.
The second step of building the super-resolution remote sensing image semantic segmentation network comprises the following steps:
step 2.1: inputting the image into a feature coding module to obtain multi-scale and multi-level image features: the feature coding module realizes the hierarchical feature extraction of the image from the bottom level detail feature to the high level semantic feature through convolution and maximum pooling operation, and specifically comprises the following steps: performing convolution and maximum pooling operation on the image for three times to extract bottom-layer features of the network, extracting multi-scale features of the image through a residual feature pyramid attention module, fusing the extracted multi-scale features of the image through a residual network module, and repeating the extraction and fusion of the multi-scale features for two times to finally obtain rich image features with multiple scales and multiple levels;
step 2.2: the image features extracted by the feature coding module in step 2.1 are processed by the super-resolution module to obtain a high-resolution image enlarged from the original image; the image generated by the super-resolution module is then input into the super-resolution domain discrimination module, which judges the domain to which the input image belongs in order to optimize the parameters of the super-resolution module and the feature coding module;
the feature coding module and the super-resolution module jointly form a super-resolution network of the remote sensing image, and the super-resolution network is combined with the super-resolution domain distinguishing module to jointly realize the super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and the super-resolution module trained in the step 2.2 are taken as the input of the semantic segmentation module together: the features extracted by the feature coding module and part of the features in the image super-resolution module are input into a semantic segmentation module of the image together to realize semantic segmentation of the remote sensing image data;
the system comprises a feature extraction module, a super-resolution module, a semantic segmentation module and a semantic segmentation module, wherein the feature extraction module, the super-resolution module and the semantic segmentation module jointly form a semantic segmentation network of an image, a high-resolution remote sensing image and a probability map generated by the semantic segmentation module are spliced and then input into the semantic segmentation domain discrimination module to optimize the semantic segmentation network of the image, and finally, the segmentation function of the remote sensing image is realized.
The third step specifically comprises:
step 3.1: carrying out parameter random initialization on the feature coding module and the super-resolution module, inputting the training set data preprocessed in the step one into the remote sensing image super-resolution network in the step two to generate a high-resolution image, and calculating super-resolution loss;
inputting the generated high-resolution image and the original high-resolution image into a super-resolution domain discrimination module, discriminating the domain of the input image, and calculating the discrimination loss of the super-resolution domain;
step 3.2: loss reverse propagation is performed, parameters of a super-resolution network and a super-resolution domain discrimination module are alternately optimized, and super resolution of the low-resolution image is finally achieved;
step 3.3: and after the training is finished, storing the parameters of the trained feature coding module, the super-resolution module and the super-resolution domain distinguishing module.
The fourth step specifically comprises:
step 4.1: initializing a feature coding module, a super-resolution module and a super-resolution domain distinguishing module in the semantic segmentation network by using the model parameters saved in the third step, simultaneously performing random initialization on the parameters of the semantic segmentation module and the semantic segmentation domain distinguishing module, inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network in the second step, generating a semantic segmentation probability map of the remote sensing image, and calculating semantic segmentation loss;
splicing the high-resolution remote sensing image and the semantic segmentation probability map, and inputting the spliced high-resolution remote sensing image and the semantic segmentation probability map into a semantic segmentation domain discrimination module to realize discrimination of a domain to which the semantic segmentation network generation probability map belongs and calculate discrimination loss of the semantic segmentation domain;
step 4.2: loss back propagation, namely alternately optimizing parameters of the semantic segmentation network and the two domain discrimination modules, and finally finishing the optimization of the parameters of the semantic segmentation network by taking the minimization of a loss function as an optimization target;
step 4.3: and after the training is finished, storing the trained semantic segmentation network model parameters.
The network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a maximum pooling layer: the largest pooling layer with the step length of 2 is arranged behind the convolution layer;
the feature coding module is provided with two repeated residual feature pyramid attention modules and a residual module for realizing multi-scale feature fusion behind the maximum pooling layer;
the residual error feature pyramid attention module is composed of three continuous feature pyramid attention module networks containing residual error connection, the feature pyramid attention module is divided into two paths, the first path adopts global pooling for input features, convolution with convolution kernel of 1 × 1 and an upper sampling layer to achieve feature transfer of the network, the second path adopts a U-shaped network structure to achieve multi-layer feature extraction, convolution operation with step length of 2 is conducted on the features three times to obtain feature maps with different sizes of input feature sizes 1/2, 1/4 and 1/8, the convolution kernels of the convolution three times are respectively 7 × 7, 5 × 5 and 3 × 3 in size, then the feature map with size of 1/8 is sampled and overlapped with the feature map with size of 1/4, the steps are repeated twice, and finally the output feature map and the feature map which is subjected to convolution with size of 1 × 1 are multiplied pixel by pixel to obtain the feature map with the same size as the input feature map Figure representation; finally, overlapping the characteristic diagrams of the two paths to obtain a multi-scale characteristic diagram;
and the two convolution operations in the residual error module realize the characteristic channel fusion by the convolution with the step length of 1 and the convolution kernel of 3 multiplied by 3, and the residual error module is internally provided with a short circuit connection for accelerating the network convergence.
The super-resolution module halves the number of output channels through a decoder module while gradually restoring the image size; the decoder module comprises a convolution layer with a 1 × 1 kernel and stride 1 and a deconvolution layer with a 3 × 3 kernel and stride 2, which control the number of output channels and the increase of image resolution; the super-resolution image is finally obtained through three consecutive deconvolution (decoder) modules;
the semantic segmentation module gradually restores the resolution of the image through the decoder module, and simultaneously cascades the decoder module and the image with the same resolution restored by the super-resolution as the input of the next decoder module, and finally realizes the semantic segmentation of the network through the two decoder modules;
the super-resolution domain distinguishing module consists of five convolution layers with convolution kernel size of 3 multiplied by 3 and step length of 2, a residual error feature pyramid attention module and a sigmoid activation layer, wherein feature extraction of a high-resolution image is realized through the convolution layers, feature extraction and integration are performed on a network through the feature pyramid attention module, and finally a final super-resolution domain label feature map is obtained through the sigmoid activation layer;
the semantic segmentation domain distinguishing module consists of five convolution layers with convolution kernel size of 3 multiplied by 3 and step length of 2, a residual error feature pyramid attention module and a sigmoid activation layer, the feature extraction of a semantic segmentation image and a probability map is realized through the convolution layers, and finally, the final semantic segmentation domain label feature map is obtained through the sigmoid activation layer.
In the third step, the data sets used in training the remote sensing image super-resolution network are the down-sampled low-resolution target-domain data, the original high-resolution target-domain data, and the low-resolution source-domain data;
in the third step, the loss function used in training the remote sensing image super-resolution network is the mean-square loss function, calculated as:

L_mse = (1/N) Σ_{i=1..N} (X_i − Y_i)²

in the above formula: X is the super-resolution image generated by the super-resolution network, Y is the real high-resolution image, and N is the number of image pixels;
the loss function of the super-resolution domain discrimination module in the third step is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
Ldsr=ΕS[(Is-1)2]+ΕT[(It)2]
Ldsrinv=ΕS[(Is)2]+ΕT[(It-1)2];
in the above formula: l isdsrFor loss of the super-resolution domain discrimination module Dsr in training the generator, LdsrinvFor loss of the super-resolution domain discrimination module Dsr when training the discriminator network, IsSuper-resolution maps generated for source domain low resolution images, ItHigh resolution image generated for down-sampling a low resolution image in the target domain, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the third step, when a super-resolution network E-SR consisting of the characteristic extraction module E and the super-resolution module SR is trained, the loss function L is minimizedGTo optimize the parameter theta of the super-resolution network discriminator sectionE-SR;
When the super-resolution domain discrimination module Dsr is trained, the loss function L is minimizedDTo optimize the network parameter theta of the domain discriminator part of the super-resolution domain discrimination moduleDsr;
The super-resolution of the image is realized through the alternative confrontation training of a super-resolution network and a super-resolution domain discrimination module;
the loss function of the super-resolution network in the training process is as follows:
the loss function of the super-resolution domain discrimination module in the training process is as follows:
in the fourth step, the loss function used in training the remote sensing image semantic segmentation network combines a Dice coefficient loss function and a cross-entropy loss function. The cross-entropy loss is calculated as:

L_ce = −(1/N) Σ_{i=1..N} y_i log y'_i

in the above formula: y is the real label map, y' is the predicted label map, and N is the number of image pixels;
the Dice coefficient loss is calculated as:

L_dice = 1 − (1/K) Σ_{k=1..K} 2|X_k ∩ Y_k| / (|X_k| + |Y_k|)

in the above formula: X is the generated prediction-label probability map, Y is the real label map, |X ∩ Y| is the intersection between the real label map and the predicted label map, |X| is the number of elements of the predicted label map, |Y| is the number of elements of the real label map, and K is the number of label categories;
the loss function of the semantic partition domain judging module in the fourth step is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
Lds=ΕS[(Ls)2]+ΕT[(Lt-1)2]
Ldsinv=ΕS[(Ls-1)2]+ΕT[(Lt)2];
in the above formula: l isdsFor loss of the semantic Domain discriminant Module Ds when training the generators, LdsinvFor loss of the semantic Domain partition discrimination Module Ds in training the discriminator, LsSemantic segmentation domain discrimination Module Label map, L, generated for Source Domain imagestSemantic segmentation domain discrimination Module tag map generated for target Domain images, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the fourth step, when a semantic segmentation network E-SR-S consisting of the feature extraction module E, the semantic segmentation module S and the super-resolution module SR is trained, a loss function L is minimizedGTo optimize the parameter theta of a semantic segmentation networkE-SR-SWherein the loss function LGThe sum of the cross entropy loss function, the Dice coefficient loss function, the super resolution loss and the loss of a super resolution domain discrimination module and a semantic segmentation domain discrimination module is obtained;
when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function L is minimizedDTo optimize the network parameters theta of the two domain discrimination modulesDsr;
The semantic segmentation of the image is realized through the alternative countermeasure training of a semantic segmentation network, a semantic segmentation domain judging module Dsr and a super-resolution domain judging module Ds;
the loss function of the semantic segmentation network in the training process is as follows:
loss functions of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module in the training process are as follows:
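The alternating optimisation schedule in step 4.2 can be sketched structurally; the updates themselves are stand-ins (no real gradient steps), so only the generator/discriminator alternation is shown:

```python
def alternating_schedule(iterations=3):
    # each iteration first minimises L_G to update the generator network
    # E-SR-S, then minimises L_D to update the two domain discrimination
    # modules Dsr and Ds; the strings are placeholders for the updates
    schedule = []
    for _ in range(iterations):
        schedule.append("generator: min L_G over theta_{E-SR-S}")
        schedule.append("discriminators: min L_D over theta_{Dsr}, theta_{Ds}")
    return schedule
```

Freezing one side while updating the other is the standard adversarial-training pattern; the patent's minimisation targets L_G and L_D correspond to the two halves of each iteration.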
compared with the prior art, the invention has the beneficial effects that:
1) the method of the invention uses a residual error feature pyramid attention module in a feature coding module, and ensures that more global information is extracted by extracting image features under different resolutions. The structure can adopt convolution kernels of different receptive fields to achieve feature acquisition of targets of different sizes, and can be combined with residual connection to avoid explosion and disappearance of gradients.
2) The problem of unbalanced variety in semantic segmentation of the high-resolution remote sensing image can be effectively solved.
3) The method adopts the characteristics of jump connection transmission from the super-resolution module when the image semantic segmentation network is built. In the image processing, the feature map transmitted from the super-resolution process through jump connection not only contains the position, edge and other detailed features of the target, but also contains a large amount of high-level semantic information; the method has good segmentation effect and strong robustness.
Drawings
The invention is further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of an embodiment of an image semantic segmentation network constructed in the method of the present invention;
FIG. 2 is a schematic diagram of a structure of an image semantic segmentation network constructed in the method of the present invention;
FIG. 3 is a schematic diagram of a super-resolution network in an image semantic segmentation network constructed in the method of the present invention;
FIG. 4 is a schematic diagram of a component structure of a feature coding module in an image semantic segmentation network constructed in the method of the present invention;
FIG. 5 is a schematic diagram of a structure of a residual feature pyramid attention module in the image semantic segmentation network constructed in the method of the present invention;
FIG. 6 is a schematic diagram of a structure of a feature pyramid attention module in an image semantic segmentation network constructed in the method of the present invention;
FIG. 7 is a schematic diagram of the structure of the residual module in the image semantic segmentation network constructed by the method of the present invention;
FIG. 8 is a schematic diagram of a super-resolution domain discrimination module in the image semantic segmentation network constructed in the method of the present invention;
FIG. 9 is a schematic diagram of the structure of the semantic segmentation domain discrimination module in the image semantic segmentation network constructed by the method of the present invention.
Detailed Description
As shown in FIGS. 1 to 9, in this unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation, a feature pyramid attention module replaces the ASPP module of the original super-resolution domain-adaptive method, a residual feature pyramid attention module is applied in the discriminators to obtain accurate pixel-level attention over high-level semantic features, and the class-imbalance problem is alleviated through a Dice coefficient loss function. The method comprises the following steps:
Step one: obtain a source-domain low-resolution remote sensing image data set and a target-domain high-resolution remote sensing image data set, both acquired by remote sensing satellites. The source-domain data comprise low-resolution original images and manually annotated label images, while the target-domain data comprise only high-resolution original images. The acquired target-domain image data set is divided into training images and test images in a set proportion; the source-domain data and the target-domain training images form the training set for image semantic segmentation, and the target-domain test images form the test set;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: build the semantic segmentation network of the remote sensing image. The network comprises a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module, and the construction steps are as follows:
step 2.1: input the image into the feature coding module to obtain multi-scale, multi-level image features. The feature coding module performs hierarchical feature extraction from low-level detail features to high-level semantic features through convolution and max-pooling operations: the image first passes through three rounds of convolution and a max-pooling operation to extract the low-level features of the network, then a residual feature pyramid attention module extracts multi-scale image features and a residual network module fuses them, and this extraction and fusion of multi-scale features is repeated twice, finally yielding rich multi-scale, multi-level image features;
step 2.2: the image features extracted by the feature coding module in step 2.1 pass through the super-resolution module to obtain a high-resolution image with an enlarged size relative to the original image; the image generated by the super-resolution module is then input into the super-resolution domain discrimination module, which discriminates the domain to which the input image belongs so as to optimize the parameters of the super-resolution module and the feature coding module. The feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image, which, combined with the super-resolution domain discrimination module, realizes super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and by the super-resolution module trained in step 2.2 together serve as the input of the semantic segmentation module: the features extracted by the feature coding module and part of the features in the image super-resolution module are input into the semantic segmentation module to realize semantic segmentation of the remote sensing image data. The feature extraction module, the super-resolution module and the semantic segmentation module jointly form the semantic segmentation network of the image. The high-resolution remote sensing image and the probability map generated by the semantic segmentation module are concatenated and input into the semantic segmentation domain discrimination module to optimize the image semantic segmentation network, finally realizing the segmentation of the remote sensing image.
Step three: pre-training and parameter optimization of the super-resolution network:
step 3.1: carrying out parameter random initialization on the feature coding module and the super-resolution module, inputting the training set data preprocessed in the step one into the remote sensing image super-resolution network in the step two to generate a high-resolution image, and calculating super-resolution loss; and inputting the generated high-resolution image and the original high-resolution image into a super-resolution domain discrimination module to realize discrimination of the domain to which the input image belongs and calculate the discrimination loss of the super-resolution domain.
Step 3.2: back-propagate the losses and alternately optimize the parameters of the super-resolution network and the super-resolution domain discrimination module, finally realizing super-resolution of the low-resolution image.
Step 3.3: after training is finished, storing the parameters of the trained feature coding module, the super-resolution module and the super-resolution domain distinguishing module;
step four: training a semantic segmentation model of the remote sensing image:
step 4.1: initializing a feature coding module, a super-resolution module and a super-resolution domain distinguishing module in the semantic segmentation network by using the model parameters saved in the third step, simultaneously performing random initialization on the parameters of the semantic segmentation module and the semantic segmentation domain distinguishing module, inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network in the second step, generating a semantic segmentation probability map of the remote sensing image, and calculating semantic segmentation loss; and splicing the high-resolution remote sensing image and the semantic segmentation probability map, and inputting the spliced high-resolution remote sensing image and the semantic segmentation probability map into a semantic segmentation domain discrimination module to realize discrimination of the domain to which the semantic segmentation network generation probability map belongs and calculate the discrimination loss of the semantic segmentation domain.
Step 4.2: back-propagate the losses and alternately optimize the parameters of the semantic segmentation network and the two domain discrimination modules, finally completing the optimization of the semantic segmentation network parameters with minimization of the loss function as the optimization target.
Step 4.3: after training is finished, storing the trained semantic segmentation network model parameters;
step five: and inputting the processed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image.
The preprocessing of the remote sensing image training set data in step one comprises image cropping, image resampling and data enhancement of the original training images;
the image cropping specifically comprises: cropping the source-domain training images and labels into 256 × 256-pixel images with a resolution of 1 meter per pixel, and cropping the target-domain training and test images into 512 × 512-pixel images with a resolution of 0.5 meter per pixel;
the image resampling specifically comprises: upsampling the source-domain images and labels to obtain 512 × 512-pixel high-resolution images with a resolution of 0.5 meter per pixel, and downsampling the target-domain images to obtain 256 × 256-pixel low-resolution images with a resolution of 1 meter per pixel.
The data enhancement comprises: image rotation, vertical and horizontal flipping, and image resizing of the images in the remote sensing image semantic segmentation training set.
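As a sanity check on the cropping and resampling figures above, the pixel bookkeeping follows from keeping the ground extent of a tile fixed while changing the resolution. A minimal sketch (the helper name is hypothetical):

```python
def output_pixels(pixels: int, gsd_m: float, target_gsd_m: float) -> int:
    """Number of pixels after resampling a tile of `pixels` width at
    `gsd_m` meters/pixel to `target_gsd_m` meters/pixel, keeping the
    same ground extent."""
    extent_m = pixels * gsd_m  # ground extent covered by the tile
    return round(extent_m / target_gsd_m)

# Source domain: 256 px at 1 m/px upsampled to 0.5 m/px gives 512 px.
assert output_pixels(256, 1.0, 0.5) == 512
# Target domain: 512 px at 0.5 m/px downsampled to 1 m/px gives 256 px.
assert output_pixels(512, 0.5, 1.0) == 256
```

This confirms the source and target tiles cover the same 256-meter ground extent at both resolutions, which is what makes the two domains comparable after resampling.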
The network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a maximum pooling layer: the largest pooling layer with the step length of 2 is arranged behind the convolution layer;
behind the max-pooling layer, the feature coding module is provided with two repeated residual feature pyramid attention modules and a residual module for multi-scale feature fusion. The residual feature pyramid attention module consists of three consecutive feature pyramid attention modules with residual connections. The feature pyramid attention module is divided into two paths: the first path applies global pooling, a convolution with a 1 × 1 kernel and an upsampling layer to the input features to realize feature transfer of the network; the second path adopts a U-shaped network structure for multi-layer feature extraction, performing three convolutions with a stride of 2 to obtain feature maps of 1/2, 1/4 and 1/8 of the input feature size, with kernels of 7 × 7, 5 × 5 and 3 × 3 respectively. The 1/8-size feature map is upsampled and superimposed on the 1/4-size feature map, this step is repeated twice, and the output feature map is then multiplied pixel by pixel with the feature map produced by the 1 × 1 convolution to obtain a feature map of the same size as the input; finally, the feature maps of the two paths are superimposed to obtain the multi-scale feature map. The two convolution operations in the residual module realize feature channel fusion through convolutions with a stride of 1 and a 3 × 3 kernel, and the residual module contains a shortcut connection to accelerate network convergence.
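The 1/2, 1/4 and 1/8 feature-map sizes produced by the three stride-2 convolutions in the U-shaped branch follow from the standard convolution output-size formula. A small sketch, assuming "same"-style padding p = k // 2 (the patent does not state the padding):

```python
def conv_out(n: int, k: int, s: int, p: int) -> int:
    """Output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# Three stride-2 convolutions with 7x7, 5x5 and 3x3 kernels on a 64-pixel map.
n, sizes = 64, []
for k in (7, 5, 3):
    n = conv_out(n, k, s=2, p=k // 2)
    sizes.append(n)
# sizes is [32, 16, 8], i.e. 1/2, 1/4 and 1/8 of the input size.
```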
The super-resolution module halves the number of output channels through each decoder module while gradually restoring the image size. The decoder module comprises a convolution layer with a 1 × 1 kernel and a stride of 1 and a deconvolution layer with a 3 × 3 kernel and a stride of 2, which control the number of output channels and the increase in image resolution; the super-resolution image is finally obtained through three consecutive deconvolution modules.
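The gradual size restoration by the 3 × 3, stride-2 deconvolution layers can be sketched with the standard transposed-convolution size formula. The padding and output-padding values below are assumptions chosen so that each layer exactly doubles the spatial size; the patent does not specify them:

```python
def deconv_out(n: int, k: int = 3, s: int = 2, p: int = 1, op: int = 1) -> int:
    """Output size of a transposed convolution: (n - 1)*s - 2p + k + op."""
    return (n - 1) * s - 2 * p + k + op

# Three consecutive deconvolution (decoder) modules: 64 -> 128 -> 256 -> 512.
n = 64
for _ in range(3):
    n = deconv_out(n)
```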
The semantic segmentation module gradually restores the image resolution through decoder modules, concatenating each decoder output with the super-resolution feature map of the same resolution as the input of the next decoder module; semantic segmentation of the network is finally realized through two decoder modules.
The super-resolution domain distinguishing module comprises five convolution layers with the convolution kernel size of 3 multiplied by 3 and the step length of 2, a residual feature pyramid attention module and a sigmoid activation layer, wherein feature extraction of a high-resolution image is achieved through the convolution layers, then feature extraction and integration are conducted on a network through the residual feature pyramid attention module, and finally a final super-resolution domain label feature map is obtained through the sigmoid activation layer.
The semantic segmentation domain distinguishing module comprises five convolution layers with the convolution kernel size of 3 multiplied by 3 and the step length of 2, a residual error feature pyramid attention module and a sigmoid activation layer, the feature extraction of a semantic segmentation image and a probability map is realized through the convolution layers, then the feature extraction and integration are carried out on the network through the residual error feature pyramid attention module, and finally the final semantic segmentation domain label feature map is obtained through the sigmoid activation layer.
In step three, the data used to train the remote sensing image super-resolution network are the down-sampled low-resolution target-domain data, the initial high-resolution target-domain data and the low-resolution source-domain data; the loss function is the mean-square loss, whose calculation formula is:
L_sr = (1/N) Σ_{i=1}^{N} (X_i - Y_i)^2 (1)
In the above formula: X is the super-resolution image generated by the super-resolution network, Y is the real high-resolution image, and N is the number of image pixels;
the loss function of the super-resolution domain discrimination module in step three is the mean-square loss, whose calculation formulas are
L_dsr = E_S[(I_s - 1)^2] + E_T[(I_t)^2] (2)
L_dsrinv = E_S[(I_s)^2] + E_T[(I_t - 1)^2] (3)
In the above formulas: I_s is the super-resolution image generated from the source-domain low-resolution image, and I_t is the high-resolution image generated from the down-sampled target-domain low-resolution image; L_dsr is the loss of the super-resolution domain discrimination module Dsr when training the generator, and L_dsrinv is the loss of the module Dsr when training the discriminator network;
the adversarial training of the super-resolution network in step three alternately optimizes the super-resolution network and the super-resolution domain discrimination module. When training the super-resolution network E-SR formed by the feature extraction module E and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters of the super-resolution network; when training the super-resolution domain discrimination module Dsr, the loss function L_D is minimized to optimize the network parameters of the domain discriminator. Super-resolution of the image is realized through alternating adversarial training of the super-resolution network and the super-resolution domain discrimination module. The loss function of the network during training is:
in step four, the loss function used to train the remote sensing image semantic segmentation network is the Dice coefficient loss combined with the cross-entropy loss. The calculation formula of the cross-entropy loss is:
L_ce = -(1/N) Σ_{i=1}^{N} Σ_{k=1}^{K} y_{i,k} log y'_{i,k}
In the above formula: y is the real label map, y' is the predicted label map, N is the number of image pixels, and K is the number of label categories;
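The pixel-averaged cross-entropy above can be sketched in pure Python over one-hot labels y and predicted class probabilities y'; a small epsilon guards against log(0):

```python
from math import log

def cross_entropy_loss(y_true, y_pred, eps=1e-12):
    """L_ce = -(1/N) sum_i sum_k y_{i,k} * log(y'_{i,k}).
    y_true: N x K one-hot labels; y_pred: N x K predicted probabilities."""
    total = 0.0
    for yi, pi in zip(y_true, y_pred):
        total -= sum(t * log(p + eps) for t, p in zip(yi, pi))
    return total / len(y_true)
```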
the calculation formula of the Dice coefficient loss is:
L_Dice = 1 - (1/K) Σ_{k=1}^{K} 2|X_k ∩ Y_k| / (|X_k| + |Y_k|)
In the above formula: X is the generated predicted-label probability map, Y is the real label map, |X ∩ Y| is the intersection of the real and predicted label maps, |X| is the number of elements of the predicted label map, |Y| is the number of elements of the real label map, and K is the number of label categories;
the loss function of the semantic segmentation domain discrimination module in step four is the mean-square loss, whose calculation formulas are
L_ds = E_S[(L_s)^2] + E_T[(L_t - 1)^2] (8)
L_dsinv = E_S[(L_s - 1)^2] + E_T[(L_t)^2] (9)
In the above formulas: L_s is the semantic segmentation domain discrimination label map generated from the source-domain image, and L_t is the semantic segmentation domain discrimination label map generated from the target-domain image; L_ds is the loss of the semantic segmentation domain discrimination module Ds when training the generator, and L_dsinv is the loss of the module Ds when training the discriminator;
the adversarial training of the semantic segmentation network in step four alternately optimizes the semantic segmentation network, the super-resolution domain discrimination module and the semantic segmentation domain discrimination module. When training the semantic segmentation network E-SR-S, formed by the feature extraction module E, the super-resolution module SR and the semantic segmentation module S, the loss function L_G is minimized to optimize the parameters of the semantic segmentation network, where L_G is the sum of the cross-entropy loss, the Dice coefficient loss, the super-resolution loss and the losses of the two domain discrimination modules; when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function L_D is minimized to optimize the network parameters of the two domain discrimination modules. Semantic segmentation of the image is realized through alternating adversarial training of the semantic segmentation network E-SR-S, the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds. The loss function of the network during training is:
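The composite generator objective L_G described above can be sketched as a simple weighted sum. The weighting coefficients below are illustrative assumptions; the patent states the terms but not how they are balanced:

```python
def generator_loss(l_ce, l_dice, l_sr, l_dsr, l_ds, w_sr=1.0, w_adv=0.01):
    """L_G = cross-entropy + Dice loss + super-resolution loss
    + the two domain-adversarial terms (weights are illustrative)."""
    return l_ce + l_dice + w_sr * l_sr + w_adv * (l_dsr + l_ds)
```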
remote sensing image data sets at different resolutions are divided into a training set and a test set in a set proportion; a deep remote sensing image semantic segmentation network combining super-resolution and domain self-adaptation is constructed, comprising a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module; the preprocessed training set data are input into the network, the remote sensing image super-resolution network and the image semantic segmentation network are trained in stages, and the network parameters are saved; the test set data are then input into the trained remote sensing image semantic segmentation network and the segmentation results of the test images are output.
The invention discloses an unsupervised remote sensing image semantic segmentation method combining super-resolution and domain self-adaptation. The images are divided into a test set and a training set, and the training images are preprocessed; a remote sensing image semantic segmentation network based on deep learning is then constructed, the training images are input to train the remote sensing super-resolution network and the semantic segmentation network in stages, and the model parameters are saved when the networks converge; finally, the test images are passed through the image semantic segmentation network to obtain the final prediction results. Compared with the prior art, the method realizes semantic segmentation of remote sensing images by adding a super-resolution module and a multi-scale feature extraction module, achieving a good segmentation effect and strong robustness.
It should be noted that, with regard to the specific structure of the present invention, the connection relationships between the modules adopted in the invention are determinate and realizable; except where specifically described in the embodiments, these connection relationships bring about the corresponding technical effects and solve the technical problem proposed by the invention without depending on the execution of corresponding software programs.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. An unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation, characterized by comprising the following steps:
the method comprises the following steps: acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set, and dividing the acquired target domain image data set into a training image and a test image according to a set proportion;
forming a training set of image semantic segmentation by using the source domain image data set and the target domain training image, and forming a test set of image semantic segmentation by using the test image of the target domain;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: building a remote sensing image semantic segmentation network, wherein the super-resolution remote sensing image semantic segmentation network comprises a feature coding module, a super-resolution domain distinguishing module, a semantic segmentation module and a semantic segmentation domain distinguishing module, and the feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image;
step three: performing network pre-training and parameter optimization on the super-resolution network built in the step two;
step four: training a semantic segmentation network of the remote sensing image;
step five: inputting the preprocessed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image;
the feature coding module is provided with two repeated residual feature pyramid attention modules and a residual module for realizing multi-scale feature fusion behind the maximum pooling layer;
the super-resolution domain distinguishing module is provided with a residual characteristic pyramid attention module;
the semantic division domain distinguishing module is provided with a residual error characteristic pyramid attention module;
the residual feature pyramid attention module consists of three consecutive feature pyramid attention modules with residual connections; the feature pyramid attention module is divided into two paths, the first path applying global pooling, a convolution with a 1 × 1 kernel and an upsampling layer to the input features to realize feature transfer of the network, and the second path adopting a U-shaped network structure to realize multi-layer feature extraction, performing three convolutions with a stride of 2 to obtain feature maps of 1/2, 1/4 and 1/8 of the input feature size, the kernels of the three convolutions being 7 × 7, 5 × 5 and 3 × 3 respectively; the feature map of size 1/8 is then upsampled and superimposed on the feature map of size 1/4, this step is repeated twice, and finally the output feature map is multiplied pixel by pixel with the feature map subjected to the 1 × 1 convolution to obtain a feature map of the same size as the input feature map; finally, the feature maps of the two paths are superimposed to obtain the multi-scale feature map;
the two convolution operations in the residual module realize feature channel fusion through convolutions with a stride of 1 and a 3 × 3 kernel, and the residual module contains a shortcut connection to accelerate network convergence.
2. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: and acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set in the first step through a remote sensing satellite, wherein the source domain low-resolution remote sensing image data set comprises a low-resolution original image and a label data image which is artificially marked, and the target domain high-resolution remote sensing image data set comprises a high-resolution original image.
3. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: preprocessing the source domain image data set and the target domain training set image remote sensing image data in the first step specifically comprises image cutting, image sampling and data enhancement of original images in a training set;
the image clipping specifically comprises the following steps: cutting the original image of the source domain and the label data image into images with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel; cutting the target domain training set and the test set image into images with 512 pixels multiplied by 512 pixels and 0.5 meter per pixel resolution;
the image sampling specifically comprises: up-sampling the original image of the source domain and the label data image to obtain a high-resolution image with 512 pixels multiplied by 512 pixels and the resolution of 0.5 meter per pixel; down-sampling the target domain image to obtain a low-resolution image with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel;
the data enhancement comprises: and carrying out image rotation, image vertical and horizontal overturning and image size adjustment on the images in the semantic segmentation training set of the remote sensing images.
4. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the second step of building the super-resolution remote sensing image semantic segmentation network comprises the following steps:
step 2.1: inputting the image into a feature coding module to obtain multi-scale and multi-level image features: the feature coding module realizes the hierarchical feature extraction of the image from the bottom level detail feature to the high level semantic feature through convolution and maximum pooling operation, and specifically comprises the following steps: performing convolution and maximum pooling operation on the image for three times to extract bottom-layer features of the network, extracting multi-scale features of the image through a residual feature pyramid attention module, fusing the extracted multi-scale features of the image through a residual network module, and repeating the extraction and fusion of the multi-scale features for two times to finally obtain rich image features with multiple scales and multiple levels;
step 2.2: the image features extracted by the feature coding module in step 2.1 are processed by the super-resolution module to obtain a high-resolution image with an enlarged size relative to the original image; the image generated by the super-resolution module is then input into the super-resolution domain discrimination module, and the domain to which the input image belongs is discriminated so as to optimize the parameters of the super-resolution module and the feature coding module;
the feature coding module and the super-resolution module jointly form a super-resolution network of the remote sensing image, and the super-resolution network is combined with the super-resolution domain distinguishing module to jointly realize the super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and the super-resolution module trained in the step 2.2 are taken as the input of the semantic segmentation module together: the features extracted by the feature coding module and part of the features in the image super-resolution module are input into a semantic segmentation module of the image together to realize semantic segmentation of the remote sensing image data;
the feature extraction module, the super-resolution module and the semantic segmentation module jointly form the semantic segmentation network of the image; the high-resolution remote sensing image and the probability map generated by the semantic segmentation module are concatenated and input into the semantic segmentation domain discrimination module to optimize the semantic segmentation network of the image, finally realizing the segmentation function for remote sensing images.
5. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the third step specifically comprises:
step 3.1: randomly initializing the parameters of the feature coding module and the super-resolution module, inputting the training set data preprocessed in the first step into the remote sensing image super-resolution network of the second step to generate high-resolution images, and calculating the super-resolution loss;
inputting the generated high-resolution image and the original high-resolution image into the super-resolution domain discrimination module, discriminating the domain to which each input image belongs, and calculating the super-resolution domain discrimination loss;
step 3.2: back-propagating the losses and alternately optimizing the parameters of the super-resolution network and of the super-resolution domain discrimination module, finally realizing super-resolution of the low-resolution images;
step 3.3: after training is finished, saving the parameters of the trained feature coding module, super-resolution module and super-resolution domain discrimination module.
6. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the fourth step specifically comprises:
step 4.1: initializing the feature coding module, the super-resolution module and the super-resolution domain discrimination module in the semantic segmentation network with the model parameters saved in the third step, while randomly initializing the parameters of the semantic segmentation module and the semantic segmentation domain discrimination module; inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network of the second step, generating the semantic segmentation probability map of the remote sensing image, and calculating the semantic segmentation loss;
concatenating the high-resolution remote sensing image with the semantic segmentation probability map and inputting the result into the semantic segmentation domain discrimination module, so as to discriminate the domain to which the probability map generated by the semantic segmentation network belongs and to calculate the semantic segmentation domain discrimination loss;
step 4.2: back-propagating the losses and alternately optimizing the parameters of the semantic segmentation network and of the two domain discrimination modules, with minimization of the loss function as the optimization target, finally completing the optimization of the semantic segmentation network parameters;
step 4.3: after training is finished, saving the trained semantic segmentation network model parameters.
7. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation according to claim 4, characterized in that: the network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a max-pooling layer: a max-pooling layer with stride 2 follows the convolution layers.
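The layer arithmetic of claim 7 can be checked with the standard output-size formula; a minimal sketch (the padding value of 1 and the 2 × 2 pooling window are assumptions, since the claim states only kernel sizes and strides):

```python
def conv2d_out(size: int, kernel: int, stride: int, padding: int) -> int:
    # standard convolution/pooling output-size formula:
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

size = 256                           # hypothetical input tile width
for _ in range(3):                   # layers 1-3: 3x3 convs, stride 1 (padding 1 assumed)
    size = conv2d_out(size, 3, 1, 1)
pooled = conv2d_out(size, 2, 2, 0)   # layer 4: max pooling, stride 2 (2x2 window assumed)
print(size, pooled)                  # 256 128
```

With these assumed paddings, the three convolutions preserve spatial size and each encoder stage halves it only at the pooling layer.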
8. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that:
the super-resolution module halves the number of output channels through a decoder module while gradually restoring the image size; the decoder module comprises a convolution layer with a 1 × 1 kernel and stride 1 and a deconvolution layer with a 3 × 3 kernel and stride 2, which control the number of output channels and the increase of image resolution respectively; the super-resolution image is finally obtained through three successive deconvolution modules;
the semantic segmentation module gradually restores the image resolution through decoder modules, concatenating each decoder output with the super-resolution feature map of the same resolution as the input of the next decoder module; the semantic segmentation output of the network is finally obtained through two decoder modules;
the super-resolution domain discrimination module consists of five convolution layers with 3 × 3 kernels and stride 2, a residual feature pyramid attention module and a sigmoid activation layer; feature extraction of the high-resolution image is realized through the convolution layers, the extracted features are integrated through the feature pyramid attention module, and the final super-resolution domain label feature map is obtained through the sigmoid activation layer;
the semantic segmentation domain discrimination module consists of five convolution layers with 3 × 3 kernels and stride 2, a residual feature pyramid attention module and a sigmoid activation layer; feature extraction of the semantically segmented image and the probability map is realized through the convolution layers, and the final semantic segmentation domain label feature map is obtained through the sigmoid activation layer.
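As a rough shape check for the decoder/deconvolution path of claim 8, a sketch using the transposed-convolution output-size formula (the claim fixes the 3 × 3 kernel and stride 2; the padding and output-padding values here are assumptions chosen so each deconvolution exactly doubles the spatial size while halving the channel count):

```python
def deconv2d_out(size: int, kernel: int = 3, stride: int = 2,
                 padding: int = 1, output_padding: int = 1) -> int:
    # transposed-convolution output-size formula:
    # (size - 1)*stride - 2*padding + kernel + output_padding
    return (size - 1) * stride - 2 * padding + kernel + output_padding

size, channels = 32, 256          # hypothetical encoder output
for _ in range(3):                # three successive deconvolution modules
    size = deconv2d_out(size)     # spatial size doubles: 32 -> 64 -> 128 -> 256
    channels //= 2                # output channels halved by each decoder module
print(size, channels)             # 256 32
```

Under these assumptions the three deconvolution modules together give an 8× spatial upscaling.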
9. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: in the third step, the data set used in training the remote sensing image super-resolution network consists of the low-resolution target domain data obtained by down-sampling, the initial high-resolution target domain data and the low-resolution source domain data;
in the third step, the loss function used in training the remote sensing image super-resolution network is the mean square loss function, which is calculated as:
Lmse = (1/N) ∑i (Xi − Yi)²;
in the above formula: X is the super-resolution image generated by the super-resolution network, Y is the real high-resolution image, and N is the number of image pixel points;
the loss function of the super-resolution domain discrimination module in the third step is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
Ldsr = ΕS[(Is − 1)²] + ΕT[(It)²]
Ldsrinv = ΕS[(Is)²] + ΕT[(It − 1)²];
in the above formula: l isdsrFor loss of the super-resolution domain discrimination module Dsr in training the generator, LdsrinvFor loss of the super-resolution domain discrimination module Dsr when training the discriminator network, IsSuper-resolution maps generated for source domain low resolution images, ItHigh resolution image generated for down-sampling a low resolution image in the target domain, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the third step, when training the super-resolution network E-SR consisting of the feature extraction module E and the super-resolution module SR, the loss function LG is minimized to optimize the parameters θE-SR of the super-resolution network;
when training the super-resolution domain discrimination module Dsr, the loss function LD is minimized to optimize the network parameters θDsr of the domain discriminator part;
the super-resolution of the image is realized through alternating adversarial training of the super-resolution network and the super-resolution domain discrimination module;
the loss function of the super-resolution network in the training process is as follows:
LG = Lmse + Ldsr;
the loss function of the super-resolution domain discrimination module in the training process is as follows:
LD = Ldsrinv;
10. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: in the fourth step, the loss functions used in training the remote sensing image semantic segmentation network are the Dice coefficient loss function and the cross entropy loss function used jointly, wherein the cross entropy loss function is calculated as:
Lce = −(1/N) ∑i yi log(y'i);
in the above formula: y is the real label map, y' is the predicted label map, and N is the number of pixel points of the image;
the Dice coefficient loss function is calculated as:
Ldice = 1 − (1/K) ∑k 2|Xk ∩ Yk| / (|Xk| + |Yk|);
in the above formula: X is the generated prediction label probability map, Y is the real label map, |X ∩ Y| is the intersection between the real label map and the prediction label map, |X| is the number of elements of the prediction label map, |Y| is the number of elements of the real label map, and K is the number of label categories;
the loss function of the semantic segmentation domain discrimination module in the fourth step is a mean square loss function, which is calculated as:
Lds = ΕS[(Ls)²] + ΕT[(Lt − 1)²]
Ldsinv = ΕS[(Ls − 1)²] + ΕT[(Lt)²];
in the above formula: l isdsFor loss of the semantic Domain discriminant Module Ds when training the generators, LdsinvFor loss of the semantic Domain partition discrimination Module Ds in training the discriminator, LsSemantic segmentation domain discrimination Module Label map, L, generated for Source Domain imagestSemantic segmentation domain discrimination Module tag map generated for target Domain images, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the fourth step, when training the semantic segmentation network E-SR-S consisting of the feature extraction module E, the semantic segmentation module S and the super-resolution module SR, the loss function LG is minimized to optimize the parameters θE-SR-S of the semantic segmentation network, wherein the loss function LG is the sum of the cross entropy loss function, the Dice coefficient loss function, the super-resolution loss, and the losses of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module;
when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function LD is minimized to optimize the network parameters θDsr and θDs of the two domain discrimination modules;
the semantic segmentation of the image is realized through alternating adversarial training of the semantic segmentation network with the semantic segmentation domain discrimination module Ds and the super-resolution domain discrimination module Dsr;
the loss function of the semantic segmentation network in the training process is as follows:
LG = Lce + Ldice + Lmse + Ldsr + Lds;
the loss functions of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module in the training process are as follows:
LD = Ldsrinv + Ldsinv.
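The alternating optimization of claims 9 and 10 can be summarized in a short bookkeeping sketch: the generator phase sums the five terms stated above into LG while the discriminators are frozen, then the discriminator phase sums the two inverse domain losses into LD while the generator is frozen (all numeric loss values below are hypothetical):

```python
def generator_loss(l_ce, l_dice, l_mse, l_dsr, l_ds):
    # L_G: sum of cross-entropy, Dice, super-resolution (MSE) and the two
    # generator-phase domain-discrimination losses
    return l_ce + l_dice + l_mse + l_dsr + l_ds

def discriminator_loss(l_dsrinv, l_dsinv):
    # L_D: sum of the two discriminator-phase domain losses
    return l_dsrinv + l_dsinv

# one alternating round with hypothetical per-term values:
lg = generator_loss(0.42, 0.17, 0.08, 0.51, 0.48)  # minimize w.r.t. θ_E-SR-S, D frozen
ld = discriminator_loss(0.55, 0.60)                # minimize w.r.t. θ_Dsr, θ_Ds, G frozen
print(round(lg, 2), round(ld, 2))                  # 1.66 1.15
```

Each phase minimizes its own total against the other's parameters, which is the alternating adversarial schedule the claims describe.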
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110530385.2A CN113160234B (en) | 2021-05-14 | 2021-05-14 | Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113160234A CN113160234A (en) | 2021-07-23 |
CN113160234B true CN113160234B (en) | 2021-12-14 |
Family
ID=76876131
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
20221221 | TR01 | Transfer of patent right | Effective date of registration: 20221221; Address after: No. 47, Xutan East Street, Taiyuan, Shanxi 030031; Patentee after: Shanxi corps of China Building Materials Industry Geological Exploration Center; Address before: 030024 No. 79 West Main Street, Taiyuan, Shanxi, Yingze; Patentee before: Taiyuan University of Technology