CN113160234A - Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation - Google Patents


Info

Publication number
CN113160234A
CN113160234A
Authority
CN
China
Prior art keywords
resolution
module
image
super
domain
Prior art date
Legal status
Granted
Application number
CN202110530385.2A
Other languages
Chinese (zh)
Other versions
CN113160234B (en)
Inventor
郭学俊
陈泽华
杨佳林
刘晓峰
赵哲峰
杨莹
张佳鹏
Current Assignee
Shanxi Corps Of China Building Materials Industry Geological Exploration Center
Original Assignee
Taiyuan University of Technology
Priority date
Filing date
Publication date
Application filed by Taiyuan University of Technology filed Critical Taiyuan University of Technology
Priority to CN202110530385.2A priority Critical patent/CN113160234B/en
Publication of CN113160234A publication Critical patent/CN113160234A/en
Application granted granted Critical
Publication of CN113160234B publication Critical patent/CN113160234B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/10 — Image analysis: Segmentation; Edge detection
    • G06N 3/045 — Neural networks: Combinations of networks
    • G06N 3/08 — Neural networks: Learning methods
    • G06T 3/4053 — Image scaling: Super resolution (output image resolution higher than sensor resolution)
    • G06T 5/50 — Image enhancement or restoration by the use of more than one image (e.g. averaging, subtraction)
    • G06T 2207/10032 — Satellite or aerial image; Remote sensing
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/20112 — Image segmentation details
    • G06T 2207/20132 — Image cropping

Abstract

The invention relates to an unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation, belonging to the technical field of remote sensing image semantic segmentation. The technical problem to be solved is improving unsupervised remote sensing image semantic segmentation based on super-resolution and domain adaptation. The technical scheme comprises the following steps: acquire a source-domain low-resolution remote sensing image data set and a target-domain high-resolution remote sensing image data set, and divide the acquired target-domain image data set into training images and test images according to a set proportion; build a remote sensing image semantic segmentation network and a super-resolution network; pre-train the built super-resolution network and optimize its parameters; train the remote sensing image semantic segmentation network; input the preprocessed test-set data into the trained semantic segmentation network and output an accurate segmentation result for the remote sensing image. The invention is applied to remote sensing image processing.

Description

Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation
Technical Field
The invention relates to an improved unsupervised high-resolution remote sensing image semantic segmentation method based on super-resolution and domain adaptation, and belongs to the technical field of remote sensing image semantic segmentation methods.
Background
In recent years, with the continuous progress and wide application of high-resolution earth observation technology, the spatial resolution of remote sensing data has kept improving and the volume of such data has been accumulating geometrically, so automatically, quickly and accurately extracting high-value geographic information from high-resolution remote sensing images has become one of the important problems to be solved urgently. Semantic segmentation, which labels each pixel in an image as a specific ground-object class and is also called ground-object extraction or land classification, is one of the important means of information extraction from high-resolution remote sensing images, and is widely applied in land planning, environment monitoring, disaster assessment and other fields.
Deep neural networks can automatically extract semantic information at every level from an image and have strong feature-expression capability, so they have achieved great success in image semantic segmentation. The excellent performance of these deep-learning-based semantic segmentation methods, however, relies on remote sensing labels annotated at the level of millions of pixels. Because manually labeling high-resolution remote sensing images is time-consuming, labor-intensive and demands abundant professional knowledge, current semantic segmentation models in this field depend on small-scale training sets acquired in a specific period, from a limited region and with a specific remote sensing sensor. This limits the generalization performance of these models: segmentation accuracy drops greatly when they are applied to different regions or sensors, a phenomenon known as domain shift (Domain Shift). To overcome domain shift and make full use of existing data sets, unsupervised domain adaptation methods accomplish the semantic segmentation task on an unlabeled target-domain data set by transferring knowledge learned on a source-domain data set; in particular, domain adaptation methods based on generative adversarial networks learn domain-invariant features through the adversarial game between a generator and a discriminator, which can effectively reduce the inter-domain difference.
Unlike most current domain adaptation methods for high-resolution remote sensing images, which consider only the style difference, i.e. the spectral difference, between different sensors, the unsupervised semantic segmentation method based on Super-Resolution Domain Adaptation (SRDA) observes that remote sensing images obtained by different sensors also differ in resolution, and that different classes of ground objects have different sizes in remote sensing images; it realizes unsupervised domain-adaptive semantic segmentation of remote sensing images with a multi-task, multi-scale generative adversarial network that learns spectral and scale invariance simultaneously through joint super-resolution and semantic segmentation. That model uses an Atrous Spatial Pyramid Pooling (ASPP) module to extract multi-scale, scale-invariant features, but dilated (atrous) convolution is a sparse computation and may cause grid artifacts (Grid Artifacts), while the spatial pyramid pooling module may lose pixel-level positioning information. In addition, semantic segmentation of high-resolution remote sensing images often faces a serious class-imbalance problem.
Disclosure of Invention
In order to overcome the defects in the prior art, the technical problem the invention aims to solve is: providing an improved unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation.
In order to solve the technical problems, the invention adopts the technical scheme that: the unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation comprises the following steps:
step one: acquiring a source-domain low-resolution remote sensing image data set and a target-domain high-resolution remote sensing image data set, and dividing the acquired target-domain image data set into training images and test images according to a set proportion;
forming a training set of image semantic segmentation by using the source domain image data set and the target domain training image, and forming a test set of image semantic segmentation by using the test image of the target domain;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: building a remote sensing image semantic segmentation network, wherein the super-resolution remote sensing image semantic segmentation network comprises a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module, and the feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image;
step three: performing network pre-training and parameter optimization on the super-resolution network built in the step two;
step four: training a semantic segmentation network of the remote sensing image;
step five: and inputting the preprocessed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image.
And acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set in the first step through a remote sensing satellite, wherein the source domain low-resolution remote sensing image data set comprises a low-resolution original image and a label data image which is artificially marked, and the target domain high-resolution remote sensing image data set comprises a high-resolution original image.
Preprocessing the source-domain image data set and the target-domain training-set remote sensing image data in the first step specifically comprises image cropping, image resampling and data enhancement of the original images in the training set;
the image cropping specifically comprises: cropping the source-domain original images and label data images into patches of 256 pixels by 256 pixels at a resolution of 1 meter per pixel, and cropping the target-domain training-set and test-set images into patches of 512 pixels by 512 pixels at a resolution of 0.5 meter per pixel;
the image resampling specifically comprises: up-sampling the source-domain original images and label data images to obtain high-resolution images of 512 pixels by 512 pixels at 0.5 meter per pixel, and down-sampling the target-domain images to obtain low-resolution images of 256 pixels by 256 pixels at 1 meter per pixel;
the data enhancement comprises: applying image rotation, vertical and horizontal flipping, and image resizing to the images in the remote sensing image semantic segmentation training set.
The second step of building the super-resolution remote sensing image semantic segmentation network comprises the following steps:
step 2.1: inputting the image into the feature coding module to obtain multi-scale, multi-level image features: the feature coding module realizes hierarchical feature extraction, from bottom-level detail features to high-level semantic features, through convolution and max-pooling operations. Specifically, the image first passes through three convolutions and a max-pooling operation to extract the bottom-level features of the network; multi-scale image features are then extracted by a residual feature pyramid attention module and fused by a residual network module; this extraction and fusion of multi-scale features is repeated twice, finally yielding rich multi-scale, multi-level image features;
step 2.2: the image features extracted by the feature coding module in step 2.1 are processed by the super-resolution module to obtain a high-resolution image enlarged from the original; the image generated by the super-resolution module is then input into the super-resolution domain discrimination module, which discriminates the domain of the input image so as to optimize the parameters of the super-resolution module and the feature coding module;
the feature coding module and the super-resolution module together form the super-resolution network of the remote sensing image, which, combined with the super-resolution domain discrimination module, jointly realizes super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and by the super-resolution module trained in step 2.2 are taken together as the input of the semantic segmentation module: the features extracted by the feature coding module, together with part of the features in the image super-resolution module, are input into the semantic segmentation module to realize semantic segmentation of the remote sensing image data;
the feature coding module, the super-resolution module and the semantic segmentation module jointly form the semantic segmentation network of the image; the high-resolution remote sensing image and the probability map generated by the semantic segmentation module are concatenated and then input into the semantic segmentation domain discrimination module to optimize the semantic segmentation network, finally realizing the segmentation of the remote sensing image.
The third step specifically comprises:
step 3.1: carrying out parameter random initialization on the feature coding module and the super-resolution module, inputting the training set data preprocessed in the step one into the remote sensing image super-resolution network in the step two to generate a high-resolution image, and calculating super-resolution loss;
inputting the generated high-resolution image and the original high-resolution image into a super-resolution domain discrimination module, discriminating the domain of the input image, and calculating the discrimination loss of the super-resolution domain;
step 3.2: the losses are back-propagated, and the parameters of the super-resolution network and of the super-resolution domain discrimination module are alternately optimized, finally realizing super-resolution of the low-resolution image;
step 3.3: and after the training is finished, storing the parameters of the trained feature coding module, the super-resolution module and the super-resolution domain distinguishing module.
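Steps 3.1 and 3.2 amount to alternating adversarial optimisation. The sketch below is a hedged illustration of one such step, assuming a least-squares (LSGAN-style) adversarial term consistent with the squared-difference losses written out later in the text; `gen` stands for the E + SR super-resolution network, `disc` for the domain discrimination module Dsr, and the weight `lam` is an assumption, not a value taken from the patent.

```python
import torch
import torch.nn as nn

def pretrain_step(gen, disc, opt_g, opt_d, lr_src, hr_real, lam=0.01):
    """One alternating optimisation step of the super-resolution stage.
    lr_src: low-resolution input batch; hr_real: real high-resolution batch."""
    mse = nn.MSELoss()
    # --- generator step: super-resolution loss + fooling the discriminator ---
    opt_g.zero_grad()
    sr = gen(lr_src)
    d_out = disc(sr)
    loss_g = mse(sr, hr_real) + lam * ((d_out - 1) ** 2).mean()
    loss_g.backward()
    opt_g.step()
    # --- discriminator step: real images toward 1, generated toward 0 ---
    opt_d.zero_grad()
    loss_d = ((disc(hr_real) - 1) ** 2).mean() + (disc(sr.detach()) ** 2).mean()
    loss_d.backward()
    opt_d.step()
    return loss_g.item(), loss_d.item()
```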
The fourth step specifically comprises:
step 4.1: initializing a feature coding module, a super-resolution module and a super-resolution domain distinguishing module in the semantic segmentation network by using the model parameters saved in the third step, simultaneously performing random initialization on the parameters of the semantic segmentation module and the semantic segmentation domain distinguishing module, inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network in the second step, generating a semantic segmentation probability map of the remote sensing image, and calculating semantic segmentation loss;
splicing the high-resolution remote sensing image and the semantic segmentation probability map, and inputting the spliced high-resolution remote sensing image and the semantic segmentation probability map into a semantic segmentation domain discrimination module to realize discrimination of a domain to which the semantic segmentation network generation probability map belongs and calculate discrimination loss of the semantic segmentation domain;
step 4.2: back-propagating the losses, alternately optimizing the parameters of the semantic segmentation network and of the two domain discrimination modules, with minimization of the loss functions as the optimization target, finally completing the optimization of the semantic segmentation network parameters;
step 4.3: and after the training is finished, storing the trained semantic segmentation network model parameters.
The network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a maximum pooling layer: the largest pooling layer with the step length of 2 is arranged behind the convolution layer;
the feature coding module is provided with two repeated residual feature pyramid attention modules and a residual module for realizing multi-scale feature fusion behind the maximum pooling layer;
the residual feature pyramid attention module is composed of three consecutive feature pyramid attention modules with residual connections. The feature pyramid attention module is divided into two paths: the first path applies global pooling, a convolution with a 1 × 1 kernel and an up-sampling layer to the input features to realize feature transfer; the second path uses a U-shaped network structure to realize multi-level feature extraction, performing three convolutions with step size 2 on the features to obtain feature maps at 1/2, 1/4 and 1/8 of the input size, with kernel sizes of 7 × 7, 5 × 5 and 3 × 3 respectively; the 1/8-size feature map is then up-sampled and added to the 1/4-size feature map, and this step is repeated twice; finally, the resulting feature map is multiplied pixel by pixel with the input feature map after a 1 × 1 convolution, yielding an attention map of the same size as the input. The feature maps of the two paths are then added to obtain the multi-scale feature map;
and the two convolution operations in the residual module realize feature-channel fusion through convolutions with step size 1 and 3 × 3 kernels, and the residual module contains a shortcut connection to accelerate network convergence.
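The two-path module just described can be sketched in PyTorch as follows. Channel counts, paddings and the bilinear up-sampling mode are illustrative assumptions, and the class names are our own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramidAttention(nn.Module):
    """Two-branch feature pyramid attention: a global-pooling branch and a
    U-shaped branch with 7x7 / 5x5 / 3x3 stride-2 convolutions at scales
    1/2, 1/4 and 1/8, as described above."""
    def __init__(self, ch):
        super().__init__()
        self.gp_conv = nn.Conv2d(ch, ch, 1)   # 1x1 conv on the pooled branch
        self.mid = nn.Conv2d(ch, ch, 1)       # 1x1 conv on the trunk
        self.down1 = nn.Conv2d(ch, ch, 7, stride=2, padding=3)
        self.down2 = nn.Conv2d(ch, ch, 5, stride=2, padding=2)
        self.down3 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        # Branch 1: global context.
        g = F.adaptive_avg_pool2d(x, 1)
        g = F.interpolate(self.gp_conv(g), size=(h, w),
                          mode='bilinear', align_corners=False)
        # Branch 2: U-shaped multi-scale attention.
        d1 = self.down1(x)   # 1/2 size
        d2 = self.down2(d1)  # 1/4 size
        d3 = self.down3(d2)  # 1/8 size
        u = d2 + F.interpolate(d3, size=d2.shape[-2:], mode='bilinear', align_corners=False)
        u = d1 + F.interpolate(u, size=d1.shape[-2:], mode='bilinear', align_corners=False)
        u = F.interpolate(u, size=(h, w), mode='bilinear', align_corners=False)
        # Pixel-wise attention product, then add the global branch.
        return u * self.mid(x) + g

class ResidualFPA(nn.Module):
    """Three FPA blocks chained with residual (shortcut) connections."""
    def __init__(self, ch):
        super().__init__()
        self.blocks = nn.ModuleList(FeaturePyramidAttention(ch) for _ in range(3))
    def forward(self, x):
        for b in self.blocks:
            x = x + b(x)
        return x
```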
The super-resolution module halves the number of output channels through each decoder module while gradually restoring the image size; the decoder module comprises a convolution layer with a 1 × 1 kernel and step size 1 and a deconvolution layer with a 3 × 3 kernel and step size 2, which control the number of output channels and the increase of image resolution; the super-resolution image is finally obtained through three consecutive decoder modules;
the semantic segmentation module gradually restores the image resolution through decoder modules, concatenating each decoder output with the feature map of the same resolution restored by the super-resolution branch as the input of the next decoder module, and finally realizes semantic segmentation through two decoder modules;
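A minimal sketch of one such decoder block, following the stated 1 × 1 convolution plus 3 × 3 stride-2 deconvolution layout (the channel halving per block and the ReLU are assumptions), with the super-resolution head built from three consecutive blocks:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Decoder block: a 1x1 conv halves the channel count, then a 3x3
    transposed conv with stride 2 doubles the spatial size."""
    def __init__(self, in_ch):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, in_ch // 2, 1)
        self.up = nn.ConvTranspose2d(in_ch // 2, in_ch // 2, 3, stride=2,
                                     padding=1, output_padding=1)
    def forward(self, x):
        return torch.relu(self.up(self.reduce(x)))

# Three consecutive decoder blocks give an 8x spatial up-scaling
# while the channel count shrinks 64 -> 32 -> 16 -> 8.
sr_head = nn.Sequential(Decoder(64), Decoder(32), Decoder(16))
```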
the super-resolution domain discrimination module consists of five convolution layers with 3 × 3 kernels and step size 2, a residual feature pyramid attention module and a sigmoid activation layer: the convolution layers extract features from the high-resolution image, the feature pyramid attention module further extracts and integrates features, and the sigmoid activation layer finally produces the super-resolution domain label feature map;
the semantic segmentation domain discrimination module likewise consists of five convolution layers with 3 × 3 kernels and step size 2, a residual feature pyramid attention module and a sigmoid activation layer: the convolution layers extract features from the semantically segmented image and probability map, and the sigmoid activation layer finally produces the semantic segmentation domain label feature map.
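A simplified sketch of such a domain discrimination module: five 3 × 3 stride-2 convolutions followed by a sigmoid, as described above. The residual feature pyramid attention block between the convolutions is elided here, and the channel widths are assumptions.

```python
import torch
import torch.nn as nn

class DomainDiscriminator(nn.Module):
    """Five 3x3 stride-2 conv layers, a 1-channel head and a sigmoid,
    producing a per-patch domain label map in [0, 1]."""
    def __init__(self, in_ch, base=32):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(5):
            nxt = base * 2 ** min(i, 3)     # 32, 64, 128, 256, 256
            layers += [nn.Conv2d(ch, nxt, 3, stride=2, padding=1),
                       nn.LeakyReLU(0.2)]
            ch = nxt
        layers += [nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```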
In the third step, the data used in training the remote sensing image super-resolution network are the down-sampled low-resolution target-domain data, the original high-resolution target-domain data and the low-resolution source-domain data;
in the third step, the loss function used in training the remote sensing image super-resolution network is the mean square (MSE) loss, calculated as:

L_MSE = (1/N) · Σ_{i=1}^{N} (X_i − Y_i)²

in the above formula: X is the super-resolution image generated by the super-resolution network, Y is the real high-resolution image, and N is the number of image pixels;
the loss function of the super-resolution domain discrimination module in the third step is a mean square loss, calculated as:

L_dsr = E_S[(I_s − 1)²] + E_T[(I_t)²]
L_dsrinv = E_S[(I_s)²] + E_T[(I_t − 1)²]

in the above formulas: L_dsr is the loss of the super-resolution domain discrimination module Dsr when training the generator, L_dsrinv is the loss of the module when training the discriminator network, I_s is the super-resolution map generated from a source-domain low-resolution image, I_t is the high-resolution image generated from the down-sampled target-domain low-resolution image, E_S is the expectation of the loss over all inputs belonging to the source domain S, and E_T is the expectation over all inputs belonging to the target domain T;
in the third step, when training the super-resolution network E-SR, composed of the feature extraction module E and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters θ_{E-SR} of the super-resolution network;
when training the super-resolution domain discrimination module Dsr, the loss function L_D is minimized to optimize the network parameters θ_{Dsr} of the domain discriminator;
super-resolution of the image is realized through alternating adversarial training of the super-resolution network and the super-resolution domain discrimination module;
the loss function of the super-resolution network during training is:

L_G = L_MSE + L_dsr
the loss function of the super-resolution domain discrimination module during training is:

L_D = L_dsrinv
in the fourth step, the loss function used in training the remote sensing image semantic segmentation network combines the Dice coefficient loss and the cross entropy loss, where the cross entropy loss is calculated as:

L_CE = −(1/N) · Σ_{i=1}^{N} Σ_{k=1}^{K} y_{i,k} · log(y'_{i,k})

in the above formula: y is the real label map, y' is the predicted label map, N is the number of image pixels, and K is the number of label classes;
the Dice coefficient loss is calculated as:

L_Dice = 1 − (1/K) · Σ_{k=1}^{K} 2|X_k ∩ Y_k| / (|X_k| + |Y_k|)

in the above formula: X is the generated prediction-label probability map, Y is the real label map, |X ∩ Y| is the intersection between the real and predicted label maps, |X| is the number of elements of the predicted label map, |Y| is the number of elements of the real label map, and K is the number of label classes;
the loss function of the semantic segmentation domain discrimination module in the fourth step is a mean square loss, calculated as:

L_ds = E_S[(L_s)²] + E_T[(L_t − 1)²]
L_dsinv = E_S[(L_s − 1)²] + E_T[(L_t)²]

in the above formulas: L_ds is the loss of the semantic segmentation domain discrimination module Ds when training the generator, L_dsinv is the loss of the module when training the discriminator, L_s is the domain label map generated by the module for a source-domain image, L_t is the domain label map generated for a target-domain image, E_S is the expectation of the loss over all inputs belonging to the source domain S, and E_T is the expectation over all inputs belonging to the target domain T;
in the fourth step, when training the semantic segmentation network E-SR-S, composed of the feature extraction module E, the super-resolution module SR and the semantic segmentation module S, the loss function L_G is minimized to optimize the network parameters θ_{E-SR-S}, where L_G is the sum of the cross entropy loss, the Dice coefficient loss, the super-resolution loss, and the losses of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module;
when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function L_D is minimized to optimize the network parameters θ_{Dsr} and θ_{Ds} of the two domain discrimination modules;
semantic segmentation of the image is realized through alternating adversarial training of the semantic segmentation network, the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds;
the loss function of the semantic segmentation network during training is:

L_G = L_CE + L_Dice + L_MSE + L_dsr + L_ds
the loss functions of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module during training are:

L_D = L_dsrinv + L_dsinv
compared with the prior art, the invention has the beneficial effects that:
1) The method of the invention uses a residual feature pyramid attention module in the feature coding module, extracting image features at different resolutions to ensure that more global information is captured. This structure uses convolution kernels with different receptive fields to acquire features of targets of different sizes, and its residual connections avoid gradient explosion and vanishing.
2) The Dice coefficient loss function effectively alleviates the class-imbalance problem in semantic segmentation of high-resolution remote sensing images.
3) When building the image semantic segmentation network, the method transfers features from the super-resolution module through skip connections. The feature maps transferred from the super-resolution branch contain not only detail features of the targets such as position and edges, but also a large amount of high-level semantic information; the method yields a good segmentation effect and strong robustness.
Drawings
The invention is further described below with reference to the accompanying drawings:
FIG. 1 is a flow chart of an embodiment of an image semantic segmentation network constructed in the method of the present invention;
FIG. 2 is a schematic diagram of a structure of an image semantic segmentation network constructed in the method of the present invention;
FIG. 3 is a schematic diagram of a super-resolution network in an image semantic segmentation network constructed in the method of the present invention;
FIG. 4 is a schematic diagram of a component structure of a feature coding module in an image semantic segmentation network constructed in the method of the present invention;
FIG. 5 is a schematic diagram of a structure of a residual feature pyramid attention module in the image semantic segmentation network constructed in the method of the present invention;
FIG. 6 is a schematic diagram of a structure of a feature pyramid attention module in an image semantic segmentation network constructed in the method of the present invention;
FIG. 7 is a schematic diagram of a structure of a residual error module in an image semantic segmentation network constructed by the method of the present invention;
FIG. 8 is a schematic diagram of a super-resolution domain discrimination module in the image semantic segmentation network constructed in the method of the present invention;
FIG. 9 is a schematic diagram of the structure of the semantic segmentation domain discrimination module in the image semantic segmentation network constructed by the method of the present invention.
Detailed Description
As shown in fig. 1 to 9, in the unsupervised remote sensing image semantic segmentation method based on super-resolution and domain adaptation, a feature pyramid attention module replaces the ASPP module of the original unsupervised semantic segmentation method based on super-resolution domain adaptation, a residual feature pyramid attention module is applied in the discriminators to obtain accurate pixel-level attention over high-level semantic features, and the class-imbalance problem is alleviated through the Dice coefficient loss function. The method comprises the following steps:
step one: obtaining a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set, wherein both data sets are obtained through a remote sensing satellite; the source domain data comprises low-resolution original images and artificially marked label data images, while the target domain data only comprises high-resolution original images; dividing the acquired target domain image data set into training images and test images according to a certain proportion, forming the training set for image semantic segmentation from the source domain data and the target domain training images, and forming the test set for image semantic segmentation from the target domain test images;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: building the semantic segmentation network of the remote sensing image: the remote sensing image semantic segmentation network comprises a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module, and the construction steps comprise:
step 2.1: inputting the image into the feature coding module to obtain multi-scale and multi-level image features. The feature coding module realizes hierarchical feature extraction of the image, from bottom-level detail features to high-level semantic features, through convolution and maximum pooling operations: three convolution operations and a maximum pooling operation are performed on the image to extract the bottom-level features of the network, the multi-scale features of the image are then extracted by the residual feature pyramid attention module and fused by the residual network module, and this extraction and fusion of multi-scale features is repeated twice, finally yielding rich image features of multiple scales and multiple levels;
step 2.2: the image features extracted by the feature coding module in step 2.1 are passed through the super-resolution module to obtain a high-resolution image enlarged from the original image, and the image generated by the super-resolution module is then input into the super-resolution domain discrimination module, which discriminates the domain to which the input image belongs so as to optimize the parameters of the super-resolution module and the feature coding module. The feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image, and this super-resolution network, combined with the super-resolution domain discrimination module, jointly realizes the super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and the super-resolution module trained in step 2.2 are taken together as the input of the semantic segmentation module: the features extracted by the feature coding module and part of the features in the image super-resolution module are input into the semantic segmentation module to realize semantic segmentation of the remote sensing image data. The feature extraction module, the super-resolution module and the semantic segmentation module jointly form the semantic segmentation network of the image. The high-resolution remote sensing image and the probability map generated by the semantic segmentation module are spliced and input into the semantic segmentation domain discrimination module to optimize the image semantic segmentation network, finally realizing the segmentation function for the remote sensing image;
Step three: pre-training and parameter optimization of the super-resolution network:
step 3.1: carrying out parameter random initialization on the feature coding module and the super-resolution module, inputting the training set data preprocessed in the step one into the remote sensing image super-resolution network in the step two to generate a high-resolution image, and calculating super-resolution loss; and inputting the generated high-resolution image and the original high-resolution image into a super-resolution domain discrimination module to realize discrimination of the domain to which the input image belongs and calculate the discrimination loss of the super-resolution domain.
Step 3.2: loss reverse propagation is performed, parameters of the super-resolution network and the super-resolution domain discrimination module are alternately optimized, and finally super resolution of the low-resolution image is achieved.
Step 3.3: after training is finished, storing the parameters of the trained feature coding module, the super-resolution module and the super-resolution domain distinguishing module;
step four: training a semantic segmentation model of the remote sensing image:
step 4.1: initializing a feature coding module, a super-resolution module and a super-resolution domain distinguishing module in the semantic segmentation network by using the model parameters saved in the third step, simultaneously performing random initialization on the parameters of the semantic segmentation module and the semantic segmentation domain distinguishing module, inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network in the second step, generating a semantic segmentation probability map of the remote sensing image, and calculating semantic segmentation loss; and splicing the high-resolution remote sensing image and the semantic segmentation probability map, and inputting the spliced high-resolution remote sensing image and the semantic segmentation probability map into a semantic segmentation domain discrimination module to realize discrimination of the domain to which the semantic segmentation network generation probability map belongs and calculate the discrimination loss of the semantic segmentation domain.
Step 4.2: and (4) loss back propagation, alternately optimizing parameters of the semantic segmentation network and the two domain discrimination modules, and finally finishing the optimization of the parameters of the semantic segmentation network by taking the minimization of a loss function as an optimization target.
Step 4.3: after training is finished, storing the trained semantic segmentation network model parameters;
step five: and inputting the processed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image.
Preprocessing the remote sensing image training set data in the first step, wherein the preprocessing comprises image cutting, image sampling and data enhancement of original images in the training set;
the image clipping specifically comprises: cutting the source domain training set images and labels into images of 256 pixels × 256 pixels with a resolution of 1 meter per pixel, and cutting the target domain training and test set images into images of 512 pixels × 512 pixels with a resolution of 0.5 meter per pixel;
the image sampling specifically comprises: upsampling the source domain images and labels to obtain high-resolution images of 512 pixels × 512 pixels with a resolution of 0.5 meter per pixel, and downsampling the target domain images to obtain low-resolution images of 256 pixels × 256 pixels with a resolution of 1 meter per pixel.
The data enhancement comprises: performing image rotation, vertical and horizontal flipping, and image size adjustment on the images in the remote sensing image semantic segmentation training set.
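As a non-limiting illustration, the sampling and augmentation operations above can be sketched as follows; the block-averaging and nearest-neighbour kernels, array shapes and function names are illustrative assumptions, not the resampling method prescribed by the invention.

```python
import numpy as np

def downsample_2x(img: np.ndarray) -> np.ndarray:
    """Halve resolution by 2x2 block averaging (e.g. 512 px / 0.5 m -> 256 px / 1 m)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample_2x(img: np.ndarray) -> np.ndarray:
    """Double resolution by nearest-neighbour repetition (e.g. 256 px / 1 m -> 512 px / 0.5 m)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def augment(img: np.ndarray, k: int, flip: bool) -> np.ndarray:
    """Rotate by k * 90 degrees and optionally flip horizontally (data enhancement)."""
    out = np.rot90(img, k, axes=(0, 1))
    return out[:, ::-1] if flip else out
```

For a 256 × 256 source image, `upsample_2x` produces the 512 × 512 input expected by the segmentation network, and `downsample_2x` produces the low-resolution counterpart of a 512 × 512 target image.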
The network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a maximum pooling layer: the largest pooling layer with the step length of 2 is arranged behind the convolution layer;
the feature coding module is provided, behind the maximum pooling layer, with two repeated residual feature pyramid attention modules and a residual module for realizing multi-scale feature fusion; the residual feature pyramid attention module consists of three consecutive feature pyramid attention modules containing residual connections; the feature pyramid attention module is divided into two paths: the first path applies global pooling, a convolution with a 1 × 1 kernel and an up-sampling layer to the input features to realize the transfer of network features; the second path adopts a U-shaped network structure to realize multi-layer feature extraction, performing three convolution operations with a step size of 2 on the features to obtain feature maps of 1/2, 1/4 and 1/8 of the input feature size, the kernels of the three convolutions being 7 × 7, 5 × 5 and 3 × 3 respectively. The 1/8-size feature map is up-sampled and superposed on the 1/4-size feature map, this step is repeated twice more, and the output feature map is finally multiplied pixel by pixel with the feature map produced by the 1 × 1 convolution to obtain a feature map of the same size as the input; finally, the feature maps of the two paths are superposed to obtain the multi-scale feature map. The residual module combines two convolution operations, each with a step size of 1 and a 3 × 3 kernel, to realize feature channel fusion, and contains a short-circuit (skip) connection to accelerate network convergence.
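A minimal sketch of the two-path feature pyramid attention module described above, assuming PyTorch; the channel counts, padding values and `'nearest'` interpolation mode are illustrative assumptions, not part of the claimed structure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramidAttention(nn.Module):
    """Sketch of the two-path feature pyramid attention module of step 2.1."""
    def __init__(self, ch):
        super().__init__()
        # path 1: global pooling -> 1x1 conv -> upsample back to the input size
        self.gp_conv = nn.Conv2d(ch, ch, 1)
        # 1x1 conv whose output is multiplied pixel by pixel with the pyramid output
        self.mid_conv = nn.Conv2d(ch, ch, 1)
        # path 2: U-shaped pyramid, stride-2 convs with 7x7 / 5x5 / 3x3 kernels
        self.down1 = nn.Conv2d(ch, ch, 7, stride=2, padding=3)
        self.down2 = nn.Conv2d(ch, ch, 5, stride=2, padding=2)
        self.down3 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)

    def forward(self, x):
        h, w = x.shape[2:]
        # path 1: global context branch
        g = F.adaptive_avg_pool2d(x, 1)
        g = F.interpolate(self.gp_conv(g), size=(h, w), mode='nearest')
        # path 2: 1/2, 1/4 and 1/8 scale feature maps
        d1 = self.down1(x)
        d2 = self.down2(d1)
        d3 = self.down3(d2)
        # up-sample 1/8 onto 1/4, then onto 1/2, then back to the full size
        u = d2 + F.interpolate(d3, size=d2.shape[2:], mode='nearest')
        u = d1 + F.interpolate(u, size=d1.shape[2:], mode='nearest')
        u = F.interpolate(u, size=(h, w), mode='nearest')
        # pixel-wise product with the 1x1-convolved input, plus the global path
        return u * self.mid_conv(x) + g
```

The output feature map has the same size as the input, so the module can be dropped between the pooling layer and the residual module without changing tensor shapes.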
The super-resolution module halves the number of output channels through the decoder module while gradually restoring the size of the image. The decoder module comprises a convolution layer with a 1 × 1 kernel and a step size of 1 and a deconvolution layer with a 3 × 3 kernel and a step size of 2, which control the number of output channels and increase the image resolution; the super-resolution image is finally obtained through three consecutive deconvolution modules.
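The decoder module just described can be sketched as follows, assuming PyTorch; the padding and `output_padding` choices are assumptions made so that the output is exactly twice the input size with half the channels.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Sketch of the decoder module: a 1x1 stride-1 convolution halves the
    channel count, then a 3x3 stride-2 transposed convolution doubles the
    spatial size."""
    def __init__(self, in_ch):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, in_ch // 2, kernel_size=1, stride=1)
        self.up = nn.ConvTranspose2d(in_ch // 2, in_ch // 2, kernel_size=3,
                                     stride=2, padding=1, output_padding=1)

    def forward(self, x):
        return self.up(self.reduce(x))
```

Stacking three such blocks, as the text describes, multiplies the spatial resolution by eight while dividing the channel count by eight.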
The semantic segmentation module gradually restores the resolution of the image through the decoder module, and simultaneously cascades the decoder module and the image with the same resolution restored by the super-resolution as the input of the next decoder module, and finally realizes the semantic segmentation of the network through the two decoder modules.
The super-resolution domain distinguishing module comprises five convolution layers with a 3 × 3 kernel and a step size of 2, a residual feature pyramid attention module and a sigmoid activation layer: feature extraction of the high-resolution image is achieved through the convolution layers, feature extraction and integration are then performed through the residual feature pyramid attention module, and the final super-resolution domain label feature map is obtained through the sigmoid activation layer.
The semantic segmentation domain distinguishing module comprises five convolution layers with a 3 × 3 kernel and a step size of 2, a residual feature pyramid attention module and a sigmoid activation layer: feature extraction of the semantic segmentation image and probability map is achieved through the convolution layers, feature extraction and integration are then performed through the residual feature pyramid attention module, and the final semantic segmentation domain label feature map is obtained through the sigmoid activation layer.
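The two domain distinguishing modules share the same layout, which can be sketched as follows, assuming PyTorch; the channel widths, the LeakyReLU activations and the plain convolution standing in for the residual feature pyramid attention module are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DomainDiscriminator(nn.Module):
    """Sketch of a domain distinguishing module: five 3x3 stride-2
    convolutions, a placeholder for the residual feature pyramid attention
    module, and a sigmoid head producing the domain-label feature map."""
    def __init__(self, in_ch, base=32):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(5):  # five convolution layers with step size 2
            out = base * 2 ** min(i, 3)
            layers += [nn.Conv2d(ch, out, 3, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch = out
        self.features = nn.Sequential(*layers)
        # stand-in for the residual feature pyramid attention module
        self.attention = nn.Conv2d(ch, ch, 3, padding=1)
        self.head = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return self.head(self.attention(self.features(x)))
```

For a 512 × 512 input, the five stride-2 layers reduce the spatial size by a factor of 32, giving a 16 × 16 map of per-region domain scores in [0, 1].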
In the third step, the data sets used in training the remote sensing image super-resolution network are the down-sampled low-resolution target domain data, the initial high-resolution target domain data and the low-resolution source domain data; the loss function is a mean square loss function, calculated as:
L_sr = (1/N) · Σ_{i=1}^{N} (X_i − Y_i)²  (1)
in the above formula: x is a super-resolution image generated by a super-resolution network, Y is a real high-resolution image, and N is the number of image pixel points;
the loss function of the super-resolution domain discrimination module in the third step is a mean square loss function, and the calculation formula of the mean square loss function is
L_dsr = E_S[(I_s − 1)²] + E_T[(I_t)²]  (2)
L_dsrinv = E_S[(I_s)²] + E_T[(I_t − 1)²]  (3)
In the above formula: i issSuper-resolution maps generated for source domain low resolution images, ItA high resolution image generated for down-sampling the low resolution image in the target domain; l isdsrLoss of a super-resolution domain discrimination module Dsr when a generator is trained; l isdsrinvLoss of a super-resolution domain discrimination module Dsr when a discriminator network is trained;
the adversarial training process of the super-resolution network in step three alternately optimizes the super-resolution network and the super-resolution domain discrimination module. When training the super-resolution network E-SR formed by the feature extraction module E and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters of the super-resolution network; when training the super-resolution domain discrimination module Dsr, the loss function L_D is minimized to optimize the network parameters of the domain discriminator. The super-resolution of the image is realized through the alternating adversarial training of the super-resolution network and the super-resolution domain discriminator module. The loss functions of the network during training are:
L_G = L_sr + λ_adv · L_dsr  (4)
L_D = L_dsrinv  (5)
in the fourth step, the loss function used in training the remote sensing image semantic segmentation network combines a Dice coefficient loss function and a cross entropy loss function, wherein the cross entropy loss function is calculated as:
L_ce = −(1/N) · Σ_{i=1}^{N} Σ_{k=1}^{K} y_{i,k} · log(y′_{i,k})  (6)
in the above formula: y is the real label map, y′ is the predicted label map, N is the number of pixel points of the image, and K is the number of label categories;
the calculation formula of the Dice coefficient loss function is as follows:
L_dice = 1 − (1/K) · Σ_{k=1}^{K} 2|X_k ∩ Y_k| / (|X_k| + |Y_k|)  (7)
in the above formula: x is a generated prediction label probability graph, Y is a real label graph, | X |, N.Y | is an intersection between the real label graph and the prediction label graph, | X | is the number of elements of the prediction label graph, | Y | is the number of elements of the real label, and K is the category number of the label;
the loss function of the semantic segmentation domain discrimination module in the fourth step is a mean square loss function, calculated as
L_ds = E_S[(L_s)²] + E_T[(L_t − 1)²]  (8)
L_dsinv = E_S[(L_s − 1)²] + E_T[(L_t)²]  (9)
In the above formula: l issSemantic segmentation domain discrimination Module Label map, L, generated for Source Domain imagestIs a target domain mapJudging a module label graph by the generated semantic segmentation domain; l isdsLoss of a semantic segmentation domain discrimination module Ds when training a generator; l isdsinvLoss of a semantic segmentation domain discrimination module Ds when a discriminator is trained;
the adversarial training process of the semantic segmentation network in step four alternately optimizes the semantic segmentation network, the super-resolution domain discrimination module and the semantic segmentation domain discrimination module. When training the semantic segmentation network E-SR-S consisting of the feature extraction module E, the semantic segmentation module S and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters of the semantic segmentation network, where L_G is the sum of the cross entropy loss function, the Dice coefficient loss function, the super-resolution loss and the losses of the two domain discrimination modules; when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function L_D is minimized to optimize the network parameters of the two domain discrimination modules. The semantic segmentation network E-SR-S, the semantic segmentation domain discriminator module Ds and the super-resolution domain discrimination module Dsr are trained adversarially in alternation to realize the semantic segmentation of the image. The loss functions of the network during training are:
L_G = L_ce + L_dice + L_sr + λ_adv · (L_dsr + L_ds)  (10)
L_D = L_dsrinv + L_dsinv  (11)
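The alternating adversarial optimization of step four can be sketched as follows with toy stand-in networks; the module sizes, learning rates and the exact weighting of the loss terms are illustrative assumptions, not the training schedule claimed by the invention.

```python
import torch
import torch.nn as nn

# minimal stand-ins for the segmentation network (generator) and one
# domain discrimination module; the real structures are given in step two
seg_net = nn.Conv2d(3, 2, 1)                         # produces a 2-class map
disc = nn.Sequential(nn.Conv2d(2, 1, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(seg_net.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

src = torch.randn(2, 3, 16, 16)                      # labelled source batch
src_lbl = torch.randint(0, 2, (2, 16, 16))
tgt = torch.randn(2, 3, 16, 16)                      # unlabelled target batch
ce = nn.CrossEntropyLoss()

for _ in range(2):
    # 1) generator step: segmentation loss plus an adversarial term that
    #    pushes the target-domain prediction toward the "real" label
    opt_g.zero_grad()
    p_src = seg_net(src).softmax(1)
    p_tgt = seg_net(tgt).softmax(1)
    loss_g = ce(seg_net(src), src_lbl) + ((disc(p_tgt) - 1) ** 2).mean()
    loss_g.backward()
    opt_g.step()
    # 2) discriminator step: least-squares targets as in formulas (8)/(9)
    opt_d.zero_grad()
    loss_d = (disc(p_src.detach()) ** 2).mean() \
           + ((disc(p_tgt.detach()) - 1) ** 2).mean()
    loss_d.backward()
    opt_d.step()
```

The `.detach()` calls keep the discriminator step from back-propagating into the generator, which is what makes the two optimizations alternate rather than interfere.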
Remote sensing image data sets at different resolutions are divided into a training set and a test set according to a certain proportion; a deep remote sensing image semantic segmentation network combining super-resolution and domain self-adaptation is constructed, comprising a feature coding module, a super-resolution module, a super-resolution domain discrimination module, a semantic segmentation module and a semantic segmentation domain discrimination module; the preprocessed training set data is input into the remote sensing image semantic segmentation network, the remote sensing image super-resolution network and the image semantic segmentation network are trained in stages, and the network parameters are saved; finally, the test set data is input into the trained remote sensing image semantic segmentation network, which outputs the segmentation results for the test image data.
The invention discloses an unsupervised remote sensing image semantic segmentation method combining super-resolution and domain self-adaptation. The images are divided into a test set and a training set, and the training set images are preprocessed; a remote sensing image semantic segmentation network based on a deep learning network is then constructed, the training set images are input to train, in stages, the remote sensing super-resolution network and the semantic segmentation network, and the model parameters are saved when the network converges; finally, the test set images are passed through the image semantic segmentation network to obtain the final prediction result images. Compared with the prior art, the method realizes semantic segmentation of the remote sensing image by adding a super-resolution module and a multi-scale feature extraction module, and has the advantages of good segmentation effect and strong robustness.
It should be noted that, with regard to the specific structure of the present invention, the connection relationships between the modules adopted in the present invention are determined and realizable; except where specifically described in the embodiments, these specific connection relationships bring the corresponding technical effects and solve the technical problem proposed by the present invention without depending on the execution of corresponding software programs.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation is characterized by comprising the following steps of: the method comprises the following steps:
the method comprises the following steps: acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set, and dividing the acquired target domain image data set into a training image and a test image according to a set proportion;
forming a training set of image semantic segmentation by using the source domain image data set and the target domain training image, and forming a test set of image semantic segmentation by using the test image of the target domain;
preprocessing the remote sensing image data of the training set to obtain a remote sensing image data set subjected to data enhancement;
step two: building a remote sensing image semantic segmentation network, wherein the super-resolution remote sensing image semantic segmentation network comprises a feature coding module, a super-resolution module, a super-resolution domain distinguishing module, a semantic segmentation module and a semantic segmentation domain distinguishing module, and the feature coding module and the super-resolution module jointly form the super-resolution network of the remote sensing image;
step three: performing network pre-training and parameter optimization on the super-resolution network built in the step two;
step four: training a semantic segmentation network of the remote sensing image;
step five: and inputting the preprocessed test set data into the trained semantic segmentation network of the remote sensing image in the fourth step, and outputting an accurate segmentation result of the remote sensing image.
2. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: and acquiring a source domain low-resolution remote sensing image data set and a target domain high-resolution remote sensing image data set in the first step through a remote sensing satellite, wherein the source domain low-resolution remote sensing image data set comprises a low-resolution original image and a label data image which is artificially marked, and the target domain high-resolution remote sensing image data set comprises a high-resolution original image.
3. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: preprocessing the source domain image data set and the target domain training set image remote sensing image data in the first step specifically comprises image cutting, image sampling and data enhancement of original images in a training set;
the image clipping specifically comprises the following steps: cutting the original image of the source domain and the label data image into images with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel; cutting the target domain training set and the test set image into images with 512 pixels multiplied by 512 pixels and 0.5 meter per pixel resolution;
the image sampling specifically comprises: up-sampling the original image of the source domain and the label data image to obtain a high-resolution image with 512 pixels multiplied by 512 pixels and the resolution of 0.5 meter per pixel; down-sampling the target domain image to obtain a low-resolution image with 256 pixels multiplied by 256 pixels and the resolution of 1 meter per pixel;
the data enhancement comprises: and carrying out image rotation, image vertical and horizontal overturning and image size adjustment on the images in the semantic segmentation training set of the remote sensing images.
4. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the second step of building the super-resolution remote sensing image semantic segmentation network comprises the following steps:
step 2.1: inputting the image into a feature coding module to obtain multi-scale and multi-level image features: the feature coding module realizes the hierarchical feature extraction of the image from the bottom level detail feature to the high level semantic feature through convolution and maximum pooling operation, and specifically comprises the following steps: performing convolution and maximum pooling operation on the image for three times to extract bottom-layer features of the network, extracting multi-scale features of the image through a residual feature pyramid attention module, fusing the extracted multi-scale features of the image through a residual network module, and repeating the extraction and fusion of the multi-scale features for two times to finally obtain rich image features with multiple scales and multiple levels;
step 2.2: 2.1, the image features extracted by the feature coding module in the step 2.1 are processed by the super-resolution module to obtain a high-resolution image with the enlarged size of the original image, then the image generated by the super-resolution module is input into the super-resolution domain judging module, and the domain to which the input image belongs is judged to optimize the parameters of the super-resolution module and the feature coding module;
the feature coding module and the super-resolution module jointly form a super-resolution network of the remote sensing image, and the super-resolution network is combined with the super-resolution domain distinguishing module to jointly realize the super-resolution of the low-resolution image;
step 2.3: the features extracted by the feature coding module and the super-resolution module trained in the step 2.2 are taken as the input of the semantic segmentation module together: the features extracted by the feature coding module and part of the features in the image super-resolution module are input into a semantic segmentation module of the image together to realize semantic segmentation of the remote sensing image data;
the system comprises a feature extraction module, a super-resolution module, a semantic segmentation module and a semantic segmentation module, wherein the feature extraction module, the super-resolution module and the semantic segmentation module jointly form a semantic segmentation network of an image, a high-resolution remote sensing image and a probability map generated by the semantic segmentation module are spliced and then input into the semantic segmentation domain discrimination module to optimize the semantic segmentation network of the image, and finally, the segmentation function of the remote sensing image is realized.
5. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the third step specifically comprises:
step 3.1: carrying out parameter random initialization on the feature coding module and the super-resolution module, inputting the training set data preprocessed in the step one into the remote sensing image super-resolution network in the step two to generate a high-resolution image, and calculating super-resolution loss;
inputting the generated high-resolution image and the original high-resolution image into a super-resolution domain discrimination module, discriminating the domain of the input image, and calculating the discrimination loss of the super-resolution domain;
step 3.2: loss reverse propagation is performed, parameters of a super-resolution network and a super-resolution domain discrimination module are alternately optimized, and super resolution of the low-resolution image is finally achieved;
step 3.3: and after the training is finished, storing the parameters of the trained feature coding module, the super-resolution module and the super-resolution domain distinguishing module.
6. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the fourth step specifically comprises:
step 4.1: initializing a feature coding module, a super-resolution module and a super-resolution domain distinguishing module in the semantic segmentation network by using the model parameters saved in the third step, simultaneously performing random initialization on the parameters of the semantic segmentation module and the semantic segmentation domain distinguishing module, inputting the training set data preprocessed in the first step into the remote sensing image semantic segmentation network in the second step, generating a semantic segmentation probability map of the remote sensing image, and calculating semantic segmentation loss;
splicing the high-resolution remote sensing image and the semantic segmentation probability map, and inputting the spliced high-resolution remote sensing image and the semantic segmentation probability map into a semantic segmentation domain discrimination module to realize discrimination of a domain to which the semantic segmentation network generation probability map belongs and calculate discrimination loss of the semantic segmentation domain;
step 4.2: loss back propagation, namely alternately optimizing parameters of the semantic segmentation network and the two domain discrimination modules, and finally finishing the optimization of the parameters of the semantic segmentation network by taking the minimization of a loss function as an optimization target;
step 4.3: and after the training is finished, storing the trained semantic segmentation network model parameters.
7. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: the network structure of the feature coding module in step 2.1 is as follows:
the first, second and third layers are convolution layers: performing convolution with convolution kernel size of 3 × 3 and step size of 1;
the fourth layer is a maximum pooling layer: the largest pooling layer with the step length of 2 is arranged behind the convolution layer;
the feature coding module is provided with two repeated residual feature pyramid attention modules and a residual module for realizing multi-scale feature fusion behind the maximum pooling layer;
the residual feature pyramid attention module is composed of three consecutive feature pyramid attention modules containing residual connections; the feature pyramid attention module is divided into two paths: the first path applies global pooling, a convolution with a 1 × 1 kernel and an up-sampling layer to the input features to realize the transfer of network features; the second path adopts a U-shaped network structure to realize multi-layer feature extraction, performing three convolution operations with a step size of 2 on the features to obtain feature maps of 1/2, 1/4 and 1/8 of the input feature size, the kernels of the three convolutions being 7 × 7, 5 × 5 and 3 × 3 respectively; the 1/8-size feature map is then up-sampled and superposed on the 1/4-size feature map, this step is repeated twice more, and the output feature map is finally multiplied pixel by pixel with the feature map produced by the 1 × 1 convolution to obtain a feature map of the same size as the input feature map; finally, the feature maps of the two paths are superposed to obtain the multi-scale feature map;
the two convolution operations in the residual module realize feature channel fusion through convolutions with stride 1 and 3 × 3 kernels, and the residual module contains a shortcut connection to accelerate network convergence.
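The 1/2, 1/4 and 1/8 down-sampling arithmetic of the U-shaped path can be sanity-checked with a short sketch. This is illustrative only, not part of the patent; the helper names are hypothetical, and "same"-style padding of k // 2 for each stride-2 convolution is an assumption:

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial size after a conv layer: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def pyramid_path_sizes(size):
    """Trace the U-shaped path of the feature pyramid attention module:
    three stride-2 convolutions with 7x7, 5x5 and 3x3 kernels, giving
    feature maps at 1/2, 1/4 and 1/8 of the input size."""
    sizes = []
    for k in (7, 5, 3):
        size = conv_out_size(size, k, stride=2, padding=k // 2)
        sizes.append(size)
    return sizes

print(pyramid_path_sizes(256))  # -> [128, 64, 32]
```

With a 256 × 256 input, the three convolutions yield 128, 64 and 32, matching the 1/2, 1/4 and 1/8 sizes stated in the claim.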
8. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that:
the super-resolution module halves the number of output channels through decoder modules while gradually restoring the image size; each decoder module comprises a convolution layer with a 1 × 1 kernel and stride 1 and a deconvolution layer with a 3 × 3 kernel and stride 2, which control the number of output channels and the increase of the image resolution; the super-resolution image is finally obtained through three consecutive deconvolution modules;
the semantic segmentation module gradually restores the resolution of the image through decoder modules, concatenating the output of each decoder module with the super-resolution feature map of the same resolution as the input of the next decoder module; the semantic segmentation output of the network is finally produced through two decoder modules;
the super-resolution domain discrimination module consists of five convolution layers with 3 × 3 kernels and stride 2, a residual feature pyramid attention module and a sigmoid activation layer; the convolution layers extract features from the high-resolution image, the feature pyramid attention module further extracts and integrates features, and the sigmoid activation layer produces the final super-resolution domain label feature map;
the semantic segmentation domain discrimination module consists of five convolution layers with 3 × 3 kernels and stride 2, a residual feature pyramid attention module and a sigmoid activation layer; the convolution layers extract features from the semantic segmentation image and probability map, and the sigmoid activation layer produces the final semantic segmentation domain label feature map.
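Both domain discrimination modules halve the spatial size five times before the sigmoid layer. A minimal sketch of that size arithmetic (hypothetical helper names, not the patent's implementation; padding 1 for the 3 × 3 stride-2 convolutions is an assumption):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation mapping discriminator logits to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def label_map_size(size, n_convs=5):
    """Spatial size of the domain-label feature map after five 3x3
    stride-2 convolutions (padding 1): the size halves at each layer."""
    for _ in range(n_convs):
        size = (size + 2 * 1 - 3) // 2 + 1
    return size

print(label_map_size(512))  # -> 16
```

A 512 × 512 input therefore produces a 16 × 16 domain-label map, each entry squashed into (0, 1) by the sigmoid layer.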
9. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: in the third step, the data sets used in training the remote sensing image super-resolution network are the low-resolution target domain data obtained by down-sampling, the initial high-resolution target domain data and the low-resolution source domain data;
in the third step, the loss function used in training the remote sensing image super-resolution network is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
L_mse = (1/N) Σ_{i=1}^{N} (X_i − Y_i)²
in the above formula: X is the super-resolution image generated by the super-resolution network, Y is the real high-resolution image, X_i and Y_i are their i-th pixel values, and N is the number of image pixels;
the loss function of the super-resolution domain discrimination module in the third step is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
L_dsr = E_S[(I_s − 1)²] + E_T[(I_t)²]
L_dsrinv = E_S[(I_s)²] + E_T[(I_t − 1)²];
in the above formula: l isdsrFor loss of the super-resolution domain discrimination module Dsr in training the generator, LdsrinvFor loss of the super-resolution domain discrimination module Dsr when training the discriminator network, IsSuper-resolution maps generated for source domain low resolution images, ItHigh resolution image generated for down-sampling a low resolution image in the target domain, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the third step, when training the super-resolution network E-SR consisting of the feature extraction module E and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters θ_E-SR of the super-resolution network;
when training the super-resolution domain discrimination module Dsr, the loss function L_D is minimized to optimize the network parameters θ_Dsr of the domain discriminator part of the super-resolution domain discrimination module;
the super-resolution of the image is realized through alternating adversarial training of the super-resolution network and the super-resolution domain discrimination module;
the loss function of the super-resolution network in the training process is:
L_G = L_mse + L_dsr, minimized over the super-resolution network parameters θ_E-SR;
the loss function of the super-resolution domain discrimination module in the training process is:
L_D = L_dsrinv, minimized over the domain discriminator parameters θ_Dsr.
10. The unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation according to claim 1, characterized in that: in the fourth step, the loss function used in training the remote sensing image semantic segmentation network is the Dice coefficient loss function and the cross entropy loss function used jointly, wherein the cross entropy loss function is calculated as follows:
L_ce = −(1/N) Σ_{i=1}^{N} y_i log(y'_i)
in the above formula: y is the real label map, y' is the predicted label map, and N is the number of pixels in the image;
the calculation formula of the Dice coefficient loss function is as follows:
L_dice = 1 − (1/K) Σ_{k=1}^{K} (2|X_k ∩ Y_k|) / (|X_k| + |Y_k|)
in the above formula: X is the generated prediction label probability map, Y is the real label map, |X ∩ Y| is the intersection between the real label map and the prediction label map, |X| is the number of elements of the prediction label map, |Y| is the number of elements of the real label map, and K is the number of label classes;
the loss function of the semantic segmentation domain discrimination module in the fourth step is a mean square loss function, and the calculation formula of the mean square loss function is as follows:
L_ds = E_S[(L_s)²] + E_T[(L_t − 1)²]
L_dsinv = E_S[(L_s − 1)²] + E_T[(L_t)²];
in the above formula: l isdsFor loss of the semantic Domain discriminant Module Ds when training the generators, LdsinvFor loss of the semantic Domain partition discrimination Module Ds in training the discriminator, LsSemantic segmentation domain discrimination Module Label map, L, generated for Source Domain imagestSemantic segmentation domain discrimination Module tag map generated for target Domain images, ESTo expect the loss of all inputs belonging to the source domain S, ETThe loss of all the target domains T is expected;
in the fourth step, when training the semantic segmentation network E-SR-S consisting of the feature extraction module E, the semantic segmentation module S and the super-resolution module SR, the loss function L_G is minimized to optimize the parameters θ_E-SR-S of the semantic segmentation network, wherein the loss function L_G is the sum of the cross entropy loss function, the Dice coefficient loss function, the super-resolution loss, and the losses of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module;
when training the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds, the loss function L_D is minimized to optimize the network parameters θ_Dsr and θ_Ds of the two domain discrimination modules;
the semantic segmentation of the image is realized through alternating adversarial training of the semantic segmentation network, the super-resolution domain discrimination module Dsr and the semantic segmentation domain discrimination module Ds;
the loss function of the semantic segmentation network in the training process is:
L_G = L_ce + L_dice + L_mse + L_dsr + L_ds, minimized over the semantic segmentation network parameters θ_E-SR-S;
the loss functions of the super-resolution domain discrimination module and the semantic segmentation domain discrimination module in the training process are:
L_D = L_dsrinv + L_dsinv, minimized over the domain discriminator parameters θ_Dsr and θ_Ds.
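The alternating adversarial training described in the third and fourth steps can be summarised as a skeleton loop. This is an illustrative Python sketch only; gen_step and disc_step are hypothetical caller-supplied update functions, not part of the patent:

```python
def alternating_adversarial_training(gen_step, disc_step, n_iters):
    """Skeleton of alternating adversarial training: each iteration
    first updates the generator side (minimising L_G with the
    discriminators frozen), then updates the domain discrimination
    modules (minimising L_D). gen_step / disc_step return the current
    L_G and L_D values after their respective parameter updates."""
    history = []
    for _ in range(n_iters):
        l_g = gen_step()    # optimise theta_{E-SR-S}
        l_d = disc_step()   # optimise theta_{Dsr}, theta_{Ds}
        history.append((l_g, l_d))
    return history
```

The loop structure is the same for both training stages; only the networks behind the two step functions and the loss terms inside L_G and L_D differ.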
CN202110530385.2A 2021-05-14 2021-05-14 Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation Active CN113160234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110530385.2A CN113160234B (en) 2021-05-14 2021-05-14 Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation

Publications (2)

Publication Number Publication Date
CN113160234A true CN113160234A (en) 2021-07-23
CN113160234B CN113160234B (en) 2021-12-14

Family

ID=76876131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110530385.2A Active CN113160234B (en) 2021-05-14 2021-05-14 Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation

Country Status (1)

Country Link
CN (1) CN113160234B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496221A (en) * 2021-09-08 2021-10-12 湖南大学 Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
CN113569724A (en) * 2021-07-27 2021-10-29 中国科学院地理科学与资源研究所 Road extraction method and system based on attention mechanism and dilation convolution
CN113592745A (en) * 2021-09-08 2021-11-02 辽宁师范大学 Unsupervised MRI image restoration method based on antagonism domain self-adaptation
CN113610807A (en) * 2021-08-09 2021-11-05 西安电子科技大学 New coronary pneumonia segmentation method based on weak supervision multitask learning
CN113807356A (en) * 2021-07-29 2021-12-17 北京工商大学 End-to-end low visibility image semantic segmentation method
CN113888547A (en) * 2021-09-27 2022-01-04 太原理工大学 Non-supervision domain self-adaptive remote sensing road semantic segmentation method based on GAN network
CN113888406A (en) * 2021-08-24 2022-01-04 厦门仟易网络科技有限公司 Camera super-resolution method through deep learning
CN114005043A (en) * 2021-10-29 2022-02-01 武汉大学 Small sample city remote sensing image information extraction method based on domain conversion and pseudo label
CN115311138A (en) * 2022-07-06 2022-11-08 北京科技大学 Image super-resolution method and device
CN117911705A (en) * 2024-03-19 2024-04-19 成都理工大学 Brain MRI (magnetic resonance imaging) tumor segmentation method based on GAN-UNet variant network

Citations (7)

Publication number Priority date Publication date Assignee Title
US20150154739A1 (en) * 2013-11-30 2015-06-04 Sharp Laboratories Of America, Inc. Image enhancement using semantic components
CN108710830A (en) * 2018-04-20 2018-10-26 浙江工商大学 A kind of intensive human body 3D posture estimation methods for connecting attention pyramid residual error network and equidistantly limiting of combination
CN108875595A (en) * 2018-05-29 2018-11-23 重庆大学 A kind of Driving Scene object detection method merged based on deep learning and multilayer feature
CN110097129A (en) * 2019-05-05 2019-08-06 西安电子科技大学 Remote sensing target detection method based on profile wave grouping feature pyramid convolution
CN110136062A (en) * 2019-05-10 2019-08-16 武汉大学 A kind of super resolution ratio reconstruction method of combination semantic segmentation
CN111127493A (en) * 2019-11-12 2020-05-08 中国矿业大学 Remote sensing image semantic segmentation method based on attention multi-scale feature fusion
CN112183258A (en) * 2020-09-16 2021-01-05 太原理工大学 Remote sensing image road segmentation method based on context information and attention mechanism

Non-Patent Citations (5)

Title
BIN PAN et al.: "SRDA-Net: Super-Resolution Domain Adaptation Networks for Semantic Segmentation", https://arxiv.org/pdf/2005.06382.pdf *
HAIWEI SANG et al.: "PCANet: Pyramid convolutional attention network for semantic segmentation", Image and Vision Computing *
XUEJUN GUO et al.: "Fully convolutional DenseNet with adversarial training for semantic segmentation of high-resolution remote sensing images", Journal of Applied Remote Sensing *
YI-HSUAN TSAI et al.: "Learning to Adapt Structured Output Space for Semantic Segmentation", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition *
CHEN He et al.: "Multi-scale residual convolutional network with high-order feature representation for vehicle type spectrum-level recognition in SAR remote sensing images", Journal of Signal Processing *

Cited By (17)

Publication number Priority date Publication date Assignee Title
CN113569724A (en) * 2021-07-27 2021-10-29 中国科学院地理科学与资源研究所 Road extraction method and system based on attention mechanism and dilation convolution
CN113807356B (en) * 2021-07-29 2023-07-25 北京工商大学 End-to-end low-visibility image semantic segmentation method
CN113807356A (en) * 2021-07-29 2021-12-17 北京工商大学 End-to-end low visibility image semantic segmentation method
CN113610807A (en) * 2021-08-09 2021-11-05 西安电子科技大学 New coronary pneumonia segmentation method based on weak supervision multitask learning
CN113610807B (en) * 2021-08-09 2024-02-09 西安电子科技大学 New coronaries pneumonia segmentation method based on weak supervision multitask learning
CN113888406B (en) * 2021-08-24 2024-04-23 厦门仟易网络科技有限公司 Camera super-resolution method through deep learning
CN113888406A (en) * 2021-08-24 2022-01-04 厦门仟易网络科技有限公司 Camera super-resolution method through deep learning
CN113592745B (en) * 2021-09-08 2023-11-28 辽宁师范大学 Unsupervised MRI image restoration method based on antagonism domain self-adaption
CN113496221B (en) * 2021-09-08 2022-02-01 湖南大学 Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
CN113496221A (en) * 2021-09-08 2021-10-12 湖南大学 Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
CN113592745A (en) * 2021-09-08 2021-11-02 辽宁师范大学 Unsupervised MRI image restoration method based on antagonism domain self-adaptation
CN113888547A (en) * 2021-09-27 2022-01-04 太原理工大学 Non-supervision domain self-adaptive remote sensing road semantic segmentation method based on GAN network
CN114005043A (en) * 2021-10-29 2022-02-01 武汉大学 Small sample city remote sensing image information extraction method based on domain conversion and pseudo label
CN114005043B (en) * 2021-10-29 2024-04-05 武汉大学 Small sample city remote sensing image information extraction method based on domain conversion and pseudo tag
CN115311138A (en) * 2022-07-06 2022-11-08 北京科技大学 Image super-resolution method and device
CN115311138B (en) * 2022-07-06 2023-06-23 北京科技大学 Image super-resolution method and device
CN117911705A (en) * 2024-03-19 2024-04-19 成都理工大学 Brain MRI (magnetic resonance imaging) tumor segmentation method based on GAN-UNet variant network

Also Published As

Publication number Publication date
CN113160234B (en) 2021-12-14

Similar Documents

Publication Publication Date Title
CN113160234B (en) Unsupervised remote sensing image semantic segmentation method based on super-resolution and domain self-adaptation
CN111047551B (en) Remote sensing image change detection method and system based on U-net improved algorithm
CN113298818B (en) Remote sensing image building segmentation method based on attention mechanism and multi-scale features
CN112966684B (en) Cooperative learning character recognition method under attention mechanism
CN109934200B (en) RGB color remote sensing image cloud detection method and system based on improved M-Net
CN113850825A (en) Remote sensing image road segmentation method based on context information and multi-scale feature fusion
CN115601549B (en) River and lake remote sensing image segmentation method based on deformable convolution and self-attention model
CN109087375B (en) Deep learning-based image cavity filling method
CN111753677B (en) Multi-angle remote sensing ship image target detection method based on characteristic pyramid structure
CN113505792B (en) Multi-scale semantic segmentation method and model for unbalanced remote sensing image
CN113888547A (en) Non-supervision domain self-adaptive remote sensing road semantic segmentation method based on GAN network
CN113850813A (en) Unsupervised remote sensing image semantic segmentation method based on spatial resolution domain self-adaption
CN111178304A (en) High-resolution remote sensing image pixel level interpretation method based on full convolution neural network
CN114494870A (en) Double-time-phase remote sensing image change detection method, model construction method and device
Song et al. PSTNet: Progressive sampling transformer network for remote sensing image change detection
Mu et al. A climate downscaling deep learning model considering the multiscale spatial correlations and chaos of meteorological events
CN113628180B (en) Remote sensing building detection method and system based on semantic segmentation network
CN113313180B (en) Remote sensing image semantic segmentation method based on deep confrontation learning
CN112766381B (en) Attribute-guided SAR image generation method under limited sample
CN112818777B (en) Remote sensing image target detection method based on dense connection and feature enhancement
CN113111740A (en) Characteristic weaving method for remote sensing image target detection
CN117152435A (en) Remote sensing semantic segmentation method based on U-Net3+
CN115082778B (en) Multi-branch learning-based homestead identification method and system
CN115909077A (en) Hyperspectral image change detection method based on unsupervised spectrum unmixing neural network
CN114445726B (en) Sample library establishing method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221221

Address after: No. 47, Xutan East Street, Taiyuan, Shanxi 030031

Patentee after: Shanxi corps of China Building Materials Industry Geological Exploration Center

Address before: 030024 No. 79 West Main Street, Taiyuan, Shanxi, Yingze

Patentee before: Taiyuan University of Technology
