CN116152666A - Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity - Google Patents
- Publication number
- CN116152666A (application number CN202310258590.7A)
- Authority
- CN
- China
- Prior art keywords
- domain
- seg
- style
- style migration
- heterogeneity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a cross-domain remote sensing image adaptive learning method that considers the phenological (seasonal growth-cycle) heterogeneity of ground features. The method comprises: given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training a style migration network $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground features; training a semantic segmentation network $M_{seg}$ with the style migration samples to reduce the class-wise feature distribution difference between the source and target domains; and constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ to improve cross-temporal remote sensing image domain-adaptive learning and complete the semantic segmentation of $X_B$. The method reduces the class-wise feature distribution difference between different temporal domains, improves the domain-adaptive learning effect of the model, and strengthens the information interaction between the style migration process and the semantic segmentation process.
Description
Technical Field
The invention belongs to the technical field of domain-adaptive learning, and in particular relates to a cross-domain remote sensing image adaptive learning method that considers the phenological heterogeneity of ground features.
Background
Remote sensing has become an important means of land-cover classification research thanks to its wide coverage, high spatio-temporal resolution and large information content. With global economic integration, the scale of land-cover classification is no longer limited to local areas within a country but extends to regional or global scales. Traditional manual interpretation involves a heavy workload and slow updating, and cannot meet the requirements of modern map production. In recent years, progress in artificial intelligence technologies such as deep learning has advanced the intelligent interpretation of remote sensing images, and a large number of data-driven, deep-learning-based interpretation methods have emerged.
However, the good performance of existing data-driven deep learning models for remote sensing images requires that the test data and the training data be independent and identically distributed. In real applications of multi-temporal remote sensing image classification, the labeled training data (source domain) and the unlabeled test data (target domain) often come from different data distributions and show obvious differences in visual style, so a model trained on the source domain performs poorly on the target domain. To address this problem, much recent work has used deep neural networks to learn mappings between image domains. The motivation is to migrate the style of remote sensing images from one domain to another specific domain, so that the generated migration samples are visually closer to the samples of that domain and thereby support cross-temporal remote sensing image domain-adaptive learning. Yang et al., in the non-patent literature "FG-GAN: A Fine-Grained Generative Adversarial Network for Unsupervised SAR-to-Optical Image Translation," IEEE Trans. Geosci. Remote Sensing, vol. 60, pp. 1-11, 2022, doi: 10.1109/TGRS.2022.3165371, propose integrating dense connection modules and residual modules in the generation network and using a multi-scale discrimination network to enhance the ability of the style migration model to characterize the radiometric properties of remote sensing images. Tasar et al., in the non-patent literature "DAugNet: Unsupervised, Multisource, Multitarget, and Life-Long Domain Adaptation for Semantic Segmentation of Satellite Images," IEEE Trans. Geosci. Remote Sensing, vol. 59, no. 2, pp. 1067-1081, Feb. 2021, doi: 10.1109/TGRS.2020.3006161, describe the style of an image using the per-channel statistics of its features, and use adaptive instance normalization to adjust the mean and variance of each channel of the input image features so that they match the style of the target remote sensing image. Zhang et al., in the non-patent literature "Remote Sensing Image Translation via Style-Based Recalibration Module and Improved Style Discriminator," IEEE Geosci. Remote Sensing Lett., vol. 19, pp. 1-5, 2022, doi: 10.1109/LGRS.2021.306858, introduce a style-based feature recalibration module that assigns a learned weight to each feature channel according to how important that channel's statistics are for remote sensing style migration, so that the style migration network more quickly captures the style information that deserves attention.
However, current methods that use style migration samples for cross-temporal remote sensing image domain-adaptive learning implicitly assume that the visual style of every ground object in a remote sensing scene changes uniformly. They focus mainly on reducing the radiometric differences introduced by the external imaging process and neglect the phenological differences among the ground objects within the scene. Phenological differences affect the visual style change of remote sensing images in two ways. On the one hand, compared with phenology-insensitive ground objects such as artificial surfaces, phenology-sensitive ground objects are especially prone to morphological changes driven by the seasonal cycle, such as germination, leaf expansion, leaf discoloration and leaf fall in plants. On the other hand, the phenological patterns of different ground objects are heterogeneous [3]: for example, forests have a relatively long growing season and usually undergo only one phenological cycle per year, whereas farmland crops have a short growing season, and double- or even triple-cropped farmland can undergo several phenological cycles per year. Consequently, when these methods are applied to target remote sensing images of regions with severe spectral mixing of ground objects and relatively fragmented ecological landscapes, the results struggle to delineate the boundaries of phenology-sensitive ground objects and readily confuse phenology-sensitive ground objects (such as farmland) with phenology-insensitive ones (such as artificial surfaces).
Against this background, methods that use style migration samples for cross-temporal remote sensing image domain-adaptive learning have two main problems to be solved: (1) The visual style change of a remote sensing scene is affected both by the radiometric differences introduced by the external imaging process and by the phenological differences of the ground objects within the scene; however, the style migration samples generated by the prior art cannot simulate the phenological heterogeneity of ground features, which hinders domain-adaptive classification. (2) The style migration network cannot interact with the semantic segmentation network during feature learning, so the style migration samples pass insufficient information to the domain-adaptive learning process, which limits the domain-adaptive learning capability of the model.
Disclosure of Invention
In view of this, the framework proposed by the invention comprises a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$. Given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, the goal is to train $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground features, to train $M_{seg}$ with these samples so as to reduce the class-wise feature distribution difference between the source and target domains, and at the same time to construct a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain-adaptive learning and completes the semantic segmentation of $X_B$.
The invention discloses a cross-domain remote sensing image adaptive learning method considering the phenological heterogeneity of ground features, which is applied to a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$ and comprises the following steps:
given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground features;
training $M_{seg}$ with the style migration samples to reduce the class-wise feature distribution difference between the source and target domains, while constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain-adaptive learning and completes the semantic segmentation of $X_B$.
Further, given an image feature map $F$ and a ground-object segmentation map of the corresponding scale, a category feature map is obtained for each ground-object category $k$, and the ground-object category style of the image is represented by the mean $\mu_k(F)$ and variance $\sigma_k(F)$ of the category features along the channel dimension, wherein
[Formula (1) is not reproduced in this text.]
Two parameter sets represent the ground-object category styles of the two image domains, where $\mu_k$ denotes the mean, $\sigma_k$ the variance, and $\beta_k$, $\gamma_k$ the mathematical expectations of $\mu_k$, $\sigma_k$. Assuming that the image domain contains $N_k$ sampled pixels of ground-object category $k$ in total, then
[Formula (2) is not reproduced in this text.]
A subset of the samples is randomly selected to initialize $\beta_k$, $\gamma_k$ with formula (2), and $\beta_k$, $\gamma_k$ are then updated gradually with a moving average during model training:
$$\beta_k \leftarrow \lambda \beta_k + (1-\lambda)\,\mu_k(F)$$
$$\gamma_k \leftarrow \lambda \gamma_k + (1-\lambda)\,\sigma_k(F) \qquad (3)$$
where $\lambda$ is the momentum coefficient, set to 0.9999.
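The moving-average update can be illustrated with a minimal NumPy sketch. The function names `class_style_stats` and `ema_update`, the array layout (channels first) and the handling of absent classes are assumptions made for illustration; they are not taken from the patent text.

```python
import numpy as np

def class_style_stats(feat, seg, k):
    """Per-channel mean and variance of the features on pixels labelled k.

    feat: (C, H, W) feature map F; seg: (H, W) integer class map.
    Returns None when class k does not occur in seg (illustrative choice).
    """
    mask = seg == k
    if not mask.any():
        return None
    pixels = feat[:, mask]          # shape (C, N_k)
    mu = pixels.mean(axis=1)        # mu_k(F)
    sigma = pixels.var(axis=1)      # sigma_k(F); the text calls this the variance
    return mu, sigma

def ema_update(beta_k, gamma_k, mu, sigma, lam=0.9999):
    """Moving-average update of the class style expectations, as in formula (3)."""
    beta_k = lam * beta_k + (1.0 - lam) * mu
    gamma_k = lam * gamma_k + (1.0 - lam) * sigma
    return beta_k, gamma_k
```

With $\lambda = 0.9999$ each batch moves the running values by only 0.01% of the gap, which is why an initial estimate from a random subset of samples is needed before training starts.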
The embedded features are then regularized category by category through adaptive segmented instance normalization (AdaSIN), defined as follows:
[Formula (4) is not reproduced in this text.]
where the parameter set represents the ground-object category styles and $w_k$ is the phenological sensitivity factor: $w_k = 1$ if category $k$ is a phenology-sensitive ground object and $w_k = 0$ otherwise, so that style regularization is applied only to the category features of phenology-sensitive ground objects.
Further, the style migration process considering the phenological heterogeneity of ground features comprises the following steps:
samples $X_A$ and $X_B$ from the two domain spaces are passed through a domain encoder to obtain embedded features $F_A$ and $F_B$; $F_A$ and its segmentation map $Y_A$ are combined with the ground-object category style parameters, and $F_B$ and its pseudo-segmentation map are likewise combined with the ground-object category style parameters;
style regularization is applied to the category features of phenology-sensitive ground objects through the category-by-category regularization to obtain $F_{AB}$ and $F_{BA}$, and the style migration samples $X_{AB}$ and $X_{BA}$ are then obtained through a domain decoder;
the style migration network is trained with adversarial learning: $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$. Each discriminator adopts a semantic segmentation network structure: it must both distinguish real samples from style migration samples and correctly classify the ground objects in real samples.
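Further, the adversarial loss function of the discriminators is defined as:
[The discriminator loss formula is not reproduced in this text.]
where $y_A$ denotes the real segmentation map of the source-domain sample $x_A$, and the pseudo-segmentation map of the target-domain sample $x_B$ is predicted by $M_{seg}$ (initially the source-domain model);
correspondingly, the adversarial loss function of the generators is defined as:
[The generator adversarial loss formula is not reproduced in this text.]
Cross style migration samples $X_{ABA}$ and $X_{BAB}$ are generated by combining $X_{AB}$ and $X_{BA}$ with the corresponding ground-object category style parameters, and the cross-reconstruction consistency loss is minimized:
[The cross-reconstruction consistency loss formula is not reproduced in this text.]
At the same time, the reconstructed samples $X_{AA}$ and $X_{BB}$, generated by combining $X_A$ and $X_B$ with the corresponding ground-object category style parameters, should remain consistent with the original samples, so the self-reconstruction consistency loss is minimized:
[The self-reconstruction consistency loss formula is not reproduced in this text.]
The final objective loss function of the generators is defined as:
[The generator objective formula is not reproduced in this text.]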
Further, the bidirectional model optimization in the bidirectional optimization learning mechanism includes two directions: ($M_{seg} \to M_{st}$) and ($M_{st} \to M_{seg}$). The rounds of bidirectional optimization are indexed by $m$; the initial model is the source-domain model, and the target-domain pseudo-segmentation map is generated by the current segmentation model;
the ($M_{seg} \to M_{st}$) direction uses the target-domain pseudo-labels predicted by the semantic segmentation network to train the style migration network that accounts for the phenological heterogeneity of ground features;
given the prediction $p_B = M_{seg}(x_B)$ of $M_{seg}$ on the target domain, a confidence threshold $d$ is set to screen $p_B$, and the high-confidence predictions are selected as pseudo-labels for training $M_{st}$;
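The confidence screening step can be sketched as follows. This is a minimal illustration, assuming $p_B$ is a per-pixel softmax output of shape (classes, height, width); the ignore value 255 and the default threshold 0.9 are hypothetical placeholders, since the patent text does not state them here.

```python
import numpy as np

IGNORE_INDEX = 255  # hypothetical marker for pixels without a pseudo-label

def select_pseudo_labels(p_b, d=0.9, ignore_index=IGNORE_INDEX):
    """Keep the arg-max class where its probability exceeds d, else ignore.

    p_b: (K, H, W) class probabilities predicted by M_seg for one target image.
    Returns an (H, W) integer map of pseudo-labels.
    """
    confidence = p_b.max(axis=0)    # highest class probability per pixel
    labels = p_b.argmax(axis=0)     # predicted class per pixel
    return np.where(confidence > d, labels, ignore_index)
```

Pixels marked with the ignore value would simply be excluded from any loss that uses the pseudo-labels.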
the ($M_{st} \to M_{seg}$) direction optimizes the semantic segmentation network with the style migration samples. $M_{seg}$ is first trained with the source-domain data and their real segmentation maps:
[The segmentation loss formula is not reproduced in this text.]
Then, given the style migration result $x_{BA}$ that the trained $M_{st}$ produces for a target-domain sample, $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$ are obtained;
the predictions of $M_{seg}$ for a target-domain sample and for its style migration result should remain consistent, so the prediction consistency loss function is minimized:
[The prediction consistency loss formula is not reproduced in this text.]
At the same time, for the high-confidence region of $p_B$, where $\max(p_B) > d$, the mutual learning loss function of the migration samples is minimized:
[The mutual learning loss formula is not reproduced in this text.]
The domain-adaptive objective loss function of semantic segmentation is therefore defined as:
[The objective formula is not reproduced in this text.]
The beneficial effects of the invention are as follows:
(1) Starting from the law of phenological heterogeneity of ground features, the method designs an adaptive segmented instance normalization (AdaSIN) module that applies category-by-category regularization to the embedded features, so that the style migration network can generate style migration samples that account for phenological heterogeneity. Compared with style migration samples generated by conventional methods, these samples are better at reducing the class-wise feature distribution difference between temporal domains, help improve the domain-adaptive learning effect of the model, and can be extended to cross-domain scene classification, cross-domain semantic segmentation, change detection and other tasks involving multi-temporal data.
(2) By designing a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network, the method strengthens the information interaction between the style migration process and the semantic segmentation process and further improves the domain-adaptive learning capability of the model.
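The alternation between the two optimization directions can be sketched as a simple loop. This is only a skeleton under stated assumptions: the callables `predict`, `train_st` and `train_seg` are hypothetical stand-ins supplied by the caller, and the order of the two steps within a round is assumed rather than taken from the pseudocode of FIG. 2, which is not reproduced in this text.

```python
def bidirectional_optimization(m_seg, m_st, x_a, y_a, x_b, rounds,
                               predict, train_st, train_seg):
    """Alternate the (M_seg -> M_st) and (M_st -> M_seg) directions.

    m_seg starts as the source-domain model. All three callables are
    hypothetical hooks: predict returns target pseudo-labels, train_st
    returns an updated style migration network, train_seg returns an
    updated segmentation network.
    """
    for _ in range(rounds):
        # (M_seg -> M_st): pseudo-labels from the current segmentation model
        pseudo_b = predict(m_seg, x_b)
        m_st = train_st(m_st, x_a, y_a, x_b, pseudo_b)
        # (M_st -> M_seg): refine segmentation with style migration samples
        m_seg = train_seg(m_seg, m_st, x_a, y_a, x_b)
    return m_seg, m_st
```

Passing the training steps in as callables keeps the sketch independent of any particular deep learning framework.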
Drawings
FIG. 1 is a schematic diagram of the style migration process considering the phenological heterogeneity of ground features, in which rectangles and triangles represent phenology-sensitive ground-object categories and circles represent phenology-insensitive ground-object categories;
FIG. 2 is the pseudocode of the bidirectional optimization algorithm for the style migration network and the semantic segmentation network;
FIG. 3 shows the semantic segmentation results of different domain-adaptive learning methods;
FIG. 4 is a graph of a sensitivity analysis of confidence threshold parameters.
Detailed Description
The invention is further described below with reference to the accompanying drawings, without limiting the invention in any way, and any alterations or substitutions based on the teachings of the invention are intended to fall within the scope of the invention.
The framework proposed by the invention comprises a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$. Given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, the object of the invention is to train $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground features, to train $M_{seg}$ with these samples so as to reduce the class-wise feature distribution difference between the source and target domains, and at the same time to construct a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain-adaptive learning and completes the semantic segmentation of $X_B$. The following first describes how the style migration samples that account for phenological heterogeneity are generated, and then describes the bidirectional optimization learning mechanism of $M_{st}$ and $M_{seg}$ in detail.
(1) Style migration considering the phenological heterogeneity of ground features
Consider an image feature map $F$ and a ground-object segmentation map of the corresponding scale. Given the law of phenological heterogeneity, different ground-object categories in an image scene should have different style characteristics. Therefore, for each ground-object category $k$ a category feature map is obtained, and the invention represents the ground-object category style of the image by the mean $\mu_k(F)$ and variance $\sigma_k(F)$ of the category features along the channel dimension, wherein
[Formula (1) is not reproduced in this text.]
The invention uses two parameter sets to represent the ground-object category styles of the two image domains, where $\beta_k$, $\gamma_k$ denote the mathematical expectations of $\mu_k$, $\sigma_k$. Assuming that the image domain contains $N_k$ sampled pixels of ground-object category $k$ in total, then
[Formula (2) is not reproduced in this text.]
Computing these expectations over the whole domain consumes too many computational resources and hinders model training. Therefore, the invention randomly selects a subset of the samples to initialize $\beta_k$, $\gamma_k$ with formula (2), and then updates $\beta_k$, $\gamma_k$ gradually with a moving average during model training:
$$\beta_k \leftarrow \lambda \beta_k + (1-\lambda)\,\mu_k(F)$$
$$\gamma_k \leftarrow \lambda \gamma_k + (1-\lambda)\,\sigma_k(F) \qquad (3)$$
where $\lambda$ is the momentum coefficient, set to 0.9999. The invention can then regularize the embedded features category by category through adaptive segmented instance normalization (Adaptive Segmented Instance Normalization, AdaSIN), defined as follows:
[Formula (4) is not reproduced in this text.]
where the parameter set represents the ground-object category styles and $w_k$ is the phenological sensitivity factor: $w_k = 1$ if category $k$ is a phenology-sensitive ground object and $w_k = 0$ otherwise, so that style regularization is applied only to the category features of phenology-sensitive ground objects.
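Because the AdaSIN formula itself is not available in this text, the following NumPy sketch only illustrates the idea under an explicit assumption: that AdaSIN follows the familiar AdaIN pattern (normalize with the features' own statistics, then rescale with the target style), applied separately to the pixels of each phenology-sensitive category. The function name `adasin`, the dictionary arguments and the use of the standard deviation for scaling are illustrative choices, not the patent's definition.

```python
import numpy as np

def adasin(feat, seg, target_style, sensitive, eps=1e-5):
    """Class-wise AdaIN-style regularization (assumed form, not the patent's formula).

    feat: (C, H, W) embedded features; seg: (H, W) integer class map.
    target_style: dict k -> (beta_k, gamma_k), each an array of shape (C,).
    sensitive: dict k -> w_k, with 1 for phenology-sensitive categories.
    """
    out = feat.copy()
    for k, (beta_k, gamma_k) in target_style.items():
        mask = seg == k
        if sensitive.get(k, 0) != 1 or not mask.any():
            continue  # w_k = 0: features of this category are left unchanged
        pixels = feat[:, mask]                                  # (C, N_k)
        mu = pixels.mean(axis=1, keepdims=True)
        std = np.sqrt(pixels.var(axis=1, keepdims=True) + eps)
        out[:, mask] = gamma_k[:, None] * (pixels - mu) / std + beta_k[:, None]
    return out
```

In this sketch the pixels of insensitive categories pass through untouched, which mirrors the role of $w_k = 0$ described above.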
The style migration process considering the phenological heterogeneity of ground features is as follows. As shown in FIG. 1, samples $X_A$ and $X_B$ from the two domain spaces are passed through a domain encoder to obtain embedded features $F_A$ and $F_B$. $F_A$ and its segmentation map $Y_A$ are combined with the ground-object category style parameters, and $F_B$ and its pseudo-segmentation map are likewise combined with the ground-object category style parameters. Style regularization is applied to the category features of phenology-sensitive ground objects through formula (4) to obtain $F_{AB}$ and $F_{BA}$, and the style migration samples $X_{AB}$ and $X_{BA}$ are then obtained through a domain decoder. To make the ground-object category feature distributions of $X_A$ and the style migration sample $X_{BA}$, and of $X_B$ and the style migration sample $X_{AB}$, as close as possible, the invention trains the style migration network with adversarial learning. $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$. Each discriminator adopts a semantic segmentation network structure: it must both distinguish real samples from style migration samples and accurately classify the ground objects in real samples. The adversarial loss function of the discriminators is therefore defined as:
[The discriminator loss formula is not reproduced in this text.]
where $y_A$ denotes the real segmentation map of the source-domain sample $x_A$, and the pseudo-segmentation map of the target-domain sample $x_B$ is predicted by $M_{seg}$ (initially the source-domain model). Correspondingly, the adversarial loss function of the generators is defined as:
[The generator adversarial loss formula is not reproduced in this text.]
To ensure that $X_A$ and $X_{AB}$, and $X_B$ and $X_{BA}$, remain semantically consistent after style migration, the invention first generates the cross style migration samples $X_{ABA}$ and $X_{BAB}$ by combining $X_{AB}$ and $X_{BA}$ with the corresponding ground-object category style parameters, and minimizes the cross-reconstruction consistency loss:
[The cross-reconstruction consistency loss formula is not reproduced in this text.]
At the same time, the reconstructed samples $X_{AA}$ and $X_{BB}$, generated by combining $X_A$ and $X_B$ with the corresponding ground-object category style parameters, should remain consistent with the original samples, so the self-reconstruction consistency loss is minimized:
[The self-reconstruction consistency loss formula is not reproduced in this text.]
The final objective loss function of the generators is thus defined as:
[The generator objective formula is not reproduced in this text.]
(2) Two-way optimized learning mechanism
Bidirectional model optimization includes two directions: ($M_{seg} \to M_{st}$) and ($M_{st} \to M_{seg}$). The rounds of bidirectional optimization are indexed by $m$, and the detailed optimization procedure is shown in FIG. 2. The initial model is the source-domain model, and the target-domain pseudo-segmentation map is generated by the current segmentation model.
The ($M_{seg} \to M_{st}$) direction uses the target-domain pseudo-labels predicted by the semantic segmentation network to train the style migration network that accounts for the phenological heterogeneity of ground features. Given the prediction $p_B = M_{seg}(x_B)$ of $M_{seg}$ on the target domain, the invention sets a confidence threshold $d$ to screen $p_B$, and the high-confidence predictions are selected as pseudo-labels for training $M_{st}$. For a target pixel, its pseudo-label is expressed as:
[The pseudo-label formula is not reproduced in this text.]
The ($M_{st} \to M_{seg}$) direction optimizes the semantic segmentation network with the style migration samples. The invention first trains $M_{seg}$ with the source-domain data and their real segmentation maps:
[The segmentation loss formula is not reproduced in this text.]
Then, given the style migration result $x_{BA}$ that the trained $M_{st}$ produces for a target-domain sample, the invention obtains $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$. The predictions of $M_{seg}$ for a target-domain sample and for its style migration result should remain consistent, so the invention minimizes the prediction consistency loss function:
[The prediction consistency loss formula is not reproduced in this text.]
At the same time, for the high-confidence region of $p_B$, where $\max(p_B) > d$, the invention minimizes the mutual learning loss function of the migration samples:
[The mutual learning loss formula is not reproduced in this text.]
The domain-adaptive objective loss function of semantic segmentation is therefore defined as:
[The objective formula is not reproduced in this text.]
To verify the effectiveness of the method, cross-temporal semantic segmentation comparison experiments were carried out on the same dataset against two typical domain-adaptive learning techniques: a self-training-based domain-adaptive learning method (CBST) and a domain-adaptive learning method that uses style migration samples (DAugNet). The invention uses overall accuracy (OA), the Kappa coefficient (Kappa) and frequency-weighted intersection over union (FWIoU) as the overall classification metrics, and intersection over union (IoU) as the per-class metric. In terms of the quantitative indices (see Table 1), relative to the baseline, CBST yields the smallest improvement in cross-temporal semantic segmentation and the method of the invention yields the largest. Compared with DAugNet, which uses conventional style migration samples, the method of the invention performs domain-adaptive learning with style migration samples that account for the phenological heterogeneity of ground features, and improves the overall indices OA, Kappa and FWIoU by 3.58%, 5.35% and 5.71%, respectively. In terms of the qualitative results (see FIG. 3), conventional domain-adaptive learning methods tend to confuse cultivated land with water, whereas the method of the invention reduces the class-wise feature distribution difference between temporal domains by using style migration samples that account for phenological heterogeneity and builds a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network. The confusion between cultivated land and water is markedly reduced, which shows that the method has a clear advantage in distinguishing phenology-sensitive ground objects (cultivated land) from phenology-insensitive ones (water).
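For reference, the four metrics named above can all be derived from one confusion matrix. The sketch below uses the standard textbook definitions; the function name `segmentation_metrics` and the convention that rows are true classes are choices made for this example, and the evaluation code used in the experiments is not described in the text.

```python
import numpy as np

def segmentation_metrics(conf):
    """OA, Kappa, FWIoU and per-class IoU from a confusion matrix.

    conf[i, j] = number of pixels of true class i predicted as class j.
    Classes absent from both truth and prediction would give a 0/0 IoU;
    this sketch assumes every class occurs at least once.
    """
    conf = np.asarray(conf, dtype=np.float64)
    total = conf.sum()
    tp = np.diag(conf)
    true_per_class = conf.sum(axis=1)
    pred_per_class = conf.sum(axis=0)
    iou = tp / (true_per_class + pred_per_class - tp)
    oa = tp.sum() / total
    expected = (true_per_class * pred_per_class).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    fwiou = ((true_per_class / total) * iou).sum()
    return {"OA": oa, "Kappa": kappa, "FWIoU": fwiou, "IoU": iou}
```

FWIoU weights each class IoU by the share of pixels that truly belong to that class, so large classes dominate it, while the per-class IoU values expose errors on small classes.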
Table 1. Semantic segmentation comparison experiment results (%)
To verify the feasibility of the method, a set of multi-temporal remote sensing image data covering the Xiangtan City area in Hunan Province, China, was selected for the experiments. The data were acquired by the GF-2 sensor at a resolution of 2 m; the source domain data set was acquired in 2018 and the target domain data set in 2019. The source and target domains each contain 4232 remote sensing images (512 × 512 pixels), labeled with six ground object classes: null, cultivated land, woodland, grassland, water, and artificial surface. The source domain images and labels together with the target domain images are used to train the domain adaptive model, and the target domain labels are used only to test it. The method of the invention is compared with two typical domain adaptive learning methods; the experimental results are shown in Fig. 3 and Table 1, and the method of the invention shows stronger domain adaptive learning capability. In addition, three questions are discussed: (1) the contribution of style migration samples that consider ground feature weather heterogeneity to cross-time-phase semantic segmentation of remote sensing images; (2) the effect of the bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network; (3) the influence of the confidence threshold on model optimization during target pseudo-label generation.
Table 2. Ablation experiment results (%) for each component of the method of the invention, where ST denotes pseudo-label self-learning, AdaIN denotes adaptive instance normalization, AdaSIN denotes the adaptive segmented instance normalization proposed by the invention, and m denotes the number of bidirectional optimization learning rounds
First, the ablation experiment results (see Table 2) show that, compared with pseudo-label self-learning (ST), introducing style migration samples into the domain adaptive learning process yields a more pronounced improvement. However, the style migration samples that conventional style migration methods generate with adaptive instance normalization (Adaptive Instance Normalization, AdaIN) only reduce the radiometric difference between images and ignore ground feature weather heterogeneity, which limits the domain adaptive learning capability of the model. Starting from the rule of ground feature weather heterogeneity, the method of the invention designs adaptive segmented instance normalization (Adaptive Segmented Instance Normalization, AdaSIN) to apply category-by-category regular constraints to the embedded features, so that the style migration network can generate style migration samples that consider ground feature weather heterogeneity. Compared with AdaIN, the proposed AdaSIN improves the domain adaptive learning capability of the model, and all indexes of the experimental results improve markedly.
Second, the method feeds the output of the semantic segmentation network into the style migration network as pseudo-label information, building a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network. Table 2 shows that, when the number of bidirectional optimization learning rounds m > 1, each additional round of bidirectional optimization further improves the experimental results. This indicates that the bidirectional optimization learning mechanism strengthens the information interaction between the style migration process and the domain adaptive learning process, and thereby further improves the domain adaptive learning capability of the model.
Finally, to study the influence of the confidence threshold d on model optimization during pseudo-label generation, a series of model optimization experiments was carried out with d set within the range [0.4, 0.8]. As shown in Fig. 4, the model achieves its best performance when d is set to 0.7. When d is too small (for example, less than 0.5), the pseudo-labels contain more erroneous information and model performance is weaker; for d between 0.6 and 0.7, the effect on model optimization is not significant; when d is too large (for example, greater than 0.7), model performance degrades slightly, likely because a larger d reduces the number of pseudo-labels generated and thus limits model retraining.
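The trade-off described above (a small d admits noisy labels, a large d leaves few labels) can be illustrated with a short sketch of confidence-based pseudo-label screening. This is an illustration under stated assumptions, not the patent's implementation: the function name `select_pseudo_labels` and the `IGNORE_INDEX` value of 255 are hypothetical, and the input is assumed to be a per-pixel softmax probability map of shape (K, H, W).

```python
import numpy as np

IGNORE_INDEX = 255  # hypothetical marker for pixels excluded from training

def select_pseudo_labels(probs, d=0.7, ignore_index=IGNORE_INDEX):
    """Keep the arg-max class only where the maximum class probability exceeds d.

    probs: array of shape (K, H, W) with per-pixel class probabilities.
    Returns the (H, W) pseudo-label map and the fraction of pixels kept.
    """
    confidence = probs.max(axis=0)
    labels = probs.argmax(axis=0).astype(np.int64)
    keep = confidence > d
    labels[~keep] = ignore_index   # low-confidence pixels carry no label
    return labels, keep.mean()
```

Raising d lowers the kept fraction returned by the function, which is the quantity that shrinks when model retraining becomes limited.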
The beneficial effects of the invention are as follows:
(1) Starting from the rule of ground feature weather heterogeneity, the method of the invention designs an adaptive segmented instance normalization module (AdaSIN) to apply category-by-category regular constraints to the embedded features, so that the style migration network can generate style migration samples that consider ground feature weather heterogeneity. Compared with style migration samples generated by conventional methods, these samples are more effective at reducing the category feature distribution difference between different time-phase domains, which helps improve the domain adaptive learning effect of the model. The approach can be extended to cross-domain scene classification, cross-domain semantic segmentation, change detection, and other tasks that involve multi-temporal feature data.
(2) By designing a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network, the method of the invention strengthens the information interaction between the style migration process and the semantic segmentation process, and further improves the domain adaptive learning capability of the model.
The word "preferred" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word "preferred" is intended to present concepts in a concrete fashion. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from the context, "X uses A or B" is intended to mean any of the natural inclusive permutations. That is, if X uses A; X uses B; or X uses both A and B, then "X uses A or B" is satisfied in any of the foregoing instances.
Moreover, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The present disclosure includes all such modifications and alterations and is limited only by the scope of the following claims. In particular regard to the various functions performed by the above described components (e.g., elements, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated exemplary implementations of the disclosure. Furthermore, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Moreover, to the extent that the terms "includes," "has," "contains," or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising."
The functional units in the embodiment of the invention may be integrated in one processing module, each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or as software functional modules. If implemented as software functional modules and sold or used as a stand-alone product, the integrated modules may also be stored in a computer-readable storage medium. The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. The above-mentioned devices or systems may perform the methods in the corresponding method embodiments.
In summary, the foregoing embodiment is one implementation of the present invention, but the implementation of the present invention is not limited to this embodiment. Any other changes, modifications, substitutions, combinations, and simplifications made without departing from the spirit and principles of the present invention shall be regarded as equivalent substitutions and are included in the protection scope of the present invention.
Claims (5)
1. A cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity, characterized in that the method is applied to a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$, and comprises the following steps:
given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training $M_{st}$ to generate style migration samples that consider ground feature weather heterogeneity;
training $M_{seg}$ with the style migration samples to reduce the category feature distribution difference between the source domain and the target domain, while constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ to improve the cross-time-phase remote sensing image domain adaptive learning capability, and completing the semantic segmentation task on $X_B$;
the bidirectional model optimization in the bidirectional optimization learning mechanism comprises two directions, $(M_{seg} \to M_{st})$ and $(M_{st} \to M_{seg})$; for the $m$-th round of bidirectional model optimization, the initial model is the source domain model, and the target domain pseudo-segmentation map is generated by the semantic segmentation network; the $(M_{seg} \to M_{st})$ optimization direction uses the target domain pseudo-labels predicted by the semantic segmentation network to train the style migration network that considers ground feature weather heterogeneity, and the $(M_{st} \to M_{seg})$ optimization direction uses the style migration samples to optimize the semantic segmentation network.
2. The cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity according to claim 1, characterized in that, given an image feature map $F$ and the ground object segmentation map of the corresponding scale, a category feature map is obtained for each ground object category $k$, and the ground-object-category style of the image is represented by the mean $\mu_k(F)$ and variance $\sigma_k(F)$ of the category feature along the channel dimension;
two ground-object-category style parameter sets respectively represent the different image domains, where $\mu_k$ denotes the mean, $\sigma_k$ denotes the variance, and $\beta_k$, $\gamma_k$ denote the mathematical expectations of $\mu_k$, $\sigma_k$; assuming that, for ground object category $k$, the image domain contains $N_k$ sample pixels in total, then
a part of the samples is randomly selected to initialize $\beta_k$, $\gamma_k$, which are then gradually updated by a moving average during model training:
$$\beta_k \leftarrow \lambda \beta_k + (1 - \lambda)\,\mu_k(F)$$
$$\gamma_k \leftarrow \lambda \gamma_k + (1 - \lambda)\,\sigma_k(F)$$
where $\lambda$ is the momentum coefficient, set to 0.9999;
the embedded features are subjected to category-by-category regular constraints through adaptive segmented instance normalization, which is defined as follows:
where the ground-object-category style parameter set supplies the target style and $w_k$ denotes the weather-sensitive factor: if category $k$ is a weather-sensitive ground object, $w_k = 1$; otherwise $w_k = 0$; that is, style regularization is applied only to the features of weather-sensitive ground object categories.
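The class-wise statistics, the moving-average update, and the weather-sensitive gating described in this claim can be sketched as follows. The normalization formula itself is not reproduced in this text, so `adasin_sketch` assumes an AdaIN-style form (normalize the class pixels, then rescale by $\gamma_k$ and shift by $\beta_k$); that form, the use of the standard deviation for $\sigma_k$, the `EPS` constant, and all function names are assumptions made for illustration only.

```python
import numpy as np

EPS = 1e-5  # assumed numerical-stability constant

def class_stats(feat, seg, k):
    """Channel-wise mean and standard deviation of feat (C, H, W) over pixels with seg == k."""
    mask = seg == k
    if not mask.any():
        return None
    pixels = feat[:, mask]                      # shape (C, N_k)
    return pixels.mean(axis=1), pixels.std(axis=1)

def ema_update(beta, gamma, mu, sigma, lam=0.9999):
    """Moving-average update of the class style parameters, as in the two update rules above."""
    return lam * beta + (1 - lam) * mu, lam * gamma + (1 - lam) * sigma

def adasin_sketch(feat, seg, beta, gamma, sensitive):
    """Assumed AdaIN-style class-wise restyling, applied only where w_k == 1.

    beta, gamma: dicts mapping class k to (C,) target style arrays.
    sensitive:   dict mapping class k to the weather-sensitive factor w_k (0 or 1).
    """
    out = feat.copy()
    for k, w in sensitive.items():
        stats = class_stats(feat, seg, k)
        if stats is None or w == 0:
            continue                            # insensitive classes keep their features
        mu, sigma = stats
        mask = seg == k
        normed = (feat[:, mask] - mu[:, None]) / (sigma[:, None] + EPS)
        out[:, mask] = gamma[k][:, None] * normed + beta[k][:, None]
    return out
```

With $\lambda = 0.9999$ each batch moves the stored parameters by only 0.01% of the gap to the current statistics, so $\beta_k$ and $\gamma_k$ track a long-run average over the domain rather than any single image.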
3. The cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity according to claim 2, characterized in that the style migration process considering ground feature weather heterogeneity comprises:
passing samples $X_A$ and $X_B$ from the different domain spaces through the domain encoders to obtain embedded features $F_A$ and $F_B$; combining $F_A$ and its segmentation map $Y_A$ with the ground-object-category style parameters, and combining $F_B$ and its pseudo-segmentation map with the corresponding ground-object-category style parameters;
performing style regularization on the features of the weather-sensitive ground object categories through the category-by-category regular constraints to obtain $F_{AB}$ and $F_{BA}$, and subsequently obtaining style migration samples $X_{AB}$ and $X_{BA}$ through the domain decoders;
training the style migration network by adversarial learning: $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$; the discriminators adopt a semantic segmentation network structure, so that they not only discriminate between real samples and style migration samples but also correctly classify the ground objects in the real samples.
4. The cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity according to claim 3, characterized in that the adversarial loss function of the discriminator is defined as:
where $y_A$ denotes the real segmentation map of the source domain sample $x_A$, and the pseudo-segmentation map of the target domain sample $x_B$ is predicted by $M_{seg}$;
correspondingly, the adversarial loss function of the generator is defined as:
the cross style migration samples $X_{ABA}$ and $X_{BAB}$, generated from $X_{AB}$ and $X_{BA}$ together with their corresponding segmentation maps, should be consistent with the original samples, so the cross-reconstruction consistency loss is minimized:
at the same time, the reconstructed samples $X_{AA}$ and $X_{BB}$, generated from $X_A$ and $X_B$ together with their corresponding segmentation maps, should also stay consistent with the original samples, so the self-reconstruction consistency loss is minimized:
the target loss function of the final generator is defined as:
5. The cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity according to claim 4, characterized in that, for the $(M_{seg} \to M_{st})$ optimization direction:
given the prediction result $p_B = M_{seg}(x_B)$ of $M_{seg}$ on the target domain, a confidence threshold $d$ is set to screen $p_B$, and the high-confidence prediction results are selected as pseudo-labels for training $M_{st}$;
for the $(M_{st} \to M_{seg})$ optimization direction:
$M_{seg}$ is trained using the source domain data and its real segmentation maps:
given the style migration result $x_{BA}$ that the trained $M_{st}$ produces for the target domain sample, $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$ are obtained;
the predictions of $M_{seg}$ for the target domain samples and for their style migration results should remain consistent, so the prediction consistency loss function is minimized:
at the same time, for the high-confidence region of $p_B$ (where $\max(p_B) > d$), the mutual learning loss function of the migration samples is minimized:
thus, the semantic segmentation domain adaptive objective loss function is defined as:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310258590.7A CN116152666A (en) | 2023-03-17 | 2023-03-17 | Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116152666A (en) | 2023-05-23 |
Family
ID=86360159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310258590.7A Pending CN116152666A (en) | Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity | 2023-03-17 | 2023-03-17 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116152666A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561536A (en) * | 2023-07-11 | 2023-08-08 | 中南大学 | Landslide hidden danger identification method, terminal equipment and medium |
CN116935242A (en) * | 2023-07-24 | 2023-10-24 | 哈尔滨工业大学 | Remote sensing image semantic segmentation method and system based on space and semantic consistency contrast learning |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561536A (en) * | 2023-07-11 | 2023-08-08 | 中南大学 | Landslide hidden danger identification method, terminal equipment and medium |
CN116561536B (en) * | 2023-07-11 | 2023-11-21 | 中南大学 | Landslide hidden danger identification method, terminal equipment and medium |
CN116935242A (en) * | 2023-07-24 | 2023-10-24 | 哈尔滨工业大学 | Remote sensing image semantic segmentation method and system based on space and semantic consistency contrast learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||