CN116152666A - Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity - Google Patents

Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity

Info

Publication number
CN116152666A
Authority
CN
China
Prior art keywords
domain
seg
style
style migration
heterogeneity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310258590.7A
Other languages
Chinese (zh)
Inventor
陶超
王昊
李海峰
孙燕涛
�云杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INNER MONGOLIA AEROSPACE POWER MACHINERY TESTING INSTITUTE
Central South University
Original Assignee
INNER MONGOLIA AEROSPACE POWER MACHINERY TESTING INSTITUTE
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INNER MONGOLIA AEROSPACE POWER MACHINERY TESTING INSTITUTE, Central South University filed Critical INNER MONGOLIA AEROSPACE POWER MACHINERY TESTING INSTITUTE
Priority to CN202310258590.7A priority Critical patent/CN116152666A/en
Publication of CN116152666A publication Critical patent/CN116152666A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a cross-domain remote sensing image adaptive learning method that accounts for the phenological heterogeneity of ground objects. The method comprises the following steps: given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training a style migration network $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground objects; training a semantic segmentation network $M_{seg}$ with the style migration samples to reduce the difference in class feature distributions between the source domain and the target domain; and, at the same time, constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ to improve cross-temporal remote sensing image domain adaptive learning and complete the semantic segmentation task on $X_B$. The method reduces the difference in class feature distributions between temporal domains, improves the domain adaptation performance of the model, and strengthens the information exchange between the style migration process and the semantic segmentation process.

Description

Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity
Technical Field
The invention belongs to the technical field of domain adaptive learning, and in particular relates to a cross-domain remote sensing image adaptive learning method that accounts for the phenological heterogeneity of ground objects.
Background
Remote sensing has become an important means of land-cover classification research by virtue of its wide coverage, high spatio-temporal resolution, and large information content. With global economic integration, the scope of land-cover classification is no longer limited to local areas within a country but has expanded to regional or global scales. Traditional manual interpretation is labor-intensive and slow to update, and cannot meet the requirements of modern map production. In recent years, advances in artificial intelligence techniques such as deep learning have driven the development of intelligent remote sensing image interpretation, and a large number of data-driven, deep-learning-based interpretation methods have emerged.
However, existing data-driven deep learning models for remote sensing images perform well only when the test data and the training data satisfy the independent and identically distributed (i.i.d.) assumption. In real multi-temporal remote sensing image classification, the labeled training data (source domain) and the unlabeled test data (target domain) often come from different data distributions and differ markedly in visual style, so a model trained on the source domain performs poorly on the target domain. To address this problem, much recent work uses deep neural networks to learn mappings between image domains. The motivation is to transfer the style of remote sensing images from one domain to another specific domain so that the generated migration samples are visually closer to samples of that domain, thereby supporting cross-temporal remote sensing image domain adaptive learning. Yang et al., in the non-patent literature "FG-GAN: A Fine-Grained Generative Adversarial Network for Unsupervised SAR-to-Optical Image Translation," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1-11, 2022, doi: 10.1109/TGRS.2022.3165371, propose integrating dense connection modules and residual modules into the generator network and using a multi-scale discriminator network to strengthen the ability of the style migration model to characterize the radiometric properties of remote sensing images. Tasar et al., in the non-patent literature "DAugNet: Unsupervised, Multisource, Multitarget, and Life-Long Domain Adaptation for Semantic Segmentation of Satellite Images," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 2, pp. 1067-1081, Feb. 2021, doi: 10.1109/TGRS.2020.3006161, describe the style of an image by the per-channel statistics of its features, and use adaptive instance normalization to adjust the mean and variance of each channel of the input image features to match the style of the target remote sensing image. Zhang et al., in the non-patent literature "Remote Sensing Image Translation via Style-Based Recalibration Module and Improved Style Discriminator," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1-5, 2022, doi: 10.1109/LGRS.2021.306858, introduce a style-based recalibration module that assigns a learned weight to each channel of the image features according to how important that channel's statistics are for style migration of remote sensing images, so that the style migration network can more quickly capture the style information that deserves attention.
However, current methods that use style migration samples for cross-temporal remote sensing image domain adaptive learning implicitly assume that the visual style of all ground objects in a remote sensing scene changes uniformly. They focus mainly on reducing the radiometric differences introduced by the external imaging process and neglect the phenological differences among ground objects within the scene. Phenological differences affect the visual style of remote sensing images in two main ways. On the one hand, compared with phenology-insensitive ground objects such as artificial surfaces, phenology-sensitive ground objects are especially prone to morphological change driven by the seasonal cycle, such as germination, leaf expansion, leaf discoloration, and leaf fall in plants. On the other hand, the phenological patterns of different ground objects are heterogeneous [3]. For example, forests have a relatively long growing season and their phenological change generally occurs once a year, whereas farmland crops have a short growing season, and farmland that is cropped two or even three times a year undergoes several phenological changes annually. Consequently, when these methods are applied to target remote sensing images of geographic areas with severe spectral mixing and fragmented ecological landscapes, the results struggle to delineate the boundaries of phenology-sensitive ground objects and easily confuse phenology-sensitive ground objects (such as farmland) with phenology-insensitive ones (such as artificial surfaces).
Against this background, methods that use style migration samples for cross-temporal remote sensing image domain adaptive learning still face two main problems: (1) The visual style of a remote sensing scene is affected both by radiometric differences introduced during external imaging and by phenological differences among the ground objects in the scene, yet the style migration samples generated by the prior art cannot reproduce the phenological heterogeneity of ground objects, which hinders domain adaptive classification. (2) The style migration network cannot interact with the semantic segmentation network during feature learning, so the style migration samples pass too little information to the domain adaptive learning process, which limits the domain adaptive learning ability of the model.
Disclosure of Invention
In view of this, the framework proposed by the invention comprises a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$. Given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, the goal is to train $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground objects, to train $M_{seg}$ with these samples so as to reduce the difference in class feature distributions between the source domain and the target domain, and at the same time to construct a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain adaptive learning and completes the semantic segmentation task on $X_B$.
The invention discloses a cross-domain remote sensing image adaptive learning method accounting for the phenological heterogeneity of ground objects, applied to a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$, the method comprising the following steps:
given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground objects;
training $M_{seg}$ with the style migration samples to reduce the difference in class feature distributions between the source domain and the target domain, and at the same time constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain adaptive learning and completes the semantic segmentation task on $X_B$.
Further, given an image feature map
Figure BDA0004130436440000041
and a ground-object segmentation map of the corresponding scale
Figure BDA0004130436440000042
for ground-object class k, a class feature map is obtained:
Figure BDA0004130436440000043
The ground-object class style of the image is represented by the mean and variance of the class features along the channel dimension:
Figure BDA0004130436440000044
where
Figure BDA0004130436440000045
Figure BDA0004130436440000046
Let
Figure BDA0004130436440000047
and
Figure BDA0004130436440000048
denote the ground-object class style parameter sets of the different image domains, where $\mu_k$ is the mean, $\sigma_k$ is the variance, and $\beta_k, \gamma_k$ are the mathematical expectations of $\mu_k, \sigma_k$; assuming that, for ground-object class k, the image domain contains $N_k$ sample pixels in total, then
Figure BDA0004130436440000049
Figure BDA00041304364400000410
A subset of samples is randomly selected to initialize $\beta_k, \gamma_k$ using formula (2), and $\beta_k, \gamma_k$ are then updated gradually with a moving average during model training:
$\beta_k \leftarrow \lambda\beta_k + (1-\lambda)\mu_k(F)$
$\gamma_k \leftarrow \lambda\gamma_k + (1-\lambda)\sigma_k(F)$
where $\lambda$ is the momentum coefficient, set to 0.9999;
the embedded features are regularized class by class through adaptive segmented instance normalization, defined as follows:
Figure BDA00041304364400000411
where
Figure BDA00041304364400000412
denotes the ground-object class style parameter set, and
Figure BDA00041304364400000413
denotes the phenology-sensitivity factor: if class k is a phenology-sensitive ground object, $w_k = 1$; otherwise $w_k = 0$. That is, style regularization is applied only to the features of phenology-sensitive ground-object classes.
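As an illustration only, the class-wise statistics and the moving-average update above can be sketched in plain Python. The function names `class_channel_stats` and `ema_update` and the list-based data layout are hypothetical and do not come from the patent; $\sigma_k$ is computed here as a standard deviation (as is usual in AdaIN-style normalization), which is an assumption, since the text refers to it as the variance.

```python
from statistics import fmean, pstdev

def class_channel_stats(feature, labels, k):
    """Per-channel mean and spread over the pixels labelled k.

    feature: list of C channels, each a flat list of H*W values (assumed layout).
    labels:  flat list of H*W class ids.
    Returns (mu, sigma) as two lists of length C, or None if class k is absent.
    """
    idx = [i for i, y in enumerate(labels) if y == k]
    if not idx:
        return None
    mu = [fmean(ch[i] for i in idx) for ch in feature]
    # Assumption: sigma is a population standard deviation, not a variance.
    sigma = [pstdev([ch[i] for i in idx]) for ch in feature]
    return mu, sigma

def ema_update(beta, gamma, mu, sigma, lam=0.9999):
    """beta <- lam*beta + (1-lam)*mu and gamma <- lam*gamma + (1-lam)*sigma."""
    new_beta = [lam * b + (1 - lam) * m for b, m in zip(beta, mu)]
    new_gamma = [lam * g + (1 - lam) * s for g, s in zip(gamma, sigma)]
    return new_beta, new_gamma
```

With $\lambda = 0.9999$ each batch shifts the running parameters by only 0.01% of the gap to the batch statistics, so $\beta_k, \gamma_k$ approximate the domain-wide expectations without a pass over all $N_k$ pixels.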
Further, the style migration process that accounts for the phenological heterogeneity of ground objects comprises the following steps:
samples $X_A$ and $X_B$ from the different domain spaces are passed through a domain encoder to obtain embedded features $F_A$ and $F_B$; $F_A$ and its segmentation map $Y_A$ are combined with the ground-object class style parameters
Figure BDA0004130436440000051
and $F_B$ and its pseudo-segmentation map
Figure BDA0004130436440000052
are combined with
Figure BDA0004130436440000053
in the same way;
style regularization is applied to the features of the phenology-sensitive ground-object classes through the class-by-class regularization constraint to obtain $F_{AB}$ and $F_{BA}$, and the style migration samples $X_{AB}$ and $X_{BA}$ are then obtained through a domain decoder;
the style migration network is trained by adversarial learning: $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$; each discriminator adopts a semantic segmentation network structure, so that it both discriminates real samples from style migration samples and correctly classifies the ground objects in real samples.
Further, the adversarial loss function of the discriminator is defined as:
Figure BDA0004130436440000054
Figure BDA0004130436440000055
where $y_A$ denotes the real segmentation map of the source-domain sample $x_A$, and
Figure BDA0004130436440000056
denotes the pseudo-segmentation map predicted by $M_{seg}$ (initially the source-domain model) for the target-domain sample $x_B$;
correspondingly, the adversarial loss function of the generator is defined as:
Figure BDA0004130436440000057
Figure BDA0004130436440000058
Figure BDA0004130436440000059
by combining $X_{AB}$ with
Figure BDA00041304364400000510
and $X_{BA}$ with
Figure BDA00041304364400000511
the cross style migration samples $X_{ABA}$ and $X_{BAB}$ are generated, and the cross-reconstruction consistency loss is minimized:
Figure BDA00041304364400000512
at the same time, the reconstructed samples $X_{AA}$ and $X_{BB}$, generated by combining $X_A$ with
Figure BDA00041304364400000513
and $X_B$ with
Figure BDA00041304364400000514
should also stay consistent with the original samples, so the self-reconstruction consistency loss is minimized:
Figure BDA0004130436440000061
the final objective loss function of the generator is defined as:
Figure BDA0004130436440000062
Further, the bidirectional model optimization in the bidirectional optimization learning mechanism comprises two directions: ($M_{seg} \rightarrow M_{st}$) and ($M_{st} \rightarrow M_{seg}$);
Figure BDA0004130436440000063
denotes the m-th round of bidirectional model optimization,
Figure BDA0004130436440000064
denotes the initial model, which is the source-domain model, and
Figure BDA0004130436440000065
and
Figure BDA0004130436440000066
are, respectively, a target-domain pseudo-segmentation map and the model that generates it;
the ($M_{seg} \rightarrow M_{st}$) optimization direction uses the target-domain pseudo-labels predicted by the semantic segmentation network to train the style migration network that accounts for the phenological heterogeneity of ground objects;
given the prediction of $M_{seg}$ on the target domain, $p_B = M_{seg}(x_B)$, a confidence threshold $d$ is set to filter $p_B$, and the high-confidence predictions are selected as pseudo-labels for training $M_{st}$;
for a target pixel
Figure BDA0004130436440000067
its pseudo-label
Figure BDA0004130436440000068
is expressed as:
Figure BDA0004130436440000069
the ($M_{st} \rightarrow M_{seg}$) optimization direction uses the style migration samples to optimize the semantic segmentation network; $M_{seg}$ is first trained with the source-domain data and its real segmentation maps:
Figure BDA00041304364400000610
then, given the style migration result of the trained $M_{st}$ for a target-domain sample
Figure BDA00041304364400000611
the predictions $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$ are obtained;
the predictions of $M_{seg}$ for a target-domain sample and for its style migration result should remain consistent, so the prediction consistency loss function is minimized:
Figure BDA00041304364400000612
at the same time, for the high-confidence region of $p_B$, where $\max(p_B) > d$, the mutual learning loss function of the migration samples is minimized:
Figure BDA0004130436440000071
thus, the objective loss function for semantic segmentation domain adaptation is defined as:
Figure BDA0004130436440000072
The beneficial effects of the invention are as follows:
(1) Starting from the pattern of phenological heterogeneity among ground objects, the method designs an adaptive segmented instance normalization module (AdaSIN) that regularizes the embedded features class by class, so that the style migration network can generate style migration samples that account for the phenological heterogeneity of ground objects. Compared with style migration samples generated by conventional methods, these samples are more effective at reducing the difference in class feature distributions between temporal domains, help improve the domain adaptation performance of the model, and can be extended to other tasks involving multi-temporal data, such as cross-domain scene classification, cross-domain semantic segmentation, and change detection.
(2) By designing a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network, the method strengthens the information exchange between the style migration process and the semantic segmentation process and further improves the domain adaptive learning ability of the model.
Drawings
FIG. 1 is a schematic diagram of the style migration process accounting for the phenological heterogeneity of ground objects, in which rectangles and triangles represent phenology-sensitive ground-object classes and circles represent phenology-insensitive ground-object classes;
FIG. 2 is the pseudocode of the bidirectional optimization algorithm for the style migration network and the semantic segmentation network;
FIG. 3 shows the semantic segmentation maps produced by different domain adaptive learning methods;
FIG. 4 is a graph of a sensitivity analysis of confidence threshold parameters.
Detailed Description
The invention is further described below with reference to the accompanying drawings, without limiting the invention in any way, and any alterations or substitutions based on the teachings of the invention are intended to fall within the scope of the invention.
The framework proposed by the invention comprises a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$. Given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, the object of the invention is to train $M_{st}$ to generate style migration samples that account for the phenological heterogeneity of ground objects, to train $M_{seg}$ with these samples so as to reduce the difference in class feature distributions between the source domain and the target domain, and at the same time to construct a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ that improves cross-temporal remote sensing image domain adaptive learning and completes the semantic segmentation task on $X_B$. The following first describes how to generate style migration samples that account for the phenological heterogeneity of ground objects, and then describes the bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$ in detail.
(1) Style migration accounting for the phenological heterogeneity of ground objects
Given an image feature map
Figure BDA0004130436440000081
and a ground-object segmentation map of the corresponding scale
Figure BDA0004130436440000082
and starting from the pattern of phenological heterogeneity among ground objects, different ground-object classes in the image scene should have different style characteristics. Thus, for ground-object class k, a class feature map is obtained:
Figure BDA0004130436440000083
The invention represents the ground-object class style of the image by the mean and variance of the class features along the channel dimension:
Figure BDA0004130436440000084
where
Figure BDA0004130436440000085
Figure BDA0004130436440000086
The invention uses
Figure BDA0004130436440000087
and
Figure BDA0004130436440000088
to denote the ground-object class style parameter sets of the different image domains, where $\beta_k, \gamma_k$ are the mathematical expectations of $\mu_k, \sigma_k$. Assuming that, for ground-object class k, the image domain contains $N_k$ sample pixels in total, then
Figure BDA0004130436440000089
Figure BDA0004130436440000091
Computing these expectations directly consumes too many computational resources and is unfavorable for model training. The invention therefore randomly selects a subset of samples to initialize $\beta_k, \gamma_k$ using formula (2), and then updates $\beta_k, \gamma_k$ gradually with a moving average during model training:
$\beta_k \leftarrow \lambda\beta_k + (1-\lambda)\mu_k(F)$
$\gamma_k \leftarrow \lambda\gamma_k + (1-\lambda)\sigma_k(F)$ (3)
where $\lambda$ is the momentum coefficient, set to 0.9999. The invention can then regularize the embedded features class by class through adaptive segmented instance normalization (Adaptive Segmented Instance Normalization, AdaSIN), defined as follows:
Figure BDA0004130436440000092
where
Figure BDA0004130436440000093
denotes the ground-object class style parameter set, and
Figure BDA0004130436440000094
denotes the phenology-sensitivity factor: if class k is a phenology-sensitive ground object, $w_k = 1$; otherwise $w_k = 0$. That is, style regularization is applied only to the features of phenology-sensitive ground-object classes.
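Because formula (4) survives here only as an image reference, the following is a minimal sketch of one plausible reading of AdaSIN, not the patented definition. It assumes an AdaIN-like form applied per class: pixels of a phenology-sensitive class k are normalized with their own class statistics and then rescaled with the other domain's $(\beta_k, \gamma_k)$, while pixels of insensitive classes pass through unchanged. The function name `adasin`, the argument layout, and the `eps` stabilizer are hypothetical.

```python
from statistics import fmean, pstdev

def adasin(feature, labels, target_style, sensitive, eps=1e-5):
    """Re-style class-k pixels with target-domain (beta_k, gamma_k) when w_k = 1.

    feature:      list of C channels, each a flat list of H*W values (assumed layout).
    labels:       flat list of H*W class ids (segmentation or pseudo-segmentation map).
    target_style: dict k -> (beta_k, gamma_k), each a list of length C.
    sensitive:    dict k -> w_k in {0, 1}; classes missing from the dict count as 0.
    """
    out = [list(ch) for ch in feature]  # insensitive pixels are copied unchanged
    for k in set(labels):
        if sensitive.get(k, 0) != 1 or k not in target_style:
            continue
        idx = [i for i, y in enumerate(labels) if y == k]
        beta, gamma = target_style[k]
        for c, ch in enumerate(feature):
            vals = [ch[i] for i in idx]
            mu, sigma = fmean(vals), pstdev(vals)
            for i in idx:
                # Assumed AdaIN-style form: normalize, then apply the target style.
                out[c][i] = gamma[c] * (ch[i] - mu) / (sigma + eps) + beta[c]
    return out
```

Under this reading, setting $w_k = 0$ for a class such as artificial surface leaves its features untouched, so only phenology-sensitive classes such as cultivated land take on the style of the other temporal domain.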
The style migration process that accounts for the phenological heterogeneity of ground objects is as follows. As shown in FIG. 1, samples $X_A$ and $X_B$ from the different domain spaces are passed through a domain encoder to obtain embedded features $F_A$ and $F_B$; $F_A$ and its segmentation map $Y_A$ are combined with the ground-object class style parameters
Figure BDA0004130436440000095
and $F_B$ and its pseudo-segmentation map
Figure BDA0004130436440000096
are combined with
Figure BDA0004130436440000097
in the same way. Style regularization is then applied to the features of the phenology-sensitive ground-object classes through formula (4) to obtain $F_{AB}$ and $F_{BA}$, and the style migration samples $X_{AB}$ and $X_{BA}$ are obtained through a domain decoder. To make the ground-object class feature distributions of $X_A$ and the style migration sample $X_{BA}$, and of $X_B$ and the style migration sample $X_{AB}$, as close as possible, the invention trains the style migration network by adversarial learning. $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$. Each discriminator adopts a semantic segmentation network structure, so that it must both discriminate real samples from style migration samples and correctly classify the ground objects in real samples. Thus, the adversarial loss function of the discriminator is defined as:
Figure BDA0004130436440000101
Figure BDA0004130436440000102
where $y_A$ denotes the real segmentation map of the source-domain sample $x_A$, and
Figure BDA0004130436440000103
denotes the pseudo-segmentation map predicted by $M_{seg}$ (initially the source-domain model) for the target-domain sample $x_B$. Correspondingly, the adversarial loss function of the generator is defined as:
Figure BDA0004130436440000104
Figure BDA0004130436440000105
Figure BDA0004130436440000106
To ensure that $X_A$ and $X_{AB}$, and $X_B$ and $X_{BA}$, remain semantically consistent after style migration, the invention first combines $X_{AB}$ with
Figure BDA0004130436440000107
and $X_{BA}$ with
Figure BDA0004130436440000108
to generate the cross style migration samples $X_{ABA}$ and $X_{BAB}$. The invention therefore minimizes the cross-reconstruction consistency loss:
Figure BDA0004130436440000109
Meanwhile, the reconstructed samples $X_{AA}$ and $X_{BB}$, generated by combining $X_A$ with
Figure BDA00041304364400001010
and $X_B$ with
Figure BDA00041304364400001011
should also stay consistent with the original samples, so the self-reconstruction consistency loss is minimized:
Figure BDA00041304364400001012
Thus, the final objective loss function of the generator is defined as:
Figure BDA00041304364400001013
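The two reconstruction terms described above can be illustrated with a short sketch. The exact distance used in the patent's formulas is not recoverable from the image references, so a mean absolute (L1) difference is assumed here, as is common for cycle-style reconstruction losses; the function names and the flat-list image layout are hypothetical.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of values")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def reconstruction_losses(x_a, x_b, x_aba, x_bab, x_aa, x_bb):
    """Return (cross_reconstruction, self_reconstruction) under the L1 assumption.

    x_aba, x_bab: cross style migration samples (A -> B -> A and B -> A -> B).
    x_aa, x_bb:   samples re-decoded with their own domain's style parameters.
    """
    cross = l1_distance(x_aba, x_a) + l1_distance(x_bab, x_b)
    self_rec = l1_distance(x_aa, x_a) + l1_distance(x_bb, x_b)
    return cross, self_rec
```

Both terms are zero only when a sample survives the round trip unchanged, which is what keeps the migrated image semantically aligned with its segmentation map.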
(2) Bidirectional optimization learning mechanism
Bidirectional model optimization comprises two directions: ($M_{seg} \rightarrow M_{st}$) and ($M_{st} \rightarrow M_{seg}$).
Figure BDA00041304364400001014
Figure BDA00041304364400001015
denotes the m-th round of bidirectional model optimization; the detailed optimization procedure is shown in FIG. 2.
Figure BDA00041304364400001016
denotes the initial model, which is the source-domain model, and
Figure BDA00041304364400001017
and
Figure BDA00041304364400001018
are, respectively, a target-domain pseudo-segmentation map and the model that generates it.
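The alternation between the two directions can be sketched as a simple loop. This is a structural illustration under stated assumptions and not a transcription of the pseudocode in FIG. 2: the three callables `predict_pseudo`, `train_st`, and `train_seg` are hypothetical placeholders supplied by the caller, and the ordering within a round (pseudo-labels first, then $M_{st}$, then $M_{seg}$) is assumed from the description.

```python
def bidirectional_optimization(m_seg, m_st, rounds, predict_pseudo, train_st, train_seg):
    """Alternate (M_seg -> M_st) and (M_st -> M_seg) for `rounds` iterations.

    predict_pseudo(m_seg)  -> pseudo-segmentation maps for the target domain
    train_st(m_st, pseudo) -> updated style migration network
    train_seg(m_seg, m_st) -> updated semantic segmentation network
    All three callables are hypothetical placeholders, not patent-defined APIs.
    """
    history = []
    for m in range(1, rounds + 1):
        pseudo = predict_pseudo(m_seg)   # previous-round M_seg labels the target domain
        m_st = train_st(m_st, pseudo)    # direction (M_seg -> M_st)
        m_seg = train_seg(m_seg, m_st)   # direction (M_st -> M_seg)
        history.append((m, pseudo))
    return m_seg, m_st, history
```

In the first round `m_seg` would be the source-domain model, so the first set of pseudo-labels reflects source-only training; later rounds reuse the segmentation network refined in the previous round.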
The ($M_{seg} \rightarrow M_{st}$) optimization direction uses the target-domain pseudo-labels predicted by the semantic segmentation network to train the style migration network that accounts for the phenological heterogeneity of ground objects. Given the prediction of $M_{seg}$ on the target domain, $p_B = M_{seg}(x_B)$, the invention sets a confidence threshold $d$ to filter $p_B$, and the high-confidence predictions are selected as pseudo-labels for training $M_{st}$. For a target pixel
Figure BDA0004130436440000111
its pseudo-label
Figure BDA0004130436440000112
is expressed as:
Figure BDA0004130436440000113
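The confidence filtering described above can be sketched as follows. The pseudo-label formula itself is available here only as an image reference, so the sketch assumes the common convention that a pixel takes its arg-max class when the maximum probability exceeds $d$ and is otherwise marked with an ignore id; the `IGNORE` value of 255 and the function name are hypothetical.

```python
IGNORE = 255  # hypothetical "no label" id for low-confidence pixels

def pseudo_labels(probs, d):
    """Keep the arg-max class where the confidence exceeds d, else IGNORE.

    probs: list of per-pixel class-probability lists (the prediction p_B).
    d:     confidence threshold.
    """
    labels = []
    for p in probs:
        conf = max(p)
        labels.append(p.index(conf) if conf > d else IGNORE)
    return labels
```

A higher $d$ yields fewer but cleaner pseudo-labels for training $M_{st}$; the sensitivity of the results to this threshold is the subject of FIG. 4.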
The ($M_{st} \rightarrow M_{seg}$) optimization direction uses the style migration samples to optimize the semantic segmentation network. The invention first trains $M_{seg}$ with the source-domain data and its real segmentation maps:
Figure BDA0004130436440000114
Then, given the style migration result of the trained $M_{st}$ for a target-domain sample
Figure BDA0004130436440000115
the invention obtains the predictions $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$. The predictions of $M_{seg}$ for a target-domain sample and for its style migration result should remain consistent, so the invention minimizes the prediction consistency loss function:
Figure BDA0004130436440000116
At the same time, for the high-confidence region of $p_B$, where $\max(p_B) > d$, the invention minimizes the mutual learning loss function of the migration samples:
Figure BDA0004130436440000117
Thus, the objective loss function for semantic segmentation domain adaptation is defined as:
Figure BDA0004130436440000118
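The two target-domain terms can be illustrated with a small sketch. Their exact forms are not recoverable from the image references, so two assumptions are made and should be read as such: the prediction consistency term is taken as the mean absolute difference between $p_B$ and $p_{BA}$, and the mutual learning term as the cross-entropy of $p_{BA}$ against the arg-max class of $p_B$, restricted to pixels where $\max(p_B) > d$. Function names and the list-based layout are hypothetical.

```python
import math

def prediction_consistency(p_b, p_ba):
    """Mean absolute difference between two per-pixel probability maps (assumed form)."""
    total = sum(abs(u - v) for pb, pba in zip(p_b, p_ba) for u, v in zip(pb, pba))
    return total / (len(p_b) * len(p_b[0]))

def mutual_learning(p_b, p_ba, d, eps=1e-12):
    """Cross-entropy of p_ba against argmax(p_b) on pixels with max(p_b) > d (assumed form)."""
    losses = []
    for pb, pba in zip(p_b, p_ba):
        conf = max(pb)
        if conf > d:
            losses.append(-math.log(pba[pb.index(conf)] + eps))
    # No confident pixel means no supervision signal from this term.
    return sum(losses) / len(losses) if losses else 0.0
```

Restricting the second term to confident pixels keeps unreliable target predictions from being reinforced on the migrated sample.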
To verify the effectiveness of the method, cross-temporal semantic segmentation comparison experiments were carried out on the same dataset against two typical domain adaptive learning techniques: a self-training-based domain adaptive learning method (CBST) and a domain adaptive learning method that uses style migration samples (DAugNet). Overall accuracy (OA), the Kappa coefficient (Kappa), and frequency-weighted intersection over union (FWIoU) are used as overall classification metrics, and intersection over union (IoU) is used as the per-class metric. In terms of the quantitative metrics (see Table 1), relative to the baseline, CBST yields the smallest improvement in cross-temporal semantic segmentation and the method of the invention yields the largest. Compared with DAugNet, which uses conventional style migration samples, the method of the invention performs domain adaptive learning with style migration samples that account for the phenological heterogeneity of ground objects and improves the overall metrics OA, Kappa, and FWIoU by 3.58%, 5.35%, and 5.71%, respectively. In terms of the qualitative results (see FIG. 3), conventional domain adaptive learning methods tend to confuse cultivated land with water. The method of the invention reduces the difference in class feature distributions between temporal domains by using style migration samples that account for phenological heterogeneity, and builds a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network, so the confusion between cultivated land and water is markedly reduced. This shows that the method has a clear advantage in distinguishing phenology-sensitive ground objects (cultivated land) from phenology-insensitive ones (water).
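For reference, the four metrics named above can all be computed from a single confusion matrix using their standard definitions. The sketch below is illustrative and is not the evaluation code used for Table 1; the function name and the nested-list confusion-matrix layout are assumptions.

```python
def segmentation_metrics(cm):
    """Compute OA, Kappa, per-class IoU and FWIoU from a confusion matrix.

    cm[i][j] = number of pixels with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sum = [sum(cm[i]) for i in range(n)]                       # true pixels per class
    col_sum = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted pixels per class
    diag = [cm[i][i] for i in range(n)]

    oa = sum(diag) / total
    pe = sum(r * c for r, c in zip(row_sum, col_sum)) / total ** 2  # chance agreement
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    iou = [d / (r + c - d) if (r + c - d) else 0.0
           for d, r, c in zip(diag, row_sum, col_sum)]
    # FWIoU weights each class IoU by that class's share of the true pixels.
    fwiou = sum(r / total * u for r, u in zip(row_sum, iou))
    return {"OA": oa, "Kappa": kappa, "IoU": iou, "FWIoU": fwiou}
```

Because FWIoU weights classes by pixel frequency, it is dominated by large classes, which is why the per-class IoU is reported alongside it.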
Table 1. Results of the semantic segmentation comparison experiments (%)
Figure BDA0004130436440000121
To verify the feasibility of the method, a set of multi-temporal remote sensing images of the Xiangtan city area in Hunan Province, China, was selected for the experiments. The data were acquired by the GF-2 sensor at a resolution of 2 m; the source-domain dataset was acquired in 2018 and the target-domain dataset in 2019. The source and target domains each contain 4232 remote sensing images (512 × 512 pixels) labeled with six ground-object classes: null, cultivated land, woodland, grassland, water, and artificial surfaces. The source-domain images and labels, together with the target-domain images, are used to train the domain adaptive model, and the target-domain labels are used to test it. The method of the invention is compared with two typical domain adaptive learning methods; the experimental results are shown in Fig. 3 and Table 1, and the method of the invention shows stronger domain adaptive learning capability. In addition, three questions are discussed here: (1) the contribution of style migration samples that account for the weather heterogeneity of ground features to cross-temporal remote sensing image semantic segmentation; (2) the effect of the bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network; (3) the influence of the confidence threshold on model optimization during the generation of target-domain pseudo-labels.
Table 2. Ablation results (%) for each component of the method of the invention, where ST denotes pseudo-label self-learning, AdaIN denotes adaptive instance normalization, AdaSIN denotes the adaptive segmented instance normalization proposed by the invention, and m denotes the number of bidirectional optimization learning rounds
Figure BDA0004130436440000131
First, the ablation results (see Table 2) show that, compared with pseudo-label self-learning (ST) alone, introducing style migration samples into the domain adaptive learning process gives a more pronounced improvement. However, the conventional style migration method generates its samples with adaptive instance normalization (AdaIN), which only reduces the radiometric difference between images and ignores the weather heterogeneity of ground features, so the domain adaptive learning capability of the model is limited. Starting from the weather heterogeneity of ground features, the method of the invention designs adaptive segmented instance normalization (AdaSIN) to impose class-wise regularization constraints on the embedded features, so that the style migration network can generate style migration samples that account for this heterogeneity. Compared with AdaIN, the proposed AdaSIN improves the domain adaptive learning capability of the model, and all metrics of the experimental results improve noticeably.
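The class-wise regularization described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation of the invention: it assumes that AdaSIN follows the AdaIN form (each class's features are normalized with their own per-channel statistics and then rescaled with class style parameters β_k and γ_k), that σ is used as a standard deviation, and that classes with weather-sensitive factor w_k = 0 pass through unchanged. The names `class_stats`, `ema_update`, `adasin`, and the `eps` stabilizer are hypothetical.

```python
import numpy as np


def class_stats(feat, seg, k, eps=1e-5):
    """Per-channel mean and std of feat (C, H, W) over pixels where seg (H, W) == k."""
    mask = seg == k
    if not mask.any():
        return None
    pixels = feat[:, mask]                      # shape (C, N_k)
    return pixels.mean(axis=1), pixels.std(axis=1) + eps


def ema_update(beta, gamma, mu, sigma, lam=0.9999):
    """Moving-average update of the class style parameters with momentum lam."""
    return lam * beta + (1.0 - lam) * mu, lam * gamma + (1.0 - lam) * sigma


def adasin(feat, seg, style, sensitive, eps=1e-5):
    """Assumed AdaSIN form: restyle only the weather-sensitive classes.

    style maps class id k -> (beta_k, gamma_k), each of shape (C,).
    sensitive maps class id k -> bool, playing the role of w_k.
    """
    out = feat.copy()
    for k, (beta, gamma) in style.items():
        if not sensitive.get(k, False):         # w_k = 0: leave class k untouched
            continue
        stats = class_stats(feat, seg, k, eps)
        if stats is None:                       # class k absent from this image
            continue
        mu, sigma = stats
        mask = seg == k
        normalized = (feat[:, mask] - mu[:, None]) / sigma[:, None]
        out[:, mask] = gamma[:, None] * normalized + beta[:, None]
    return out
```

Restricting the restyling to the masked pixels of each sensitive class is what distinguishes this sketch from plain AdaIN, which would apply a single set of statistics to the whole feature map.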
Second, the method of the invention feeds the output of the semantic segmentation network into the style migration network as pseudo-label information, building a bidirectional optimization learning mechanism between the two networks. Table 2 shows that when the number of bidirectional optimization rounds m > 1, each additional round further improves the experimental results. This indicates that the bidirectional optimization learning mechanism strengthens the information exchange between the style migration process and the domain adaptive learning process, and thereby further improves the domain adaptive learning capability of the model.
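The alternation between the two optimization directions can be sketched as a simple loop. The sketch below only shows the control flow; the callables `train_style` and `train_seg`, their signatures, and the function name `bidirectional_optimization` are hypothetical placeholders for the actual training procedures, which are not reproduced here.

```python
def bidirectional_optimization(seg_model, train_style, train_seg,
                               x_a, y_a, x_b, rounds):
    """Alternate the two optimization directions for a given number of rounds.

    seg_model   : callable mapping target images to pseudo-labels
                  (initially the source-domain model).
    train_style : hypothetical callable (x_a, y_a, x_b, pseudo_b) -> style model.
    train_seg   : hypothetical callable (style_model, x_a, y_a, x_b) -> seg model.
    """
    style_model = None
    for _ in range(rounds):
        # Direction M_seg -> M_st: the current segmentation model predicts
        # target-domain pseudo-labels, which guide style migration training.
        pseudo_b = seg_model(x_b)
        style_model = train_style(x_a, y_a, x_b, pseudo_b)

        # Direction M_st -> M_seg: the style migration samples produced by the
        # updated style model are used to optimize the segmentation model.
        seg_model = train_seg(style_model, x_a, y_a, x_b)
    return seg_model, style_model
```

Each round consumes the segmentation model from the previous round, so better pseudo-labels can yield better style migration samples, which in turn can yield a better segmentation model.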
Finally, to study the influence of the confidence threshold d on model optimization during pseudo-label generation, a series of model optimization experiments was carried out with d set within the range [0.4, 0.8]. As shown in Fig. 4, the model achieves its best performance when d is set to 0.7. When d is too small (for example, less than 0.5), the pseudo-labels contain more erroneous information and the model performs worse. When d is between 0.6 and 0.7, its effect on model optimization is not significant. When d is too large (for example, greater than 0.7), model performance degrades slightly, because a larger d reduces the number of pseudo-labels generated and thus limits model retraining.
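The role of the confidence threshold d can be illustrated with a short sketch of pseudo-label screening. It assumes that the pseudo-label of a pixel is the class with the highest predicted probability when that probability exceeds d, and an ignore value otherwise; the function name `pseudo_labels` and the ignore value 255 are assumptions made for illustration, since the exact expression used by the method is not reproduced in the text.

```python
import numpy as np

IGNORE = 255  # assumed marker for pixels excluded from training


def pseudo_labels(prob, d=0.7, ignore=IGNORE):
    """Screen class probabilities prob (K, H, W) with confidence threshold d.

    Returns an (H, W) integer map holding the arg-max class where the maximum
    probability exceeds d, and `ignore` elsewhere.
    """
    confidence = prob.max(axis=0)
    label = prob.argmax(axis=0)
    return np.where(confidence > d, label, ignore)


def kept_fraction(labels, ignore=IGNORE):
    """Fraction of pixels that received a pseudo-label."""
    return float((labels != ignore).mean())
```

Raising d lowers `kept_fraction`, which matches the trade-off discussed above: a small d admits more erroneous labels, while a large d leaves fewer pseudo-labels for retraining.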
The beneficial effects of the invention are as follows:
(1) Starting from the weather heterogeneity of ground features, the method of the invention designs an adaptive segmented instance normalization module (AdaSIN) to impose class-wise regularization constraints on the embedded features, so that the style migration network can generate style migration samples that account for this heterogeneity. Compared with style migration samples generated by conventional methods, these samples are more effective at reducing the class feature distribution difference between different temporal domains, which helps improve the domain adaptive learning performance of the model. The approach can also be extended to other tasks with multi-temporal data, such as cross-domain scene classification, cross-domain semantic segmentation, and change detection.
(2) By designing a bidirectional optimization learning mechanism between the style migration network and the semantic segmentation network, the method of the invention strengthens the information exchange between the style migration process and the semantic segmentation process, and thereby further improves the domain adaptive learning capability of the model.
The word "preferred" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word "preferred" is intended to present concepts in a concrete fashion. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from the context, "X uses A or B" is intended to mean any of the natural inclusive permutations. That is, if X uses A; X uses B; or X uses both A and B, then "X uses A or B" is satisfied under any of the foregoing instances.
Moreover, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The present disclosure includes all such modifications and alterations and is limited only by the scope of the following claims. In particular regard to the various functions performed by the above described components (e.g., elements, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated exemplary implementations of the disclosure. Furthermore, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Moreover, to the extent that the terms "includes," "has," "contains," or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising."
The functional units in the embodiment of the invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or as software functional modules. If implemented as software functional modules and sold or used as a stand-alone product, the integrated modules may also be stored in a computer-readable storage medium. The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. The above-mentioned devices or systems may perform the storage methods in the corresponding method embodiments.
In summary, the foregoing embodiment is one implementation of the present invention, but implementations of the present invention are not limited to this embodiment. Any other change, modification, substitution, combination, or simplification made within the spirit and principles of the present invention shall be regarded as an equivalent replacement and falls within the protection scope of the present invention.

Claims (5)

1. A cross-domain remote sensing image adaptive learning method considering the weather heterogeneity of ground features, characterized in that the method is applied to a style migration network $M_{st}$ and a semantic segmentation network $M_{seg}$ and comprises the following steps:
given a source-phase dataset $X_A$ with segmentation labels $Y_A$ and an unlabeled target-phase dataset $X_B$, training $M_{st}$ to generate style migration samples that account for the weather heterogeneity of ground features;
training $M_{seg}$ using the style migration samples to reduce the class feature distribution difference between the source domain and the target domain, while constructing a bidirectional optimization learning mechanism between $M_{st}$ and $M_{seg}$, so as to improve the cross-temporal remote sensing image domain adaptive learning capability and complete the semantic segmentation task on $X_B$;
the bidirectional model optimization in the bidirectional optimization learning mechanism comprises two directions: $(M_{seg} \rightarrow M_{st})$ and $(M_{st} \rightarrow M_{seg})$;
Figure FDA0004130436430000011
denotes the m-th round of bidirectional model optimization,
Figure FDA0004130436430000012
denotes the initial model, which is the source-domain model, and
Figure FDA0004130436430000013
is the target-domain pseudo-segmentation map generated by
Figure FDA0004130436430000014
; the $(M_{seg} \rightarrow M_{st})$ optimization direction denotes training the style migration network that accounts for the weather heterogeneity of ground features using the target-domain pseudo-labels predicted by the semantic segmentation network, and the $(M_{st} \rightarrow M_{seg})$ optimization direction denotes optimizing the semantic segmentation network using the style migration samples.
2. The cross-domain remote sensing image adaptive learning method considering the weather heterogeneity of ground features according to claim 1, wherein, given an image feature map
Figure FDA0004130436430000015
and the ground-object segmentation map of the corresponding scale
Figure FDA0004130436430000016
, for ground-object class k, a class feature map
Figure FDA0004130436430000017
is obtained, and the ground-object class style of the image is represented by the mean and variance of the class features along the channel dimension:
Figure FDA0004130436430000018
Figure FDA0004130436430000019
wherein
Figure FDA00041304364300000110
Figure FDA00041304364300000111
and
Figure FDA00041304364300000112
and
Figure FDA00041304364300000113
denote the ground-object class style parameter sets of the different image domains, respectively, where $\mu_k$ denotes the mean, $\sigma_k$ denotes the variance, and $\beta_k$, $\gamma_k$ denote the mathematical expectations of $\mu_k$, $\sigma_k$; assuming that, for ground-object class k, the image domain contains $N_k$ sampled pixels in total, then
Figure FDA0004130436430000021
Figure FDA0004130436430000022
a subset of the samples is randomly selected and used to initialize $\beta_k$ and $\gamma_k$, which are then gradually updated by a moving average during model training:
$$\beta_k \leftarrow \lambda\beta_k + (1-\lambda)\,\mu_k(F)$$
$$\gamma_k \leftarrow \lambda\gamma_k + (1-\lambda)\,\sigma_k(F)$$
where $\lambda$ is the momentum coefficient, set to 0.9999;
class-wise regularization constraints are applied to the embedded features through adaptive segmented instance normalization, defined as:
Figure FDA0004130436430000023
wherein
Figure FDA0004130436430000024
denotes the ground-object class style parameter set, and
Figure FDA0004130436430000025
denotes the weather-sensitive factor: if class k is a weather-sensitive ground object, $w_k = 1$; otherwise $w_k = 0$; that is, style regularization is applied only to the features of weather-sensitive ground-object classes.
3. The cross-domain remote sensing image adaptive learning method considering the weather heterogeneity of ground features according to claim 2, wherein the style migration process that accounts for the weather heterogeneity of ground features comprises:
samples $X_A$ and $X_B$ from the different domain spaces pass through a domain encoder to obtain embedded features $F_A$ and $F_B$; $F_A$ and its segmentation map $Y_A$ are combined with the ground-object class style parameters
Figure FDA0004130436430000026
, and $F_B$ and its pseudo-segmentation map
Figure FDA0004130436430000027
are combined with
Figure FDA0004130436430000028
;
style regularization is applied to the features of the weather-sensitive ground-object classes through the class-wise regularization constraints to obtain $F_{AB}$ and $F_{BA}$, and style migration samples $X_{AB}$ and $X_{BA}$ are then obtained through a domain decoder;
the style migration network is trained by adversarial learning: $G_A$ and $G_B$ are defined as the generators for image domains $X_A$ and $X_B$, and $D_A$ and $D_B$ as the discriminators for image domains $X_A$ and $X_B$; the discriminators adopt a semantic segmentation network structure, so that they not only discriminate real samples from style migration samples but also correctly classify the ground objects in real samples.
4. The cross-domain remote sensing image adaptive learning method considering the weather heterogeneity of ground features according to claim 3, wherein the adversarial loss function of the discriminator is defined as:
Figure FDA0004130436430000031
Figure FDA0004130436430000032
Figure FDA0004130436430000033
wherein $y_A$ denotes the real segmentation map of the source-domain sample $x_A$, and
Figure FDA0004130436430000034
denotes the pseudo-segmentation map predicted by $M_{seg}$ for the target-domain sample $x_B$;
correspondingly, the adversarial loss function of the generator is defined as:
Figure FDA0004130436430000035
Figure FDA0004130436430000036
Figure FDA0004130436430000037
by combining $X_{AB}$ with
Figure FDA0004130436430000038
and $X_{BA}$ with
Figure FDA0004130436430000039
, cross style migration samples $X_{ABA}$ and $X_{BAB}$ are generated, and the cross-reconstruction consistency loss is minimized:
Figure FDA00041304364300000310
meanwhile, by combining $X_A$ with
Figure FDA00041304364300000311
and $X_B$ with
Figure FDA00041304364300000312
, reconstructed samples $X_{AA}$ and $X_{BB}$ are generated, which should also remain consistent with the original samples, so the self-reconstruction consistency loss is minimized:
Figure FDA00041304364300000313
the final objective loss function of the generator is defined as:
Figure FDA00041304364300000314
5. The cross-domain remote sensing image adaptive learning method considering the weather heterogeneity of ground features according to claim 4, wherein, for the $(M_{seg} \rightarrow M_{st})$ optimization direction:
given the prediction $p_B = M_{seg}(x_B)$ of $M_{seg}$ on the target domain, a confidence threshold $d$ is set to screen $p_B$, and the high-confidence predictions are selected as pseudo-labels for training $M_{st}$;
for a target pixel
Figure FDA0004130436430000041
, its pseudo-label
Figure FDA0004130436430000042
is expressed as:
Figure FDA0004130436430000043
for the $(M_{st} \rightarrow M_{seg})$ optimization direction:
$M_{seg}$ is trained using the source-domain data and their real segmentation maps:
Figure FDA0004130436430000044
given the style migration result of the trained $M_{st}$ for the target-domain sample
Figure FDA0004130436430000045
, $p_B = M_{seg}(x_B)$ and $p_{BA} = M_{seg}(x_{BA})$ are obtained;
the predictions of $M_{seg}$ for a target-domain sample and for its style migration result should remain consistent, so the prediction consistency loss function is minimized:
Figure FDA0004130436430000046
meanwhile, for the high-confidence region of $p_B$, where $\max(p_B) > d$, the mutual learning loss function of the migration samples is minimized:
Figure FDA0004130436430000047
thus, the semantic segmentation domain adaptive objective loss function is defined as:
Figure FDA0004130436430000048
CN202310258590.7A 2023-03-17 2023-03-17 Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity Pending CN116152666A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310258590.7A CN116152666A (en) 2023-03-17 2023-03-17 Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity

Publications (1)

Publication Number Publication Date
CN116152666A true CN116152666A (en) 2023-05-23

Family

ID=86360159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310258590.7A Pending CN116152666A (en) 2023-03-17 2023-03-17 Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity

Country Status (1)

Country Link
CN (1) CN116152666A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561536A (en) * 2023-07-11 2023-08-08 中南大学 Landslide hidden danger identification method, terminal equipment and medium
CN116561536B (en) * 2023-07-11 2023-11-21 中南大学 Landslide hidden danger identification method, terminal equipment and medium
CN116935242A (en) * 2023-07-24 2023-10-24 哈尔滨工业大学 Remote sensing image semantic segmentation method and system based on space and semantic consistency contrast learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination