WO2024061141A1

WO2024061141A1 - Method for remote-sensing sample transfer under common knowledge constraints

Info

Publication number: WO2024061141A1
Application number: PCT/CN2023/119287
Authority: WO
Inventors: 刘聪; 陈婷; 王婷; 贾若愚; 彭哲; 李洁; 邹圣兵
Original assignee: 北京数慧时空信息技术有限公司
Priority date: 2022-09-21
Filing date: 2023-09-18
Publication date: 2024-03-28
Also published as: CN115620038A

Abstract

The present invention relates to the field of remote-sensing image processing. Disclosed is a method for remote-sensing sample transfer under common knowledge constraints. The method comprises the following steps: by using a feature extraction mode, acquiring features and a public feature space of source-domain samples and target-domain samples; by using a feature clusterer, determining common features and non-common features; inputting into a generator the common features and source-domain non-common features to which random noise is added, so as to generate a pseudo sample; inputting the pseudo sample into a discriminator, and discriminating the pseudo sample according to target-domain sample data; iteratively training and optimizing the generator, so as to obtain a trained generator; and inputting source-domain sample data into the trained generator, so as to generate transferred samples. Therefore, sample transfer from source domains to target domains under common-feature constraints is realized, avoiding negative transfer.

Description

Remote sensing sample migration method with common knowledge constraints

Technical field

The invention relates to the field of remote sensing image processing, and in particular to a remote sensing sample migration method constrained by common knowledge.

Background art

In recent years, the rapid development of remote sensing technology has promoted its wide application in various fields. Among them, the real-time monitoring of the earth by multiple satellites provides massive and diverse remote sensing image data support for the development of the entire remote sensing field, laying the foundation for the rapid development of remote sensing technology. Effective utilization of massive remote sensing image data is one of the important directions for the development of the field of remote sensing.

Artificial intelligence is one of today's most active areas of technology and has considerable room for development. Its practical viability depends largely on the availability of massive data, and it is in turn well suited to exploiting such data to implement a variety of functions.

Applying artificial intelligence to remote sensing can greatly improve the utilization of massive remote sensing image data. However, most current artificial intelligence applications in remote sensing use supervised or semi-supervised learning, which cannot use massive remote sensing image data directly and instead rely on annotated remote sensing samples. Labeled remote sensing samples are difficult to obtain. Manual labeling is highly accurate but labor-intensive and inefficient, and although machine-learning-based labeling has been studied, it has not yet reached the standard required for large-scale engineering use. Therefore, how to make the fullest and most effective use of existing annotated remote sensing samples is one of the current research directions.

Transfer learning on remote sensing samples can effectively improve the utilization of existing samples. The purpose of transfer learning is to use the abundant labeled samples of a source domain for a target domain that has few or no samples. The sample features of the source domain and the target domain may be only partially correlated, or not correlated at all. Existing transfer methods such as TrAdaBoost select highly relevant source domain sample data for transfer learning, thereby improving its performance. However, this approach of selecting source domain samples has the following problem: even in the selected samples, not all features are beneficial to transfer learning. Features with low correlation have a negative impact and can even lead to negative transfer, so the approach requires the source domain and the target domain to be fully related.

Summary of the invention

The present invention proposes a remote sensing sample migration method constrained by common knowledge, which can solve the above-mentioned problems of the prior art. Without affecting the distribution of the source domain sample data, the common features of the source domain sample data and the target domain sample data are input directly into the constructed generator, and the generator is trained based on the objective function to obtain a generator under common knowledge constraints. The source domain sample data are then input into this generator to generate sample data that fit the target domain, realizing sample migration under common knowledge constraints and effectively avoiding negative transfer. At the same time, the entire sample migration process is end-to-end, enabling automatic adjustment of the model.

In order to achieve the above technical objectives, the technical solutions of the present invention are as follows:

A remote sensing sample migration method constrained by common knowledge, which includes the following steps:

S1 inputs the source domain sample data and the target domain sample data into the feature extraction model to obtain the feature data and the public feature space of the source domain sample data and the target domain sample data;

S2 inputs the feature data of the source domain sample data and the target domain sample data into a feature clusterer, determines a common feature space and a non-common feature space, and extracts common features and source domain non-common features, wherein the common feature space and the non-common feature space are subspaces of the public feature space;

S3 inputs the common features and the non-common features of the source domain added with random noise into the generator to generate pseudo samples;

S4 inputs the pseudo sample and target domain sample data into the discriminator, discriminates the pseudo sample according to the target domain sample data, and optimizes the generator according to the discrimination result and the objective function;

S5 iterates the training process from S3 to S4 until the objective function converges;

S6 inputs the source domain sample data into the trained generator to generate migration samples.

Optionally, in step S2, inputting the feature data of the source domain sample data and the target domain sample data into a feature clusterer to determine the common feature space and the non-common feature space includes:

In the public feature space, clustering the source domain sample data and the target domain sample data to obtain k groups of mixed sample data x_i, i=1,...,k, wherein the sample data within each group are correlated in their features;

Mapping the mixed sample data x_i to multiple feature subspaces F_j of the public feature space to obtain sample-feature sets (x_i, F_j);

Analyzing the distribution of (x_i, F_j):

If the distribution is correlated, the feature subspace F_j is assigned to the common feature space; if the distribution is not correlated, the feature subspace F_j is assigned to the non-common feature space.

Optionally, whether the distribution is correlated is determined as follows:

Use a probability distribution distance measurement algorithm to calculate, in the same feature subspace F_j, the fitting degree between the feature distribution of each group of mixed samples and that of the other groups of mixed samples;

According to the fitting degree, obtain the overall fitting degree of each group of mixed sample features (x_i, F_j) in the same feature subspace F_j;

Count the number M of mixed sample groups whose overall fitting degree is greater than the first preset threshold. When M is greater than or equal to the second preset threshold, the distribution in the feature subspace is determined to be correlated.

Optionally, the fitting degree is calculated according to a probability distribution distance measurement algorithm.

Optionally, the feature extraction model is:

Machine learning model using feature extraction operators;

Or a model built by a convolutional neural network;

Or a combination model of the machine learning model and the model constructed by the convolutional neural network.

Optionally, the feature extraction model is the encoder part of a convolutional autoencoder constructed by a convolutional neural network;

Correspondingly, the generator is the decoder part of the convolutional autoencoder.

Optionally, the feature extraction model is symmetrical to the structure of the generator.

Optionally, the generator and the discriminator constitute a generative adversarial network, and the objective function is the objective function of the generative adversarial network, and the objective function is:

Here, G is the generator, D is the discriminator, E is the expectation function, x is the pseudo sample data generated by the generator, p_data is the probability that x comes from the real data distribution, and p_g is the probability that x comes from the generator output samples.
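The formula itself is not reproduced in this text. As a hedged reading only, assuming the objective is the standard minimax value function of a generative adversarial network (an assumption consistent with the symbols G, D, E, p_data and p_g defined above, not something this text confirms), it could be written as:

```latex
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{x \sim p_g}\left[\log\left(1 - D(x)\right)\right]
```

Under this reading, the first expectation is taken over target domain samples and the second over pseudo samples produced by the generator.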

The present invention proposes a remote sensing sample migration method constrained by common knowledge. Feature extraction is performed on the source domain and target domain sample data to obtain the corresponding feature data. Cluster analysis of the feature data yields the common features and non-common features of the source domain and target domain samples. The common features and the noise-added source domain non-common features are input into the generator to generate pseudo samples under the constraint of the common features. The pseudo samples and the target domain samples are input into the discriminator, and the generator is iteratively trained and optimized based on the discrimination results and the objective function, yielding a generator that can migrate source domain features to the target domain. Finally, the source domain samples are input into the generator to directly obtain migration samples that fit the target domain. The beneficial effects of the present invention are:

(1) With the present invention, source domain sample data can be input into a generator under common knowledge constraints, without affecting the distribution of the source domain sample data, to generate sample data that fit the target domain. This achieves sample migration under common knowledge constraints and effectively avoids negative transfer.

(2) The sample migration framework constructed by the present invention supports fully automatic training and adjustment of the model, and realizes an end-to-end sample migration process.

Description of drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for the embodiments are briefly introduced below. The drawings described below illustrate only some embodiments of the present invention. Those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

Figure 1 is a schematic flow chart of an embodiment of the common knowledge constrained remote sensing sample migration method of the present invention;

Figure 2 is a schematic diagram of sample migration model data transmission in one embodiment of the common knowledge-constrained remote sensing sample migration method of the present invention;

Figure 3 is a schematic diagram of using a trained autoencoder to perform sample migration on source domain samples in one embodiment of the common knowledge constrained remote sensing sample migration method of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art fall within the scope of protection of the present invention.

Please refer to Figure 1, which is a schematic flow chart of an embodiment of the common knowledge-constrained remote sensing sample migration method according to the present invention. Compared with traditional transfer learning models, this method generates samples that fit the target domain from input source domain samples, realizing sample migration under common knowledge constraints and effectively avoiding negative transfer. The method includes the following steps:

S1 inputs the source domain sample data and the target domain sample data into the feature extraction model to obtain the feature data and the public feature space of the source domain sample data and the target domain sample data.

It should be noted that the embodiment of the present invention addresses the migration of remote sensing samples, which belongs to homogeneous transfer learning: the source domain and target domain sample data lie in the same feature space, namely the public feature space.

Optionally, the feature extraction model is: a machine learning model using a feature extraction operator, or a convolutional neural network model, or a combination of the above models.

In this embodiment, the encoder part of a convolutional autoencoder built from a convolutional neural network is used as the feature extraction model. An autoencoder is a powerful feature detector that learns, without supervision, to represent its input data efficiently as features. The encoder of the convolutional autoencoder adopts a three-layer convolutional neural network structure. The first layer has 16 convolution kernels of size 3×3 with a stride of 1; the second layer has 8 convolution kernels of size 3×3 with a stride of 1; the third layer has 8 convolution kernels of size 3×3 with a stride of 1. Each convolutional layer is followed by a 2×2 max pooling layer for dimensionality reduction and feature compression.
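The effect of the three convolution and pooling stages on the feature map size can be traced with a short sketch. This is illustrative only: the padding mode and the input tile size are not given in the text, so "same" padding and a 64×64 input are assumptions, and the function names are hypothetical.

```python
# Hypothetical shape trace for the three-layer encoder described above.
# Assumptions (not stated in the text): "same" padding for the 3x3
# convolutions and a 64x64 input tile.

def conv_same(h, w, stride=1):
    """Spatial size after a convolution with 'same' padding."""
    # With 'same' padding the output size is ceil(input / stride).
    return -(-h // stride), -(-w // stride)


def max_pool(h, w, size=2):
    """Spatial size after non-overlapping size x size max pooling."""
    return h // size, w // size


def encoder_shapes(h, w, kernels=(16, 8, 8)):
    """Return (height, width, channels) after each conv + pool stage."""
    shapes = []
    for n_kernels in kernels:
        h, w = conv_same(h, w, stride=1)   # 3x3 convolution, stride 1
        h, w = max_pool(h, w, size=2)      # 2x2 max pooling halves each side
        shapes.append((h, w, n_kernels))
    return shapes


shapes = encoder_shapes(64, 64)
print(shapes)
```

Under these assumptions each stage halves the spatial size, so a 64×64 tile is compressed to an 8×8 map with 8 channels.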

S2 inputs the feature data of the source domain sample data and the target domain sample data into the feature clusterer, determines the common feature space and the non-common feature space, and extracts the common features and the source domain non-common features, where the common feature space and the non-common feature space are subspaces of the public feature space.

It should be noted that the sample data of the source domain and the target domain both lie in the public feature space but are distributed differently. The feature data can be divided into common features and non-common features according to the data distribution. The subspace where the common features lie is the common feature space, and the subspace where the non-common features lie is the non-common feature space.

In this embodiment, in the public feature space, the source domain sample data and the target domain sample data are divided into k groups through collaborative clustering to obtain k groups of mixed sample data x_i, i=1,...,k. The sample data within each group are correlated in their features. The sample feature values are normalized to facilitate subsequent analysis and input into the generative adversarial network. The mixed sample data x_i are mapped to multiple feature subspaces F_j to obtain sample-feature sets (x_i, F_j). The distribution of (x_1, F_j), (x_2, F_j), ..., (x_k, F_j) is analyzed. If the distribution is correlated, the feature subspace F_j is assigned to the common feature space; if the distribution is not correlated, F_j is assigned to the non-common feature space.
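One way to picture the sample-feature sets is as one normalized histogram per pair of group and feature subspace. The sketch below is a minimal illustration under stated assumptions: each feature dimension is treated as its own one-dimensional subspace F_j, the group labels are taken as given (the clustering step itself is not shown), and min-max normalization, the bin count and all function names are hypothetical choices.

```python
# Hypothetical sketch: build one histogram per (group x_i, subspace F_j).
# Assumptions: each feature dimension is one subspace, group labels come
# from a clustering step not shown here, min-max normalization is used.

def min_max_normalize(values):
    """Scale a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def histogram(values, bins=4, eps=1e-6):
    """Smoothed discrete distribution of values that lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    total = len(values) + eps * bins
    # eps keeps every bin strictly positive for later divergence measures.
    return [(c + eps) / total for c in counts]


def sample_feature_histograms(features, groups, bins=4):
    """Map (group, subspace index) to the group's distribution there."""
    result = {}
    for j in range(len(features[0])):
        column = min_max_normalize([f[j] for f in features])
        for g in sorted(set(groups)):
            vals = [v for v, gi in zip(column, groups) if gi == g]
            result[(g, j)] = histogram(vals, bins)
    return result


features = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0]]
groups = [0, 0, 1, 1]  # hypothetical cluster assignment
hists = sample_feature_histograms(features, groups, bins=2)
print(hists[(0, 0)], hists[(1, 0)])
```

The resulting distributions are the inputs a distance measure such as the KL divergence would compare across groups within the same subspace.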

Whether the distribution is correlated is determined as follows:

Calculate the fitting degree between the feature distribution of each group of mixed samples and that of the other groups of mixed samples in the same feature subspace F_j;

According to the fitting degree, obtain the overall fitting degree of each group of mixed sample features (x_i, F_j) in the same feature subspace F_j;

Specifically, the correlation analysis of the feature distribution of the mixed samples in F_1 proceeds as follows. A probability distribution distance measurement algorithm is used to calculate the fitting degree between the feature distribution of each group of mixed samples and that of the other groups of mixed samples in the same feature subspace F_j. From these fitting degrees, the overall fitting degree of each group of mixed sample features (x_i, F_j) in the same feature subspace F_j is obtained, and the number M of mixed sample groups whose overall fitting degree is greater than the first preset threshold is counted. When M is greater than or equal to the second preset threshold, the distribution in the feature subspace is determined to be correlated. The fitting degree can be expressed by the KL divergence: the smaller the KL divergence, the higher the fitting degree. The overall fitting degree is a set of KL divergences. The second preset threshold can be 80% of the total number, or another value; this embodiment does not limit it. For example, the KL divergence is calculated for every two groups of mixed sample data (x_i, F_1), yielding a set of KL divergence values, each of which is compared with the preset threshold. If the overall fitting degree corresponding to more than 80% of the KL divergence values is greater than the first preset threshold, the distribution in the feature subspace F_1 is determined to be correlated and F_1 is assigned to the common feature space; otherwise, F_1 is assigned to the non-common feature space. In this way, every feature subspace F_j is assigned to either the common feature space or the non-common feature space.
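The thresholding logic above can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes each group's feature distribution in F_j is already available as a discrete histogram with strictly positive bins, it summarizes a group's overall fitting degree as the mean symmetric KL divergence to the other groups (an assumption, since the text only says the overall fitting degree is a set of KL divergences), and the threshold values and function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def is_common_subspace(group_hists, kl_threshold=0.1, ratio=0.8):
    """Decide whether one feature subspace F_j counts as common.

    group_hists: one histogram per mixed-sample group x_i in F_j.
    A group is well fitted when its mean symmetric KL divergence to the
    other groups is below kl_threshold (low divergence means high fit,
    so this plays the role of the first preset threshold).
    The subspace is common when the number M of well-fitted groups is at
    least ratio * k (the second preset threshold, 80% by default).
    """
    k = len(group_hists)
    well_fitted = 0
    for i in range(k):
        divergences = [
            0.5 * (kl_divergence(group_hists[i], group_hists[j])
                   + kl_divergence(group_hists[j], group_hists[i]))
            for j in range(k) if j != i
        ]
        if sum(divergences) / len(divergences) < kl_threshold:
            well_fitted += 1
    return well_fitted >= ratio * k


# Three groups with nearly identical distributions in one subspace.
similar = [[0.25, 0.25, 0.25, 0.25],
           [0.24, 0.26, 0.25, 0.25],
           [0.26, 0.24, 0.25, 0.25]]
# Three groups concentrated in different bins of another subspace.
different = [[0.97, 0.01, 0.01, 0.01],
             [0.01, 0.97, 0.01, 0.01],
             [0.01, 0.01, 0.97, 0.01]]

print(is_common_subspace(similar), is_common_subspace(different))
```

Because the KL divergence is asymmetric, the sketch averages both directions; the text does not say which direction, if any, the method prefers.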

It can be understood that, in the same feature space, the sample data of the source domain and the target domain have both relevant and irrelevant features. Collaborative clustering clusters the sample data and the feature data in the same feature space simultaneously, which intuitively reflects the relationship between the source domain and target domain sample data and their features. The present invention aims to distinguish relevant features from irrelevant features and assign them to the common feature space and the non-common feature space respectively. The distribution of the normalized feature values extracted from the samples in a given feature subspace reflects the characteristics of the samples in that subspace. In a common feature subspace, the source domain sample data and the target domain sample data should be consistently distributed, and different groups of data should also have similar distributions. Therefore, by analyzing the correlation between the feature distributions of different groups of data in the same feature subspace, each feature subspace can be assigned to the common feature space or the non-common feature space.

With the present invention, sample migration under common knowledge constraints can be realized without affecting the distribution of the source domain sample data, effectively avoiding negative transfer.

S3 inputs the common features and the non-common features of the source domain added with random noise into the generator to generate pseudo samples.

S4 inputs the pseudo sample and the target domain sample data into the discriminator, discriminates the pseudo sample according to the target domain sample data, and optimizes the generator according to the discrimination result and the objective function.

S5 iterates the training process of S3 to S4 until the objective function converges.

It should be noted that the generator used in the present invention is structurally symmetrical to the feature extraction model. The purpose is to allow the extracted common features to be input directly into the generator so that samples are generated with minimal loss. The entire transfer learning model and data flow of an embodiment of the present invention are shown in Figure 2. Inputting the common features directly into the generator constrains generation by the common features and ensures that, without changing the common features, the generated samples fit the target domain samples as closely as possible.
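The idea of passing common features through unchanged while perturbing only the source domain non-common features can be sketched as below. This is a hypothetical illustration: the text does not specify the noise distribution or how the two feature groups are combined, so Gaussian noise, concatenation, clipping to [0, 1] and the function name are all assumptions.

```python
import random


def build_generator_input(common, non_common, noise_std=0.05, seed=None):
    """Assemble a generator input vector from two feature groups.

    Assumptions: features are already normalized to [0, 1]; the noise is
    Gaussian; the two groups are simply concatenated.
    """
    rng = random.Random(seed)
    # Only the non-common features are perturbed, then clipped to [0, 1].
    noisy = [min(1.0, max(0.0, v + rng.gauss(0.0, noise_std)))
             for v in non_common]
    # The common features are copied unchanged, so they act as a constraint.
    return list(common) + noisy


common = [0.2, 0.8, 0.5]
non_common = [0.1, 0.9]
z = build_generator_input(common, non_common, seed=0)
print(z)
```

Keeping the common part untouched mirrors the stated aim that generated samples fit the target domain without changing the common features.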

In this embodiment, the decoder part of the convolutional autoencoder is used as the generator, and the decoder structure is symmetrical to the encoder structure. The decoder adopts a three-layer deconvolution neural network structure. The first layer has 8 convolution kernels of size 3×3 with a stride of 1; the second layer has 8 convolution kernels of size 3×3 with a stride of 1; the third layer has 16 convolution kernels of size 3×3 with a stride of 1. Each deconvolution layer is followed by a 2×2 upsampling layer to restore the image size.
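A matching shape trace for the decoder shows how the upsampling stages could restore the spatial size. As before, this is only a sketch: size-preserving ("same") padding for the stride-1 deconvolutions and an 8×8 input feature map are assumptions, and the function name is hypothetical.

```python
# Hypothetical shape trace for the three-layer decoder described above.
# Assumptions: stride-1 deconvolutions keep the spatial size ("same"
# padding) and the input is an 8x8 feature map.

def decoder_shapes(h, w, kernels=(8, 8, 16), scale=2):
    """Return (height, width, channels) after each deconv + upsample stage."""
    shapes = []
    for n_kernels in kernels:
        # The stride-1 deconvolution leaves h and w unchanged here;
        # the 2x2 upsampling layer then doubles each side.
        h, w = h * scale, w * scale
        shapes.append((h, w, n_kernels))
    return shapes


dec_shapes = decoder_shapes(8, 8)
print(dec_shapes)
```

Under these assumptions the last stage outputs 16 channels at the restored size; the text does not describe how these are mapped to the bands of the output image.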

In this embodiment, the same autoencoder is used, which improves the reusability of the model and reduces the cost of model construction. The generator and the discriminator form a non-traditional generative adversarial network whose input is features. In the objective function of the generative adversarial network, G is the generator, D is the discriminator, E is the expectation function, x is the pseudo sample data generated by the generator, p_data is the probability that x comes from the real data distribution, and p_g is the probability that x comes from the generator output samples. The common features and the source domain non-common features with added random noise are normalized and then input into the decoder part of the convolutional autoencoder to generate pseudo samples. The pseudo samples and the target domain sample data are input into the discriminator, which discriminates the pseudo samples based on the target domain sample data, and the entire generative adversarial network is optimized based on the discrimination results and the objective function. This training process is iterated until the objective function converges.
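The iterate-until-convergence loop can be illustrated with a toy monitor. The sketch assumes the standard GAN value function E[log D(x_real)] + E[log(1 - D(x_fake))], which this text does not spell out, and a simple stopping rule (the last few objective values differ by less than a tolerance) that is likewise an assumption. The discriminator outputs are made-up numbers, not results.

```python
import math


def gan_value(d_real, d_fake):
    """Batch estimate of E[log D(x_real)] + E[log(1 - D(x_fake))]."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


def has_converged(history, window=3, tol=1e-3):
    """True when the last `window` objective values vary by less than tol."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol


# Hypothetical discriminator outputs (target samples, pseudo samples)
# over successive training iterations; illustrative values only.
batches = [
    ([0.9, 0.8], [0.1, 0.2]),
    ([0.6, 0.6], [0.4, 0.4]),
    ([0.5, 0.5], [0.5, 0.5]),
    ([0.5, 0.5], [0.5, 0.5]),
    ([0.5, 0.5], [0.5, 0.5]),
]

history = []
for d_real, d_fake in batches:
    history.append(gan_value(d_real, d_fake))
    if has_converged(history):
        break

print(len(history), history[-1])
```

When the discriminator outputs 0.5 for every sample it can no longer tell pseudo samples from target domain samples, and the value function settles at -log 4.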

It can be understood that the trained generator receives source domain sample data and generates sample data that fit the target domain, thereby realizing sample migration. The sample migration process in this embodiment is shown in Figure 3: after passing through the autoencoder, source domain samples yield migration samples that fit the target domain.

The sample migration framework constructed by the present invention supports fully automatic training and adjustment of the model, and realizes an end-to-end sample migration process. After inputting the source domain samples, the samples fitting the target domain are automatically generated.

It should be noted that, in this document, the terms "comprising", "comprises" or any other variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article or device. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article or device that includes that element.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

A remote sensing sample migration method subject to common knowledge constraints, which is characterized by including the following steps:

S1 inputs the source domain sample data and the target domain sample data into the feature extraction model to obtain the feature data and the public feature space of the source domain sample data and the target domain sample data;

S2 inputs the feature data of the source domain sample data and the target domain sample data into a feature clusterer, determines the common feature space and the non-common feature space, and extracts the common features and the source domain non-common features, wherein the common feature space and the non-common feature space are subspaces of the public feature space;

S3 inputs the common features and non-common features of the source domain added with random noise into the generator to generate pseudo samples;

S4 inputs the pseudo sample and target domain sample data into the discriminator, discriminates the pseudo sample according to the target domain sample data, and optimizes the generator according to the discrimination result and the objective function;

S5 iterates the training process from S3 to S4 until the objective function converges;

S6 inputs the source domain sample data into the trained generator to generate migration samples.
The remote sensing sample migration method constrained by common knowledge according to claim 1, characterized in that, in step S2, inputting the feature data of the source domain sample data and the target domain sample data into a feature clusterer to determine the common feature space and the non-common feature space includes:

In the public feature space, clustering the source domain sample data and the target domain sample data to obtain k groups of mixed sample data x_i, i=1,...,k, wherein the sample data within each group are correlated in their features;

Mapping the mixed sample data x_i to multiple feature subspaces F_j of the public feature space to obtain sample-feature sets (x_i, F_j);

Analyzing the distribution of (x_i, F_j):

If the distribution is correlated, the feature subspace F_j is assigned to the common feature space; if the distribution is not correlated, the feature subspace F_j is assigned to the non-common feature space.
The remote sensing sample migration method constrained by common knowledge according to claim 2, characterized in that whether the distribution is correlated is determined as follows:

Calculate, in the same feature subspace F_j, the fitting degree between the feature distribution of each group of mixed samples and that of the other groups of mixed samples;

According to the fitting degree, obtain the overall fitting degree of each group of mixed sample features (x_i, F_j) in the same feature subspace F_j;

Count the number M of mixed sample groups whose overall fitting degree is greater than the first preset threshold; when M is greater than or equal to the second preset threshold, the distribution in the feature subspace is determined to be correlated.
The remote sensing sample migration method constrained by common knowledge according to claim 3, characterized in that the degree of fitting is calculated according to a probability distribution distance measurement algorithm.
The remote sensing sample migration method constrained by common knowledge according to claim 1, characterized in that the feature extraction model is:

Machine learning model using feature extraction operators;

Or a model built by a convolutional neural network;

Or a combination model of the machine learning model and the model constructed by the convolutional neural network.
The remote sensing sample migration method constrained by common knowledge according to claim 5, characterized in that the feature extraction model is the encoder part of a convolutional autoencoder constructed by a convolutional neural network;

Correspondingly, the generator is the decoder part of the convolutional autoencoder.
The remote sensing sample migration method constrained by common knowledge according to claim 6, characterized in that the feature extraction model is symmetrical to the structure of the generator.
The remote sensing sample migration method constrained by common knowledge according to claim 1, characterized in that the generator and the discriminator constitute a generative adversarial network, the objective function is the objective function of the generative adversarial network, and the The objective function is:

Here, G is the generator, D is the discriminator, E is the expectation function, x is the pseudo sample data generated by the generator, p_data is the probability that x comes from the real data distribution, and p_g is the probability that x comes from the generator output samples.