CN116977744A - Small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces - Google Patents

Small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces

Info

Publication number
CN116977744A
Authority
CN
China
Prior art keywords
domain
source domain
subspace
feature
target domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310979029.8A
Other languages
Chinese (zh)
Inventor
Mu Caihong
Zhang Fugui
Liu Yi
Chen Suling
Wang Rongfang
Feng Jie
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202310979029.8A
Publication of CN116977744A
Legal status: Pending


Classifications

    • G06V 10/764: Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/0475: Generative networks
    • G06N 3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06N 3/094: Adversarial learning
    • G06V 10/454: Integrating filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/58: Extraction of image or video features relating to hyperspectral data
    • G06V 10/7753: Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/776: Validation; performance evaluation
    • G06V 10/82: Image or video recognition using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces, which mainly addresses the low utilization of embedded feature information, the high training cost and the low accuracy of conventional methods. The scheme comprises the following steps: acquiring hyperspectral data sets and dividing them into a source domain and a target domain to obtain a training set and a test set; constructing a mapping layer and a feature extractor, and using them to extract the embedded features of the source domain and the target domain for subspace classification and contrastive learning; constructing a domain discriminator and calculating the total loss composed of the domain discrimination loss, the subspace losses and the contrastive loss of the source domain and the target domain; iteratively training the feature extractor with the training set by back propagation until the total loss function converges; and inputting the test set into the trained feature extractor to obtain the classification result. The invention improves the discrimination capability and classification accuracy of the classifier, raises the utilization of embedded feature information, reduces training cost, and can be used for ground-object classification in resource exploration, forest-cover monitoring and disaster monitoring.

Description

Small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces
Technical Field
The invention belongs to the technical field of image processing, and further relates to a small-sample cross-domain hyperspectral image classification method applicable to ground-object classification and target recognition in resource exploration, forest-cover monitoring and disaster monitoring.
Background
Compared with traditional images, a hyperspectral image (HSI) contains many continuous wave bands and a large amount of spectral and spatial information, and thanks to this richness of information it is widely applied in fields such as environment monitoring and disaster prevention. In practical remote sensing applications, the number of labelled samples available for training is often limited, a challenge called the small-sample classification task. Early on, many machine learning methods such as principal component analysis and filtering were developed for small-sample scene classification, but their classification results were not ideal. Later, with the development of deep learning, various neural networks made great progress in large-sample hyperspectral image classification, yet progress in the small-sample field remained slow. In recent years, many methods such as transfer learning, active learning, contrastive learning and meta-learning have appeared to address the small-sample problem, and specialized small-sample learning methods such as prototype networks, Siamese networks and relation networks have also been developed. Today, researchers are beginning to focus on extending small-sample learning to cross-domain settings, solving hyperspectral image classification where the source-domain data set differs from the target-domain data.
Gao et al., in their published paper "Deep relation network for hyperspectral image few-shot classification," Remote Sens., vol. 12, no. 6, p. 923, Mar. 2020, propose a small-sample classification method based on a relation network. The method comprises three steps: first, meta-learning is performed on a source-domain HSI data set with the designed feature-learning module and relation-learning module; second, the network model is fine-tuned with a small number of labelled samples of the target-domain data set; third, classification performance is tested with the target HSI data set. This method relies on fine-tuning to adjust the domain offset between the source domain and the target domain, which is inefficient and harms the stability and robustness of the network; meanwhile, the relation network it adopts leaves considerable room for improvement.
Li et al., in their published paper "Deep cross-domain few-shot learning for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1-18, 2021, propose to overcome domain bias with a conditional adversarial domain adaptation strategy to achieve domain distribution alignment. First, a unified framework is established, and the source-domain and target-domain data are mapped to the same dimension through a mapping layer; embedded features are then extracted with the same feature extractor while small-sample learning is carried out in the source domain and the target domain simultaneously; finally, the source-domain and target-domain data are discriminated with a conditional domain discriminator, relieving the inter-domain offset and realizing cross-domain small-sample classification. This method classifies with a prototype-network-based small-sample approach in which the prototype is the mean of the embedded features, so the utilization of embedded feature information is limited and higher classification accuracy cannot be reached.
Zhang et al., in their published paper "Graph information aggregation cross-domain few-shot learning for hyperspectral image classification," IEEE Trans. Neural Netw. Learn. Syst., early access, Jun. 30, 2022, propose a graph-information-aggregation cross-domain few-shot learning framework. The implementation steps are as follows: first, training and test sets are divided on the source domain and the target domain and mapped to the same dimension through a mapping layer; second, after the embedded features of the source domain and the target domain are extracted, the similarity between the support set and the query set is learned with a metric function; third, the IDE blocks and CSA blocks of the source domain and the target domain are designed, and domain alignment is performed at the two levels of graph features and graph distribution. The drawback of this method is that it over-processes the embedded features of the source domain and the target domain, so the training cost is high and the utilization of other information is limited.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces, so as to raise the utilization of embedded feature information, reduce training cost and further improve classification accuracy.
The technical idea for realizing the purpose of the invention is as follows: the embedded features of the source domain and the target domain are classified with a subspace-improved prototype network, which raises the utilization of embedded feature information; an overall model connecting the source domain and the target domain is constructed through a domain discriminator, realizing domain alignment while reducing training cost; and contrastive learning is introduced to analyse the correlations and differences between source-domain samples, which, combined with the subspace classification and domain discrimination, further improves the classification accuracy of cross-domain small samples.
According to the above idea, the implementation steps of the invention include the following:
(1) Acquiring a hyperspectral data set and dividing:
(1a) G hyperspectral data sets are obtained from a public website, one data set is used as a source domain, and the remaining G-1 data sets are used as target domains;
(1b) C categories are selected from the source-domain data set, and K source-domain samples per category are selected to form a source-domain support set $S_s$; then N further source-domain samples per category are selected to form a source-domain query set $Q_s$;
(1c) H labelled categories are selected from the target-domain data set, and K labelled target-domain samples per category are selected to form a target-domain support set $S_t$; then N further labelled target-domain samples per category are selected to form a target-domain query set $Q_t$; the target-domain support set $S_t$ and the target-domain query set $Q_t$ together form a training set $T_r$; the remaining unlabelled target-domain samples are taken as a test set $T_e$;
(2) A mapping layer comprising a two-dimensional convolution layer and a BatchNorm2d layer is constructed; a feature extractor $f_\theta$ comprising two residual blocks, two max-pooling layers, one convolution layer and one flatten layer is constructed;
(3) Subspace classification is carried out on embedded features of a source domain and a target domain:
(3a) The source-domain support set $S_s$, source-domain query set $Q_s$, target-domain support set $S_t$ and target-domain query set $Q_t$ are respectively input to the mapping layer and the feature extractor $f_\theta$ to obtain source-domain support-set embedded features $f_\theta(S_s)$, source-domain query-set embedded features $f_\theta(Q_s)$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$;
(3b) A source-domain subspace is constructed with the source-domain support-set embedded features $f_\theta(S_s)$, and the source-domain subspace loss $l_S$ of the source-domain query-set embedded features $f_\theta(Q_s)$ in the source-domain subspace is calculated;
(3c) The source-domain support loss $\hat{l}_S$ of the source-domain support-set embedded features $f_\theta(S_s)$ in the source-domain subspace is calculated;
(3d) A target-domain subspace is constructed with the target-domain support-set embedded features $f_\theta(S_t)$, and the target-domain subspace loss $l_T$ of the target-domain query-set embedded features $f_\theta(Q_t)$ in the target-domain subspace is calculated;
(3e) The target-domain support loss $\hat{l}_T$ of the target-domain support-set embedded features $f_\theta(S_t)$ in the target-domain subspace is calculated;
(4) The source-domain support set $S_s$ is divided into contrast groups of C samples each, which are input to the mapping layer and the feature extractor to calculate the contrastive loss $l_{CL}$;
(5) A domain discriminator $f_D$ comprising five fully connected layers, four ReLU nonlinear activation functions, four dropout layers and a Softmax activation function is constructed;
(6) The domain discrimination loss $l_D$ is calculated:
(6a) The source-domain subspace loss $l_S$, source-domain support loss $\hat{l}_S$, source-domain support-set embedded features $f_\theta(S_s)$ and source-domain query-set embedded features $f_\theta(Q_s)$ are fused to obtain a source-domain fusion feature $T(h_s)$;
(6b) The target-domain subspace loss $l_T$, target-domain support loss $\hat{l}_T$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$ are fused to obtain a target-domain fusion feature $T(h_t)$;
(6c) The source-domain and target-domain fusion features $T(h_s)$, $T(h_t)$ are input to the domain discriminator $f_D$ to calculate the domain discrimination loss $l_D$:

$$l_D = -\mathbb{E}_{x_s \sim P_s(x)}\big[\log D(T(h_s))\big] - \mathbb{E}_{x_t \sim P_t(x)}\big[\log\big(1 - D(T(h_t))\big)\big]$$

where D and T respectively denote the domain discriminator and the fusion map of the feature extractor, $x_s$, $x_t$ respectively denote a source-domain sample and a target-domain sample, $P_s(x)$, $P_t(x)$ respectively denote the source-domain distribution and the target-domain distribution, $D(T(h_s))$ denotes the probability that the discriminator $f_D$ predicts the source-domain fusion feature $T(h_s)$ to originate from the source domain, $1 - D(T(h_t))$ denotes the probability that the discriminator $f_D$ predicts the target-domain fusion feature $T(h_t)$ to originate from the target domain, and $\mathbb{E}$ denotes the expectation;
(7) Computing a total loss function to train the feature extractor:
The total loss function composed of the source-domain subspace loss $l_S$, target-domain subspace loss $l_T$, contrastive loss $l_{CL}$ and domain discrimination loss $l_D$ is calculated: $l = \lambda_1 l_S + \lambda_2 l_T + \lambda_3 l_{CL} + l_D$, where $\lambda_1$, $\lambda_2$, $\lambda_3$ are the weights of $l_S$, $l_T$, $l_{CL}$ respectively;
The feature extractor $f_\theta$ is iteratively trained by back propagation until the total loss function converges, giving a trained feature extractor $f_\theta'$;
(8) Classifying by using test set data:
(8a) The training set $T_r$ is input to the mapping layer and the trained feature extractor $f_\theta'$ to obtain training-set embedded features $f_\theta'(T_r)$, from which a training-set subspace is constructed;
(8b) The test set $T_e$ is input to the mapping layer and the trained feature extractor $f_\theta'$ to obtain test-set embedded features $f_\theta'(T_e)$, and their test subspace loss $l'$ in the training-set subspace is calculated;
(8c) The class with the smallest test subspace loss $l'$ is taken as the predicted class, completing the classification task.
Compared with the prior art, the invention has the following advantages:
First, the invention uses subspace classification to calculate the losses of the support-set and query-set embedded features, which overcomes the problem that a conventional prototype network takes only the mean of the embedded features as the prototype and thereby under-uses multidimensional data; this improves the discrimination capability and generalization performance of the classifier and raises the utilization of embedded feature information.
Second, the invention uses a domain discriminator to project the fusion features of the source domain and the target domain into a common space, avoiding the complex setup and tuning that conventional domain adaptation and transfer methods require for cross-domain data; at the same time, since feature learning and representation learning are carried out on the cross-domain data in a generative-adversarial manner, an overall model connecting the source domain and the target domain is constructed to achieve domain alignment and reduce training cost.
Third, the invention trains the source-domain support-set data with contrastive learning, overcoming the accuracy bottleneck of conventional methods that ignore the similarities and differences between same-class samples, and further improving the classification accuracy of cross-domain small samples.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a schematic diagram of a feature extractor constructed in accordance with the present invention;
fig. 3 is a schematic diagram of a domain discriminator constructed in the invention.
Detailed Description
The invention will be described in further detail below with reference to the drawings and specific embodiments; it should be understood that the specific embodiments are for illustration only and do not limit the invention.
Referring to fig. 1, the implementation steps for this example include the following:
step 1, carrying out cross-domain small sample division on a data set.
Cross-domain small-sample classification refers to classification when few labelled samples are available for training and the source-domain data differ from the target-domain data.
The specific implementation of the steps is as follows:
G hyperspectral data sets are obtained from public websites; one data set is used as the source domain and the remaining G-1 data sets as target domains;
For the source-domain data set, C classes are randomly selected from it and K samples are randomly drawn from each class as the source-domain support set, a setting referred to as C-way K-shot;
The source-domain support set is denoted $S_s = \{(x_i, y_i)\}_{i=1}^{C \times K}$; then N samples per class are selected from the C classes as the source-domain query set $Q_s = \{(x_j, y_j)\}_{j=1}^{C \times N}$, where x denotes a sample and y denotes its label;
For the target-domain data set, H labelled categories are selected from it, and K labelled target-domain samples per category are selected as the target-domain support set $S_t$; then N further labelled target-domain samples per category are selected to form the target-domain query set $Q_t$;
The target-domain support set $S_t$ and the target-domain query set $Q_t$ together form the training set $T_r$, and the remaining unlabelled target-domain samples are taken as the test set $T_e$.
This example selects, but is not limited to, C=19, K=2, N=19 and G=4, with H set to 16, 9 and 16 respectively for the 3 target-domain data sets.
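As a concrete illustration of this episode division, the following minimal sketch draws a C-way K-shot support set and an N-per-class query set from a labelled data set. It is an assumption-laden example, not the patent's code; the helper name sample_episode is hypothetical, and samples/labels are assumed to be given as tensors:

```python
# Minimal episode sampler: draws K support and N query samples per class.
import torch

def sample_episode(x, y, n_classes, k_shot, n_query):
    all_classes = torch.unique(y)
    classes = all_classes[torch.randperm(len(all_classes))[:n_classes]]
    sup, qry = [], []
    for c in classes:
        idx = torch.nonzero(y == c).flatten()
        idx = idx[torch.randperm(len(idx))]
        sup.append(idx[:k_shot])                     # K support samples
        qry.append(idx[k_shot:k_shot + n_query])     # N query samples
    sup, qry = torch.cat(sup), torch.cat(qry)
    return (x[sup], y[sup]), (x[qry], y[qry])

# e.g. source domain with C=19, K=2, N=19:
# (Ss, ys), (Qs, yq) = sample_episode(x_src, y_src, 19, 2, 19)
```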
Step 2. A mapping layer and a feature extractor are constructed.
2.1 Building a mapping layer:
2.1.1) The two-dimensional convolution layer takes the dimension of the input data as its number of input channels, has 100 output channels, a 1×1 convolution kernel and a stride of 1;
2.1.2) Two-dimensional batch normalization $y = \gamma \cdot \frac{x - \mathrm{mean}(x)}{\sqrt{\mathrm{var}(x) + eps}} + \beta$ is selected as the BatchNorm2d layer, where x denotes the input data, mean(x) denotes the mean, var(x) denotes the variance, eps is a very small constant, and γ, β are two adjustment parameters with different values; in this example γ=1 and β=0;
2.1.3) The two-dimensional convolution layer and the BatchNorm2d layer are connected in sequence to form the mapping layer, which maps the source domain and the target domain to the same dimension so that both can be trained in the same feature extractor;
2.2 Constructing a feature extractor:
referring to fig. 2, the implementation of this step includes the following:
2.2.1) A convolution block consisting of a 3D convolution layer and a BatchNorm3d layer is selected; the 3D convolution layer has a 3×3×3 convolution kernel, a stride of 1 and a padding of 1; each ReLU layer computes ReLU(x) = max(x, 0);
2.2.2) Two residual blocks with the same structure are selected; each residual block comprises 3 convolution blocks and 3 ReLU layers connected as: first convolution block, first ReLU layer, second convolution block, second ReLU layer, third convolution block, third ReLU layer, with the output of the first ReLU layer residually connected to the third convolution block;
2.2.3) Two max-pooling layers with different parameters are selected: the first pooling layer has a pooling kernel size of 4×2, a padding of 0×1 and a stride of 4×2; the second pooling layer has a pooling kernel size of 4×2, a padding of 4×2 and a stride of 2×1;
2.2.4) A convolution layer with 32 output channels and a 3×3×3 convolution kernel is selected;
2.2.5) The first residual block, first pooling layer, second residual block, second pooling layer and convolution layer are connected in sequence to form the feature extractor $f_\theta$, which is trained to produce the embedded features of the input data.
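The mapping layer and feature extractor described above can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions: the channel widths of the residual blocks are illustrative choices, the 3-D pooling parameters are adapted to valid PyTorch values (the padding listed in the patent text appears garbled), and the flatten layer is realised with nn.Flatten:

```python
import torch
import torch.nn as nn

class MappingLayer(nn.Module):
    """1x1 2-D convolution + BatchNorm2d: maps either domain to 100 bands."""
    def __init__(self, in_bands: int, out_bands: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_bands, out_bands, kernel_size=1, stride=1),
            nn.BatchNorm2d(out_bands),
        )
    def forward(self, x):                    # x: (B, bands, H, W)
        return self.net(x)

def conv_block(c_in, c_out):
    # convolution block: Conv3d(3x3x3, stride 1, padding 1) + BatchNorm3d
    return nn.Sequential(nn.Conv3d(c_in, c_out, 3, 1, 1), nn.BatchNorm3d(c_out))

class ResidualBlock(nn.Module):
    """Three convolution blocks with ReLUs; skip connection from the first
    ReLU output to the third convolution block, as in step 2.2.2)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.b1 = conv_block(c_in, c_out)
        self.b2 = conv_block(c_out, c_out)
        self.b3 = conv_block(c_out, c_out)
        self.relu = nn.ReLU()
    def forward(self, x):
        h = self.relu(self.b1(x))
        y = self.relu(self.b2(h))
        return self.relu(self.b3(y) + h)     # residual connection

class FeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            ResidualBlock(1, 8),
            nn.MaxPool3d((4, 2, 2), stride=(4, 2, 2), padding=(0, 1, 1)),
            ResidualBlock(8, 16),
            nn.MaxPool3d((4, 2, 2), stride=(2, 1, 1), padding=(2, 1, 1)),
            nn.Conv3d(16, 32, kernel_size=3),
            nn.Flatten(),                    # the flatten layer
        )
    def forward(self, x):                    # x: (B, 1, bands, H, W)
        return self.body(x)
```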
Step 3. Subspace classification is performed on the embedded features of the source domain and the target domain.
At present, common small-sample learning classification methods include Siamese networks, relation networks, prototype networks and the like. A Siamese network classifies by minimizing the loss between embedded features of the same class and maximizing the loss between embedded features of different classes; a relation network classifies by computing the distance between the embedded features of two samples to analyse their degree of match; a prototype network classifies the query-set embedded features by computing their distance to the mean of the support-set embedded features. This method adopts, but is not limited to, a subspace-improved prototype network, which classifies the query-set embedded features by computing their subspace distance to the support-set embedded features, strengthening the discrimination capability and generalization performance of the classifier and raising the utilization of embedded feature information. The implementation steps include the following:
3.1) The source-domain support set $S_s$, source-domain query set $Q_s$, target-domain support set $S_t$ and target-domain query set $Q_t$ are respectively input to the mapping layer and the feature extractor to obtain source-domain support-set embedded features $f_\theta(S_s)$, source-domain query-set embedded features $f_\theta(Q_s)$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$;
3.2) A source-domain subspace is constructed with the source-domain support-set embedded features $f_\theta(S_s)$, and the source-domain subspace loss $l_S$ of the source-domain query-set embedded features $f_\theta(Q_s)$ in the source-domain subspace is calculated:
3.2.1) Singular value decomposition $f_\theta(X_c) = B_c \Sigma_c V_c^T$ is performed on the source-domain support-set embedded features $f_\theta(S_s)$, where $X_c$ denotes class c of the source-domain support set $S_s$, $B_c$ is the matrix of left singular vectors, $V_c^T$ is the matrix of right singular vectors, T denotes the transpose, and $\Sigma_c$ is the singular value matrix;
3.2.2) The first n columns of the left singular vectors $B_c$ are taken to form a truncated matrix $P_c$, and $P_c$ is regarded as the basis of the source-domain subspace;
3.2.3) The projection distance $d_c(q_k)$ of a sample $q_k$ of the source-domain query-set embedded features $f_\theta(Q_s)$ in the source-domain subspace $P_c$ is calculated:

$$d_c(q_k) = -\left\|(I - M_c)\big(f_\theta(q_k) - \mu_c\big)\right\|^2$$

where $M_c = P_c P_c^T$ and $\mu_c$ is the mean of the source-domain support-set embedded features $f_\theta(S_s)$;
3.2.4) The probability that sample $q_k$ of the source-domain query-set embedded features $f_\theta(Q_s)$ belongs to class c is calculated with the Softmax function: $p(y = c \mid q_k) = \frac{\exp(d_c(q_k))}{\sum_{c'} \exp(d_{c'}(q_k))}$;
3.2.5) The projection distance between the i-th basis $P_i$ and the j-th basis $P_j$ of the source-domain subspace is calculated as $\|P_i^T P_j\|_F^2$, which denotes the sum of squares of all elements of the matrix $P_i^T P_j$, $\|\cdot\|_F$ denoting the F-norm;
3.2.6) The source-domain subspace loss function $l_S$ is calculated from the results of steps 3.2.4) and 3.2.5):

$$l_S = -\frac{1}{M_s} \sum_{k=1}^{M_s} \log p(y = y_k \mid q_k) + \lambda_s \sum_{i \ne j} \left\|P_i^T P_j\right\|_F^2$$

where $M_s$ is the number of samples of the source-domain query set $Q_s$ and $\lambda_s$ is a parameter that can be adjusted according to the experiment; in this example $\lambda_s = 0.03$ is selected;
3.3) The source-domain support loss $\hat{l}_S$ of the source-domain support-set embedded features $f_\theta(S_s)$ in the source-domain subspace is calculated by the same method as the source-domain subspace loss $l_S$:
3.3.1) The projection distance $d_c(s_k)$ of a sample $s_k$ of the source-domain support-set embedded features $f_\theta(S_s)$ in the source-domain subspace $P_c$ is calculated:

$$d_c(s_k) = -\left\|(I - M_c)\big(f_\theta(s_k) - \mu_c\big)\right\|^2$$

where $M_c = P_c P_c^T$ and $\mu_c$ is the mean of the source-domain support-set embedded features;
3.3.2) The probability $p(y = c \mid s_k)$ that sample $s_k$ of the source-domain support-set embedded features $f_\theta(S_s)$ belongs to class c is calculated with the Softmax function;
3.3.3) The source-domain support loss is calculated from the results of steps 3.3.1) and 3.3.2):

$$\hat{l}_S = -\frac{1}{\hat{M}_s} \sum_{k=1}^{\hat{M}_s} \log p(y = y_k \mid s_k) + \hat{\lambda}_s \sum_{i \ne j} \left\|P_i^T P_j\right\|_F^2$$

where $\hat{M}_s$ is the number of samples of the source-domain support set $S_s$ and $\hat{\lambda}_s$ is a parameter that can be adjusted experimentally;
3.4) A target-domain subspace is constructed with the target-domain support-set embedded features $f_\theta(S_t)$, and the target-domain subspace loss $l_T$ of the target-domain query-set embedded features $f_\theta(Q_t)$ in the target-domain subspace is calculated by the same method as the source-domain subspace loss $l_S$:
3.4.1) Singular value decomposition $f_\theta(X_d) = B_d \Sigma_d V_d^T$ is performed on the target-domain support-set embedded features $f_\theta(S_t)$, where $X_d$ denotes class d of the target-domain support set $S_t$, $B_d$ is the matrix of left singular vectors, $V_d^T$ is the matrix of right singular vectors, T denotes the transpose, and $\Sigma_d$ is the singular value matrix;
3.4.2) The first n columns of the left singular vectors $B_d$ are taken to form a truncated matrix $P_d$, which can be regarded as the basis of the target-domain subspace;
3.4.3) The projection distance $d_d(q_l)$ of a sample $q_l$ of the target-domain query-set embedded features $f_\theta(Q_t)$ in the target-domain subspace $P_d$ is calculated:

$$d_d(q_l) = -\left\|(I - M_d)\big(f_\theta(q_l) - \mu_d\big)\right\|^2$$

where $M_d = P_d P_d^T$ and $\mu_d$ is the mean of the target-domain support-set embedded features $f_\theta(S_t)$;
3.4.4) The probability $p(y = d \mid q_l)$ that sample $q_l$ of the target-domain query-set embedded features $f_\theta(Q_t)$ belongs to class d is calculated with the Softmax function;
3.4.5) The projection distance between the u-th basis $P_u$ and the v-th basis $P_v$ of the target-domain subspace is calculated as $\|P_u^T P_v\|_F^2$, which denotes the sum of squares of all elements of the matrix $P_u^T P_v$, $\|\cdot\|_F$ denoting the F-norm;
3.4.6) The target-domain subspace loss function $l_T$ is calculated from the results of steps 3.4.4) and 3.4.5):

$$l_T = -\frac{1}{M_t} \sum_{l=1}^{M_t} \log p(y = y_l \mid q_l) + \lambda_t \sum_{u \ne v} \left\|P_u^T P_v\right\|_F^2$$

where $M_t$ is the number of samples of the target-domain query set $Q_t$ and $\lambda_t$ is a parameter that can be adjusted according to the experimental setup; in this example $\lambda_t = 0.03$ is selected but not limiting.
3.5) The target-domain support loss $\hat{l}_T$ of the target-domain support-set embedded features $f_\theta(S_t)$ in the target-domain subspace is calculated by the same method as the source-domain subspace loss $l_S$:
3.5.1) The projection distance $d_d(s_l)$ of a sample $s_l$ of the target-domain support-set embedded features $f_\theta(S_t)$ in the target-domain subspace $P_d$ is calculated:

$$d_d(s_l) = -\left\|(I - M_d)\big(f_\theta(s_l) - \mu_d\big)\right\|^2$$

where $M_d = P_d P_d^T$ and $\mu_d$ is the mean of the target-domain support-set embedded features $f_\theta(S_t)$;
3.5.2) The probability $p(y = d \mid s_l)$ that sample $s_l$ of the target-domain support-set embedded features $f_\theta(S_t)$ belongs to class d is calculated with the Softmax function;
3.5.3) The target-domain support loss is calculated from the results of steps 3.5.1) and 3.5.2):

$$\hat{l}_T = -\frac{1}{\hat{M}_t} \sum_{l=1}^{\hat{M}_t} \log p(y = y_l \mid s_l) + \hat{\lambda}_t \sum_{u \ne v} \left\|P_u^T P_v\right\|_F^2$$

where $\hat{M}_t$ is the number of samples of the target-domain support set $S_t$ and $\hat{\lambda}_t$ is a parameter that can be adjusted according to the experimental setup.
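The subspace construction and projection-distance classification of steps 3.2) to 3.5) can be sketched as below; the same routine serves both domains. Assumptions: each per-class feature matrix is centred before the SVD, n = 2 basis vectors are kept, and episode labels are re-indexed 0..C-1; the helper names are hypothetical:

```python
import torch
import torch.nn.functional as F

def build_subspaces(support_feats, support_labels, n_dims=2):
    """Per-class truncated-SVD bases P_c (d x n) and means mu_c."""
    bases, means = [], []
    for c in torch.unique(support_labels):
        Xc = support_feats[support_labels == c]              # (K, d)
        mu = Xc.mean(dim=0)
        # columns of B are the left singular vectors of the centred class matrix
        B, _, _ = torch.linalg.svd((Xc - mu).T, full_matrices=False)
        bases.append(B[:, :n_dims])                          # truncated matrix P_c
        means.append(mu)
    return torch.stack(bases), torch.stack(means)            # (C, d, n), (C, d)

def subspace_logits(feats, bases, means):
    """d_c(q) = -||(I - P_c P_c^T)(f(q) - mu_c)||^2 for every class c."""
    diff = feats[:, None, :] - means[None, :, :]             # (M, C, d)
    proj = torch.einsum('cdn,mcd->mcn', bases, diff)         # P_c^T diff
    proj = torch.einsum('cdn,mcn->mcd', bases, proj)         # P_c P_c^T diff
    return -((diff - proj) ** 2).sum(-1)                     # (M, C)

def subspace_loss(feats, labels, bases, means, lam=0.03):
    """Cross-entropy over projection distances + basis-overlap regularizer."""
    ce = F.cross_entropy(subspace_logits(feats, bases, means), labels)
    G = torch.einsum('idn,jdm->ijnm', bases, bases)          # P_i^T P_j
    C = bases.shape[0]
    off = (G ** 2).sum((-1, -2)) * (1 - torch.eye(C, device=bases.device))
    return ce + lam * off.sum()
```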
Step 4. Contrastive learning is performed on the source-domain support-set data.
Contrastive learning is a self-supervised learning method; by letting the model learn higher-order features of the similarity and diversity between data without labels, it can further improve classification accuracy. The implementation steps include the following:
4.1) Contrast groups are constructed for support-set contrastive learning, i.e. the source-domain support set $S_s$, which contains K samples for each of C classes, is divided into K groups of C samples each, and any two groups can be regarded as a pair of contrast groups; within a pair of contrast groups, each sample has exactly one positive sample from the same class, and the remaining 2C-2 samples are negative samples. A pair of contrast groups can be expressed as:

$$S_1 = \{x_1, x_3, \ldots, x_{2k-1}, \ldots, x_{2C-1}\}$$
$$S_2 = \{x_2, x_4, \ldots, x_{2k}, \ldots, x_{2C}\}$$

where $x_{2k-1}$ and $x_{2k}$ belong to the same class;
4.2) The contrast groups are respectively input to the mapping layer and the feature extractor $f_\theta$ to obtain contrast-group embedded features $f_\theta(S_1)$ and $f_\theta(S_2)$;
4.3) The noise-contrastive-estimation loss function $l_{i,j}$ of the elements of the contrast-group embedded features $f_\theta(S_1)$ and $f_\theta(S_2)$ is calculated:

$$l_{i,j} = -\log \frac{\exp\big(s(f_\theta(x_i), f_\theta(x_j))/\tau\big)}{\sum_{k=1, k \ne i}^{2C} \exp\big(s(f_\theta(x_i), f_\theta(x_k))/\tau\big)}$$

where $s(f_\theta(x_i), f_\theta(x_j))$ denotes a similarity measure between $f_\theta(x_i)$ and $f_\theta(x_j)$ and τ is a temperature coefficient, selected in this example but not limited to τ = 0.5;
4.4) The final contrastive loss $l_{CL}$ is calculated:

$$l_{CL} = \frac{1}{2C} \sum_{k=1}^{C} \big(l_{2k-1,2k} + l_{2k,2k-1}\big)$$

where $l_{2k-1,2k}$ and $l_{2k,2k-1}$ denote the noise-contrastive-estimation loss functions of the corresponding elements of $f_\theta(S_1)$ and $f_\theta(S_2)$.
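The contrastive loss of step 4 can be sketched as follows, assuming cosine similarity for the measure s(·,·) and τ = 0.5; z1[k] and z2[k] hold the embedded features of the two same-class samples of class k in a contrast pair:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5):
    """NT-Xent style loss: the positive of row i is the same-class row in the
    other group; the remaining 2C-2 rows act as negatives."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)        # (2C, d), unit norm
    sim = z @ z.T / tau                                # cosine similarities / tau
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(mask, float('-inf'))         # exclude k = i from the sum
    C = z1.shape[0]
    targets = torch.cat([torch.arange(C) + C, torch.arange(C)]).to(sim.device)
    return F.cross_entropy(sim, targets)               # mean of l_{i,j} over 2C rows
```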
Step 5. A domain discriminator is constructed.
Because the source-domain data and the target-domain data differ greatly, resolving the domain offset and achieving domain alignment are key problems to be solved urgently. A common approach is transfer learning, i.e. training a model on the source-domain data and then transferring it to the target-domain data for classification, but this requires fine-tuning and involves complex setup and adjustment. This example instead adopts a domain discriminator based on the generative-adversarial idea to project the fusion features of the source domain and the target domain into a common space, forming an overall model to achieve domain alignment and thereby reduce training cost.
referring to fig. 3, the implementation of this step includes the following:
5.1) Five fully connected layers are selected; the first four fully connected layers each have 1024 input features and 1024 output features, and the fifth fully connected layer has 1024 input features and 1 output feature;
5.2) Four identical ReLU nonlinear activation layers are selected, each computing ReLU(x) = max(x, 0);
5.3) Four identical dropout layers are selected, each of which randomly sets 50% of the input data to 0;
5.4) The first fully connected layer, first ReLU nonlinear activation layer and first dropout layer are connected in sequence to form the first fully connected block; the second fully connected layer, second ReLU nonlinear activation layer and second dropout layer are connected in sequence to form the second fully connected block; the third fully connected layer, third ReLU nonlinear activation layer and third dropout layer are connected in sequence to form the third fully connected block; the fourth fully connected layer, fourth ReLU nonlinear activation layer and fourth dropout layer are connected in sequence to form the fourth fully connected block;
5.5) A Softmax activation function $\mathrm{Softmax}(x_i) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$ is selected, where $x_i$ denotes an element of the input x and $\sum_j \exp(x_j)$ denotes the normalization term;
5.6) The first fully connected block, second fully connected block, third fully connected block, fourth fully connected block, fifth fully connected layer and Softmax activation layer are connected in sequence to form the domain discriminator $f_D$.
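A minimal PyTorch sketch of the domain discriminator $f_D$ follows. One assumption is made explicit: the patent lists a Softmax activation after the single-output fifth layer, which over one logit is realised here as a logistic sigmoid giving the probability that a fused feature originates from the source domain:

```python
import torch.nn as nn

def fc_block(dim: int = 1024, p_drop: float = 0.5) -> nn.Sequential:
    # fully connected layer + ReLU + dropout (randomly zeroes 50% of inputs)
    return nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Dropout(p_drop))

class DomainDiscriminator(nn.Module):
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            fc_block(dim), fc_block(dim), fc_block(dim), fc_block(dim),
            nn.Linear(dim, 1),        # fifth fully connected layer
            nn.Sigmoid(),             # stands in for the single-logit Softmax
        )
    def forward(self, h):             # h: (B, 1024) fused features T(h)
        return self.net(h)            # (B, 1) probability of "source domain"
```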
Step 6. The domain discrimination loss $l_D$ of the domain discriminator is calculated.
6.1) The source-domain subspace loss $l_S$, source-domain support loss $\hat{l}_S$, source-domain support-set embedded features $f_\theta(S_s)$ and source-domain query-set embedded features $f_\theta(Q_s)$ are fused to obtain the source-domain fusion feature $T(h_s)$:
6.1.1) The source-domain subspace loss $l_S$ is converted into the source-domain query result $g_{s1}$ with the Softmax function: $g_{s1} = \frac{\exp(l_S^{(i)})}{\sum_j \exp(l_S^{(j)})}$, where $l_S^{(i)}$ denotes an element of $l_S$ and $\sum_j \exp(l_S^{(j)})$ denotes the normalization term;
6.1.2) The source-domain support loss $\hat{l}_S$ is converted into the source-domain support result $g_{s2}$ with the Softmax function: $g_{s2} = \frac{\exp(\hat{l}_S^{(i)})}{\sum_j \exp(\hat{l}_S^{(j)})}$, where $\hat{l}_S^{(i)}$ denotes an element of $\hat{l}_S$ and $\sum_j \exp(\hat{l}_S^{(j)})$ denotes the normalization term;
6.1.3) The source-domain support result $g_{s2}$ and the source-domain query result $g_{s1}$ are connected in the row direction as the source-domain output result $g_s$;
6.1.4) The source-domain support-set embedded features $f_\theta(S_s)$ and the source-domain query-set embedded features $f_\theta(Q_s)$ are connected in the row direction as the source-domain output features $f_s$;
6.1.5) The source-domain output result $g_s$ and the source-domain output features $f_s$ are fused with a multilinear map to obtain the source-domain fusion feature $T(h_s)$:

$$T(h_s) = \frac{1}{\sqrt{d}}\,(R_f f_s) \odot (R_g g_s)$$

where d is the dimension of the fused feature, $d_{f_s}$ and $d_{g_s}$ are the dimensions of $f_s$ and $g_s$ respectively, this random-sampling map approximates the outer product $f_s \otimes g_s$, $\odot$ denotes the element-wise product, and $R_f$ and $R_g$ denote two random matrices that are sampled only once and then kept fixed during the training phase;
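The randomized multilinear fusion can be sketched as follows (a CDAN-style map; the 1024-dimensional output matching the discriminator input is an assumption):

```python
import torch
import torch.nn as nn

class RandomMultilinearMap(nn.Module):
    """T(h) = (R_f f) * (R_g g) / sqrt(d): a randomized approximation of the
    outer product of f and g. R_f and R_g are sampled once and kept fixed."""
    def __init__(self, f_dim: int, g_dim: int, out_dim: int = 1024):
        super().__init__()
        self.register_buffer('Rf', torch.randn(f_dim, out_dim))
        self.register_buffer('Rg', torch.randn(g_dim, out_dim))
        self.scale = out_dim ** 0.5
    def forward(self, f: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        return (f @ self.Rf) * (g @ self.Rg) / self.scale   # (B, out_dim)
```

Registering $R_f$ and $R_g$ as buffers keeps them out of the optimizer, matching the requirement that they be sampled only once and never updated during training.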
6.2) The target-domain subspace loss $l_T$, target-domain support loss $\hat{l}_T$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$ are fused to obtain the target-domain fusion feature $T(h_t)$:
6.2.1) The target-domain subspace loss $l_T$ is converted into the target-domain query result $g_{t1}$ with the Softmax function: $g_{t1} = \frac{\exp(l_T^{(i)})}{\sum_j \exp(l_T^{(j)})}$, where $l_T^{(i)}$ denotes an element of $l_T$ and $\sum_j \exp(l_T^{(j)})$ denotes the normalization term;
6.2.2) The target-domain support loss $\hat{l}_T$ is converted into the target-domain support result $g_{t2}$ with the Softmax function: $g_{t2} = \frac{\exp(\hat{l}_T^{(i)})}{\sum_j \exp(\hat{l}_T^{(j)})}$, where $\hat{l}_T^{(i)}$ denotes an element of $\hat{l}_T$ and $\sum_j \exp(\hat{l}_T^{(j)})$ denotes the normalization term;
6.2.3) The target-domain support result $g_{t2}$ and the target-domain query result $g_{t1}$ are connected in the row direction as the target-domain output result $g_t$;
6.2.4) The target-domain support-set embedded features $f_\theta(S_t)$ and the target-domain query-set embedded features $f_\theta(Q_t)$ are connected in the row direction as the target-domain output features $f_t$;
6.2.5) The target-domain output result $g_t$ and the target-domain output features $f_t$ are fused with a multilinear map into the target-domain fusion feature $T(h_t)$:

$$T(h_t) = \frac{1}{\sqrt{d}}\,(R_f f_t) \odot (R_g g_t)$$

where d is the dimension of the fused feature, $d_{f_t}$ and $d_{g_t}$ are the dimensions of $f_t$ and $g_t$ respectively, this random-sampling map approximates the outer product $f_t \otimes g_t$, $\odot$ denotes the element-wise product, and $R_f$ and $R_g$ denote two random matrices that are sampled only once and then kept fixed during the training phase.
6.3) The source-domain fusion feature $T(h_s)$ and the target-domain fusion feature $T(h_t)$ are jointly input to the domain discriminator $f_D$, in which the domain discrimination loss $l_D$ is calculated:

$$l_D = -\mathbb{E}_{x_s \sim P_s(x)}\big[\log D(T(h_s))\big] - \mathbb{E}_{x_t \sim P_t(x)}\big[\log\big(1 - D(T(h_t))\big)\big]$$

where D and T respectively denote the domain discriminator and the fusion map of the feature extractor, $x_s$, $x_t$ respectively denote a source-domain sample and a target-domain sample, $P_s(x)$, $P_t(x)$ respectively denote the source-domain distribution and the target-domain distribution, $D(T(h_s))$ denotes the probability that the discriminator $f_D$ predicts the source-domain fusion feature $T(h_s)$ to originate from the source domain, $1 - D(T(h_t))$ denotes the probability that the discriminator $f_D$ predicts the target-domain fusion feature $T(h_t)$ to originate from the target domain, and $\mathbb{E}$ denotes the expectation.
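A minimal sketch of this loss with the sigmoid-output discriminator above: source fusion features are labelled 1 and target fusion features 0, so binary cross-entropy reproduces the two expectation terms:

```python
import torch
import torch.nn.functional as F

def domain_discrimination_loss(disc, T_hs: torch.Tensor, T_ht: torch.Tensor):
    p_s = disc(T_hs)   # D(T(h_s)): predicted probability "source domain"
    p_t = disc(T_ht)   # D(T(h_t))
    loss_s = F.binary_cross_entropy(p_s, torch.ones_like(p_s))    # -E[log D(T(h_s))]
    loss_t = F.binary_cross_entropy(p_t, torch.zeros_like(p_t))   # -E[log(1 - D(T(h_t)))]
    return loss_s + loss_t
```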
Step 7. The total loss function of the source domain and the target domain is calculated, and the feature extractor is trained.
7.1) The total loss function composed of the source-domain subspace loss $l_S$, target-domain subspace loss $l_T$, contrastive loss $l_{CL}$ and domain discrimination loss $l_D$ is calculated: $l = \lambda_1 l_S + \lambda_2 l_T + \lambda_3 l_{CL} + l_D$,
where $\lambda_1$, $\lambda_2$, $\lambda_3$ respectively denote the weights of the source-domain subspace loss $l_S$, the target-domain subspace loss $l_T$ and the contrastive loss $l_{CL}$, their values being set according to experiments; in this example $\lambda_1 : \lambda_2 : \lambda_3 = 2 : 0.1 : 1$ is selected but not limiting;
7.2) The gradients of the residual-block and convolution-layer parameters in the feature extractor are calculated, together with the gradients of the fully-connected-layer parameters in the domain discriminator;
7.3) The parameters of the feature extractor and the domain discriminator are updated from the total loss l and the calculated gradients with an optimization algorithm; in this example Adam optimization is selected but not limiting;
7.4) By iteratively performing the total-loss calculation, gradient calculation and parameter updates, the feature extractor and the domain discriminator are continuously adjusted, giving an updated feature extractor $f_\theta'$ and an updated domain discriminator $f_D'$;
7.5) Every E training epochs, the test-set data are classified with the updated feature extractor $f_\theta'$; in this example E = 500 is selected but not limiting;
7.6) Steps 3, 4, 6 and 7.1) to 7.4) are repeated until the total loss l stabilizes or no longer decreases significantly, yielding the trained feature extractor $f_\theta'$.
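One training iteration of step 7 can be sketched by composing the pieces above. This is an illustrative composition under assumptions: the helpers sketched earlier are in scope, episode labels are re-indexed 0..C-1, support samples are ordered so that with K = 2 the even/odd rows form the two contrast groups, the example weights λ1 : λ2 : λ3 = 2 : 0.1 : 1 are hard-coded, and a gradient-reversal layer (common in adversarial implementations) is omitted:

```python
import torch

def train_step(mapper, extractor, disc, fuse_s, fuse_t, opt, episode):
    (Ss, ys), (Qs, yq), (St, yst), (Qt, yqt) = episode      # support/query tensors
    f = lambda x: extractor(mapper(x).unsqueeze(1))         # (B, bands, H, W) -> features

    fSs, fQs, fSt, fQt = f(Ss), f(Qs), f(St), f(Qt)
    P_s, mu_s = build_subspaces(fSs, ys)
    P_t, mu_t = build_subspaces(fSt, yst)
    l_S = subspace_loss(fQs, yq, P_s, mu_s)                 # source subspace loss
    l_T = subspace_loss(fQt, yqt, P_t, mu_t)                # target subspace loss
    l_CL = contrastive_loss(fSs[0::2], fSs[1::2])           # one pair of contrast groups

    # fuse embedded features with softmaxed per-class scores, then discriminate
    g_s = subspace_logits(torch.cat([fSs, fQs]), P_s, mu_s).softmax(dim=1)
    g_t = subspace_logits(torch.cat([fSt, fQt]), P_t, mu_t).softmax(dim=1)
    T_hs = fuse_s(torch.cat([fSs, fQs]), g_s)
    T_ht = fuse_t(torch.cat([fSt, fQt]), g_t)
    l_D = domain_discrimination_loss(disc, T_hs, T_ht)

    loss = 2.0 * l_S + 0.1 * l_T + 1.0 * l_CL + l_D         # total loss l
    opt.zero_grad(); loss.backward(); opt.step()            # Adam in the example
    return loss.item()
```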
Step 8. Classification is performed with the test-set data.
8.1) The training set $T_r$ is input to the mapping layer and the trained feature extractor $f_\theta'$ to obtain training-set embedded features $f_\theta'(T_r)$, from which the training-set subspace is constructed:
8.1.1) Singular value decomposition $f_\theta'(X_e) = B_e \Sigma_e V_e^T$ is performed on the training-set embedded features $f_\theta'(T_r)$, where $X_e$ denotes class e of the training set $T_r$, $B_e$ is the matrix of left singular vectors, $V_e^T$ is the matrix of right singular vectors, T denotes the transpose, and $\Sigma_e$ is the singular value matrix;
8.1.2) The first n columns of the left singular vectors $B_e$ are taken to form a truncated matrix $P_e$, which is regarded as the basis of the training-set subspace; in this example n = 2 is selected but not limiting;
8.2) The test set $T_e$ is input to the mapping layer and the trained feature extractor $f_\theta'$ to obtain test-set embedded features $f_\theta'(T_e)$, and their test subspace loss $l'$ in the training-set subspace is calculated:
8.2.1) The projection distance $d_e(q_m)$ of a sample $q_m$ of the test-set embedded features $f_\theta'(T_e)$ in the training-set subspace $P_e$ is calculated:

$$d_e(q_m) = -\left\|(I - M_e)\big(f_\theta'(q_m) - \mu_e\big)\right\|^2$$

where $M_e = P_e P_e^T$ and $\mu_e$ is the mean of the training-set embedded features $f_\theta'(T_r)$;
8.2.2) The probability $p(y = e \mid q_m)$ that sample $q_m$ of the test-set embedded features $f_\theta'(T_e)$ belongs to class e is calculated with the Softmax function;
8.2.3) The projection distance between the a-th basis $P_a$ and the b-th basis $P_b$ of the training-set subspace is calculated as $\|P_a^T P_b\|_F^2$, which denotes the sum of squares of all elements of the matrix $P_a^T P_b$, $\|\cdot\|_F$ denoting the F-norm;
8.2.4) The test-set subspace loss $l'$ is calculated from the results of steps 8.2.2) and 8.2.3):

$$l' = -\frac{1}{M'} \sum_{m=1}^{M'} \log p(y = e \mid q_m) + \lambda' \sum_{a \ne b} \left\|P_a^T P_b\right\|_F^2$$

where M' is the number of samples of the test set $T_e$ and λ' is a parameter that can be adjusted according to the experiment;
8.3) The class with the smallest test subspace loss $l'$ is taken as the predicted class, completing the classification task.
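Test-time classification in step 8 then reduces to building the training-set subspaces once and assigning each test sample to its nearest class subspace, i.e. the class with the smallest projection-distance loss; a minimal sketch reusing the helpers above:

```python
import torch

@torch.no_grad()
def classify(mapper, extractor, train_x, train_y, test_x):
    f = lambda x: extractor(mapper(x).unsqueeze(1))
    P_e, mu_e = build_subspaces(f(train_x), train_y)   # per-class bases and means
    logits = subspace_logits(f(test_x), P_e, mu_e)     # negative projection distances
    return logits.argmax(dim=1)   # index into the sorted unique training labels
```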
The step numbers are used to illustrate the implementation scheme of the invention clearly; they do not limit the order of the steps.
The technical effects of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions and content:
the running environment of the simulation experiment is: windows 10 professional 64-bit operating system, CPU is Intel (R) Core (TM) i5-12490F 3.00GHz, memory is 16GB, graphics card is NVIDIA GeForce RTX 3060, and the operating environment is Cuda11.8, python 3.8 and pytorch 2.0.0.
The hyperspectral image data sets used in the simulation experiments comprise one source-domain data set and three target-domain data sets:
the source-domain data set is the Chikusei data set acquired with the Hyperspec-VNIR-C visible/near-infrared camera over Chikusei, Japan;
the target-domain data sets are the Indian Pines data set collected at the Indian Pines test site in northwestern Indiana, the Salinas data set collected over Salinas Valley, California, USA, and the Pavia University data set captured over the University of Pavia in northern Italy with the ROSIS hyperspectral remote sensor.
The Chikusei data set has an image size of 2517×2335 and 128 spectral bands, containing 19 classes of ground objects; the class and quantity of each class are shown in Table 1.
The Indian Pines data set has an image size of 145×145 and 200 spectral bands, containing 16 classes of ground objects; the class and quantity of each class are shown in Table 2.
The Pavia University data set has an image size of 610×340 and 103 spectral bands, containing 9 classes of ground objects; the class and quantity of each class are shown in Table 3.
The Salinas data set has an image size of 512×217 and 204 spectral bands, containing 16 classes of ground objects; the class and quantity of each class are shown in Table 4.
TABLE 1 Chikusei sample classes and quantities
TABLE 2 Indian Pines sample classes and quantities

| Class label | Ground object category | Quantity |
|---|---|---|
| 1 | Alfalfa | 46 |
| 2 | Corn-notill | 1428 |
| 3 | Corn-mintill | 830 |
| 4 | Corn | 237 |
| 5 | Grass-pasture | 483 |
| 6 | Grass-trees | 730 |
| 7 | Grass-pasture-mowed | 28 |
| 8 | Hay-windrowed | 478 |
| 9 | Oats | 20 |
| 10 | Soybean-notill | 972 |
| 11 | Soybean-mintill | 2455 |
| 12 | Soybean-clean | 593 |
| 13 | Wheat | 205 |
| 14 | Woods | 1265 |
| 15 | Buildings-grass-trees-drives | 386 |
| 16 | Stone-steel-towers | 93 |
Table 3 Pavia University sample classes and quantities

| Class label | Ground object category | Quantity |
|---|---|---|
| 1 | Asphalt | 6631 |
| 2 | Meadows | 18649 |
| 3 | Gravel | 2099 |
| 4 | Trees | 3064 |
| 5 | Sheets | 1345 |
| 6 | Bare soil | 5029 |
| 7 | Bitumen | 1330 |
| 8 | Bricks | 3682 |
| 9 | Shadows | 947 |
TABLE 4 Salinas sample classes and quantities

| Class label | Ground object category | Quantity |
|---|---|---|
| 1 | Brocoli_green_weeds_1 | 2009 |
| 2 | Brocoli_green_weeds_2 | 3726 |
| 3 | Fallow | 1976 |
| 4 | Fallow_rough_plow | 1394 |
| 5 | Fallow_smooth | 2678 |
| 6 | Stubble | 3959 |
| 7 | Celery | 3579 |
| 8 | Grapes_untrained | 11271 |
| 9 | Soil_vinyard_develop | 6203 |
| 10 | Corn_senesced_green_weeds | 3278 |
| 11 | Lettuce_romaine_4wk | 1068 |
| 12 | Lettuce_romaine_5wk | 1927 |
| 13 | Lettuce_romaine_6wk | 916 |
| 14 | Lettuce_romaine_7wk | 1070 |
| 15 | Vinyard_untrained | 7268 |
| 16 | Vinyard_vertical_trellis | 1807 |
2. Simulation content and result analysis:
simulation 1. Classification was performed on hyperspectral images described in Table 2 using the present invention and three methods RN-FSC, DCFSL, gia-CFSL, respectively, and the respective overall classification accuracy OA, average classification accuracy AA and kappa coefficients were compared, and the results are shown in Table 5.
TABLE 5 Comparison of the results of three existing methods and the present invention on Indian Pines
Simulation 2. The hyperspectral image described in Table 3 was classified with the present invention and the three existing methods RN-FSC, DCFSL and Gia-CFSL respectively, and the overall classification accuracy OA, average classification accuracy AA and Kappa coefficient of each were compared; the results are shown in Table 6.
Table 6 Comparison of the results of three existing methods and the present invention on Pavia University
Simulation 3. The hyperspectral image described in Table 4 was classified with the present invention and the three existing methods RN-FSC, DCFSL and Gia-CFSL respectively, and the overall classification accuracy OA, average classification accuracy AA and Kappa coefficient of each were compared; the results are shown in Table 7.
Table 7 Comparison of the results of three existing methods and the present invention on Salinas
Simulation 4. The running time required to classify the hyperspectral images described in Tables 2, 3 and 4 with the present invention and the three existing methods RN-FSC, DCFSL and Gia-CFSL respectively was compared; the results are shown in Table 8.
TABLE 8 Comparison of the running time (s) required by three existing methods and the present invention
The three prior art sources in tables 5, 6, 7, 8 are as follows:
the RN-FSC refers to a relational network-based method RN-FSC proposed by Gao et al in Deep relation network for hyperspectral image few-shot classification, remote sens.vol.12, no.6, p.923, mar.2020 ".
The DCFSL refers to a method DCFSL based on condition-based domain adaptation proposed by Li et al in Deep cross-domain few-shot learning for hyperspectral image classification, IEEE Trans. Geosci. Remote sens, vol.60, pp.1-18,2021.
The Gia-CFSL refers to a graph information aggregation-based method Gia-CFSL as proposed by Zhang et al in Graph information aggregation cross-domain few-shot learning for hyperspectral image classification, "IEEE Trans.Neal NetworkLearn. Syst., early access, jun.30, 2022".
It can be seen from Tables 5, 6 and 7 that the overall classification accuracy and the average classification accuracy of the method of the present invention are higher than those of the existing methods on all three data sets, and its Kappa consistency is also higher than that of the existing methods except on Salinas.
As can be seen from Table 8, the time required by the present invention for classification on the three data sets is relatively low compared with the existing methods.
The simulation results show that the adopted subspace classification not only improves the discrimination capability and generalization performance of the classifier but also raises the utilization of embedded feature information; that projecting the fusion features of the source domain and the target domain into a common space with the domain discriminator achieves domain alignment and reduces training cost; and that training the source-domain support-set data with contrastive learning further improves the classification accuracy of cross-domain small samples.

Claims (10)

1. A small-sample cross-domain hyperspectral image classification method based on contrastive learning and subspaces, characterized by comprising:
(1) Acquiring a hyperspectral data set and dividing:
(1a) G hyperspectral data sets are obtained from a public website, one data set is used as a source domain, and the remaining G-1 data sets are used as target domains;
(1b) C categories are selected from the source-domain data set, and K source-domain samples per category are selected to form a source-domain support set $S_s$; then N further source-domain samples per category are selected to form a source-domain query set $Q_s$;
(1c) H labelled categories are selected from the target-domain data set, and K labelled target-domain samples per category are selected to form a target-domain support set $S_t$; then N further labelled target-domain samples per category are selected to form a target-domain query set $Q_t$; the target-domain support set $S_t$ and the target-domain query set $Q_t$ together form a training set $T_r$; the remaining unlabelled target-domain samples are taken as a test set $T_e$;
(2) A mapping layer comprising a two-dimensional convolution layer and a BatchNorm2d layer is constructed; a feature extractor $f_\theta$ comprising two residual blocks, two max-pooling layers, one convolution layer and one flatten layer is constructed;
(3) Subspace classification is carried out on embedded features of a source domain and a target domain:
(3a) The source-domain support set $S_s$, source-domain query set $Q_s$, target-domain support set $S_t$ and target-domain query set $Q_t$ are respectively input to the mapping layer and the feature extractor $f_\theta$ to obtain source-domain support-set embedded features $f_\theta(S_s)$, source-domain query-set embedded features $f_\theta(Q_s)$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$;
(3b) A source-domain subspace is constructed with the source-domain support-set embedded features $f_\theta(S_s)$, and the source-domain subspace loss $l_S$ of the source-domain query-set embedded features $f_\theta(Q_s)$ in the source-domain subspace is calculated;
(3c) The source-domain support loss $\hat{l}_S$ of the source-domain support-set embedded features $f_\theta(S_s)$ in the source-domain subspace is calculated;
(3d) A target-domain subspace is constructed with the target-domain support-set embedded features $f_\theta(S_t)$, and the target-domain subspace loss $l_T$ of the target-domain query-set embedded features $f_\theta(Q_t)$ in the target-domain subspace is calculated;
(3e) The target-domain support loss $\hat{l}_T$ of the target-domain support-set embedded features $f_\theta(S_t)$ in the target-domain subspace is calculated;
(4) The source-domain support set $S_s$ is divided into contrast groups of C samples each, which are input to the mapping layer and the feature extractor to calculate the contrastive loss $l_{CL}$;
(5) A domain discriminator $f_D$ comprising five fully connected layers, four ReLU nonlinear activation functions, four dropout layers and a Softmax activation function is constructed;
(6) The domain discrimination loss $l_D$ is calculated:
(6a) The source-domain subspace loss $l_S$, source-domain support loss $\hat{l}_S$, source-domain support-set embedded features $f_\theta(S_s)$ and source-domain query-set embedded features $f_\theta(Q_s)$ are fused to obtain a source-domain fusion feature $T(h_s)$;
(6b) The target-domain subspace loss $l_T$, target-domain support loss $\hat{l}_T$, target-domain support-set embedded features $f_\theta(S_t)$ and target-domain query-set embedded features $f_\theta(Q_t)$ are fused to obtain a target-domain fusion feature $T(h_t)$;
(6c) The source-domain and target-domain fusion features $T(h_s)$, $T(h_t)$ are input to the domain discriminator $f_D$ to calculate the domain discrimination loss $l_D$:

$$l_D = -\mathbb{E}_{x_s \sim P_s(x)}\big[\log D(T(h_s))\big] - \mathbb{E}_{x_t \sim P_t(x)}\big[\log\big(1 - D(T(h_t))\big)\big]$$

where D and T respectively denote the domain discriminator and the fusion map of the feature extractor, $x_s$, $x_t$ respectively denote a source-domain sample and a target-domain sample, $P_s(x)$, $P_t(x)$ respectively denote the source-domain distribution and the target-domain distribution, $D(T(h_s))$ denotes the probability that the discriminator $f_D$ predicts the source-domain fusion feature $T(h_s)$ to originate from the source domain, $1 - D(T(h_t))$ denotes the probability that the discriminator $f_D$ predicts the target-domain fusion feature $T(h_t)$ to originate from the target domain, and $\mathbb{E}$ denotes the expectation;
(7) Computing a total loss function to train the feature extractor:
The total loss function composed of the source-domain subspace loss $l_S$, target-domain subspace loss $l_T$, contrastive loss $l_{CL}$ and domain discrimination loss $l_D$ is calculated: $l = \lambda_1 l_S + \lambda_2 l_T + \lambda_3 l_{CL} + l_D$, where $\lambda_1$, $\lambda_2$, $\lambda_3$ are the weights of $l_S$, $l_T$, $l_{CL}$ respectively;
The feature extractor $f_\theta$ is iteratively trained by back propagation until the total loss function converges, giving a trained feature extractor $f_\theta'$;
(8) Classifying by using test set data:
(8a) The training set T_r is input to the mapping layer and the trained feature extractor f_θ' to obtain the training set embedding feature f_θ'(T_r), and the feature f_θ'(T_r) is used to construct a training set subspace;
(8b) The test set T_e is input to the mapping layer and the trained feature extractor f_θ' to obtain the test set embedding feature f_θ'(T_e), and its test subspace loss l' in the training set subspace is calculated;
(8c) The category with the smallest test subspace loss l' is taken as the predicted category, completing the classification task.
2. The method according to claim 1, characterized in that: the mapping layer in step (2) is composed of a two-dimensional convolution layer and a BatchNorm2d layer, with the following structural parameters:
the two-dimensional convolution layer has an input channel number equal to the dimensionality of the input data, an output channel number of 100, a convolution kernel size of 1×1, and a stride of 1;
the BatchNorm2d layer performs two-dimensional batch normalization according to the formula y = ((x − mean(x)) / √(var(x) + eps)) · γ + β, where x denotes the input data, mean(x) the mean, var(x) the variance, eps a small constant, and γ, β two learnable adjustment parameters.
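A minimal PyTorch sketch consistent with this claim (the class name is an illustration; channel counts follow the claim):

```python
import torch.nn as nn

class MappingLayer(nn.Module):
    """1x1 Conv2d (input channels -> 100, stride 1) followed by BatchNorm2d,
    which applies y = (x - mean(x)) / sqrt(var(x) + eps) * gamma + beta."""

    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 100, kernel_size=1, stride=1)
        self.bn = nn.BatchNorm2d(100)

    def forward(self, x):            # x: (batch, in_channels, H, W)
        return self.bn(self.conv(x))
```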
3. The method according to claim 1, characterized in that: the feature extractor f_θ constructed in step (2) comprises two residual blocks, two max-pooling layers, and one convolution layer, with the following structural parameters:
the feature extractor f_θ is formed by sequentially connecting the first residual block, the first pooling layer, the second residual block, the second pooling layer, and the convolution layer;
the two residual blocks have the same structure, each comprising 3 convolution blocks and 3 ReLU layers: the first convolution block, first ReLU layer, second convolution block, second ReLU layer, third convolution block, and third ReLU layer are sequentially connected, and the output of the first ReLU layer is connected to the third convolution block by a residual connection; each convolution block comprises a 3D convolution layer and a BatchNorm3d layer; the convolution kernel of the 3D convolution layer has a size of 3×3×3, a stride of 1, and a padding of 1; each ReLU layer computes ReLU(x) = max(x, 0);
the two max-pooling layers have different parameters: the first pooling layer has a pooling kernel size of 4×2, a padding of 0×1, and a stride of 4×2; the second pooling layer has a pooling kernel size of 4×2, a padding of 4×2, and a stride of 2×1;
the convolution layer has 32 output channels and a convolution kernel size of 3×3×3.
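A minimal PyTorch sketch of this structure; the channel count and the 3D pooling shapes below are assumptions (the extracted claim gives pooling sizes in degraded form), so the claim's exact values should be substituted:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """3D conv block: Conv3d (kernel 3x3x3, stride 1, padding 1) + BatchNorm3d."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm3d(channels)

    def forward(self, x):
        return self.bn(self.conv(x))

class ResidualBlock(nn.Module):
    """Three conv blocks with ReLUs; the first ReLU's output skips to the
    third conv block's output (the residual connection of this claim)."""
    def __init__(self, channels):
        super().__init__()
        self.b1 = ConvBlock(channels)
        self.b2 = ConvBlock(channels)
        self.b3 = ConvBlock(channels)

    def forward(self, x):
        h = torch.relu(self.b1(x))         # first conv block + first ReLU
        y = torch.relu(self.b2(h))         # second conv block + second ReLU
        return torch.relu(self.b3(y) + h)  # residual add, then third ReLU

class FeatureExtractor(nn.Module):
    """f_theta sketch: residual block -> max-pool -> residual block -> max-pool
    -> Conv3d(..., 32, kernel 3x3x3). Pooling shapes here are placeholders."""
    def __init__(self, channels=1):
        super().__init__()
        self.res1 = ResidualBlock(channels)
        self.pool1 = nn.MaxPool3d(kernel_size=(4, 2, 2), stride=(4, 2, 2), padding=(0, 1, 1))
        self.res2 = ResidualBlock(channels)
        self.pool2 = nn.MaxPool3d(kernel_size=(4, 2, 2), stride=(2, 1, 1), padding=(2, 1, 1))
        self.conv = nn.Conv3d(channels, 32, kernel_size=3)

    def forward(self, x):                  # x: (batch, C, bands, H, W)
        x = self.pool1(self.res1(x))
        x = self.pool2(self.res2(x))
        return torch.flatten(self.conv(x), start_dim=1)
```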
4. The method according to claim 1, characterized in that: in step (3b), constructing the source domain subspace from the source domain support set embedding feature f_θ(S_s) and calculating the source domain subspace loss l_S of the source domain query set embedding feature f_θ(Q_s) in the source domain subspace comprises the following steps:
(3b1) Singular value decomposition is performed on the source domain support set embedding feature f_θ(S_s): f_θ(X_c) = B_c Σ_c V_c^T, where X_c denotes class c of the source domain support set S_s, B_c is called the left singular vector matrix, V_c^T is called the right singular vector matrix, T denotes the transpose, and Σ_c is the singular value matrix;
(3b2) The first n dimensions (columns) of the left singular vector matrix B_c are selected to form the truncated matrix P_c, and P_c is regarded as the basis of the source domain subspace;
(3b3) The projection distance d_c(q_k) of sample q_k of the source domain query set embedding feature f_θ(Q_s) in the source domain subspace P_c is calculated:
d_c(q_k) = −‖(I − M_c)(f_θ(q_k) − μ_c)‖²,
where μ_c is the mean of the source domain support set embedding features f_θ(S_s);
(3b4) The probability that sample q_k of the source domain query set embedding feature f_θ(Q_s) belongs to class c is calculated using the Softmax function;
(3b5) The projection distance between the i-th basis P_i and the j-th basis P_j of the source domain subspace is calculated,
where ‖P_i^T P_j‖_F² denotes the sum of squares of all elements of the matrix P_i^T P_j and ‖·‖_F denotes the Frobenius norm;
(3b6) The source domain subspace loss function l_S is calculated from the results of (3b4) and (3b5),
where M_s is the number of samples in the source domain query set Q_s and λ_s is an experimentally adjustable parameter.
5. The method according to claim 1, characterized in that: in step (4), dividing the source domain support set S_s into contrast groups and calculating the contrast loss l_CL of the contrast groups input to the mapping layer and the feature extractor comprises the following steps:
(4a) The source domain support set S_s, containing C classes with K samples per class, is divided into K groups of C samples each, where any two groups can be regarded as a contrast group in which each sample has exactly one positive sample from the same class and the remaining 2C − 2 samples are negative samples; the two groups of a contrast group can be expressed as S_1 and S_2:
S_1 = {x_1, x_3, …, x_{2k−1}, …, x_{2C−1}}
S_2 = {x_2, x_4, …, x_{2k}, …, x_{2C}}
where x_{2k−1} and x_{2k} belong to the same class;
(4b) The contrast groups are respectively input into the feature extractor f_θ to obtain the contrast group embedding features f_θ(S_1) and f_θ(S_2);
(4c) The noise contrastive estimation loss function l_{i,j} for elements of the contrast group embedding features f_θ(S_1) and f_θ(S_2) is calculated,
where s(f_θ(x_i), f_θ(x_j)) denotes the similarity measure between f_θ(x_i) and f_θ(x_j), and τ is a temperature coefficient;
(4d) The final contrast loss l_CL is calculated,
where l_{2k−1,2k} denotes the noise contrastive estimation loss of element x_{2k−1} of f_θ(S_1) with respect to x_{2k} of f_θ(S_2), and l_{2k,2k−1} denotes the noise contrastive estimation loss of x_{2k} with respect to x_{2k−1}.
6. The method according to claim 1, characterized in that: the domain discriminator f_D constructed in step (5) comprises five fully connected layers, four ReLU nonlinear activation layers, four dropout layers, and one Softmax activation layer, with the following structural parameters:
the domain discriminator f_D is formed by sequentially connecting a first fully connected block, a second fully connected block, a third fully connected block, a fourth fully connected block, a fifth fully connected layer, and the Softmax activation layer, where each of the four fully connected blocks consists of a fully connected layer, a ReLU nonlinear activation layer, and a dropout layer connected in sequence;
the fully connected layers in the first four blocks each have 1024 input features and 1024 output features; the fifth fully connected layer has 1024 input features and 1 output feature;
the four ReLU nonlinear activation layers are identical, each computing ReLU(x) = max(x, 0);
the four dropout layers are identical, randomly setting part of the input features to 0, with the dropout probability set to 0.5;
the Softmax activation function is Softmax(x_i) = exp(x_i) / Σ_j exp(x_j), where x_i denotes an element of the input x and Σ_j exp(x_j) denotes the normalization term.
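A minimal PyTorch sketch of this structure. Note one implementation assumption: the claim names a Softmax activation, but with a single output unit a Sigmoid is used here so the output is a usable domain probability:

```python
import torch.nn as nn

class DomainDiscriminator(nn.Module):
    """f_D sketch: four blocks of Linear(1024->1024) + ReLU + Dropout(0.5),
    then Linear(1024->1) and a Sigmoid (stand-in for the claim's Softmax)."""

    def __init__(self, in_features=1024):
        super().__init__()
        layers = []
        for _ in range(4):
            layers += [nn.Linear(in_features, 1024), nn.ReLU(), nn.Dropout(p=0.5)]
            in_features = 1024
        layers += [nn.Linear(1024, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):               # x: (batch, 1024) fusion features
        return self.net(x).squeeze(-1)  # probability of the source domain
```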
7. The method according to claim 1, characterized in that: in step (6a), fusing the source domain subspace loss l_S, the source domain support set embedding feature f_θ(S_s), and the source domain query set embedding feature f_θ(Q_s) into the source domain fusion feature T(h_s) comprises the following steps:
(6a1) The source domain subspace loss l_S is converted into the source domain query result g_{s1} using the Softmax function,
where each element of l_S is exponentiated and divided by the Softmax normalization term;
(6a2) The source domain support set embedding feature f_θ(S_s) is projected onto the source domain subspace to calculate its loss, and the Softmax function is applied to obtain the source domain support result g_{s2};
(6a3) The source domain support result g_{s2} and the source domain query result g_{s1} are concatenated along the row direction to form the source domain output result g_s;
(6a4) The source domain support set embedding feature f_θ(S_s) and the source domain query set embedding feature f_θ(Q_s) are concatenated along the row direction to form the source domain output feature f_s;
(6a5) A multilinear mapping is adopted to fuse the source domain output result g_s and the source domain output feature f_s into the source domain fusion feature T(h_s),
where the two dimension symbols denote the dimensions of f_s and g_s, respectively, and the outer product of f_s and g_s is taken; in the randomized-sampling variant, ⊙ denotes the element-wise product, and the two random matrices are sampled only once and then kept fixed during the training phase.
8. The method according to claim 1, characterized in that: in step (6b), fusing the target domain subspace loss l_T, the target domain support set embedding feature f_θ(S_t), and the target domain query set embedding feature f_θ(Q_t) into the target domain fusion feature T(h_t) comprises the following steps:
(6b1) The target domain subspace loss l_T is converted into the target domain query result g_{t1} using the Softmax function,
where each element of l_T is exponentiated and divided by the Softmax normalization term;
(6b2) The target domain support set embedding feature f_θ(S_t) is projected onto the target domain subspace to calculate its loss, and the Softmax function is applied to obtain the target domain support result g_{t2};
(6b3) The target domain support result g_{t2} and the target domain query result g_{t1} are concatenated along the row direction to form the target domain output result g_t;
(6b4) The target domain support set embedding feature f_θ(S_t) and the target domain query set embedding feature f_θ(Q_t) are concatenated along the row direction to form the target domain output feature f_t;
(6b5) A multilinear mapping is adopted to fuse the target domain output result g_t and the target domain output feature f_t into the target domain fusion feature T(h_t),
where the two dimension symbols denote the dimensions of f_t and g_t, respectively, and the outer product of f_t and g_t is taken; in the randomized-sampling variant, ⊙ denotes the element-wise product, and the two random matrices are sampled only once and then kept fixed during the training phase.
9. The method according to claim 1, characterized in that: in step (8a), constructing the training set subspace from the training set embedding feature f_θ'(T_r) comprises the following steps:
(8a1) Singular value decomposition is performed on the training set embedding feature f_θ'(T_r): f_θ'(X_e) = B_e Σ_e V_e^T, where X_e denotes class e of the training set T_r, B_e is called the left singular vector matrix, V_e^T is called the right singular vector matrix, T denotes the transpose, and Σ_e is the singular value matrix;
(8a2) The first n dimensions (columns) of the left singular vector matrix B_e are selected to form the truncated matrix P_e, and P_e is regarded as the basis of the training set subspace.
10. The method according to claim 1, characterized in that: in step (8b), calculating the test subspace loss l' of the test set embedding feature f_θ'(T_e) in the training set subspace comprises the following steps:
(8b1) The projection distance d_e(q_m) of sample q_m of the test set embedding feature f_θ'(T_e) in the training set subspace P_e is calculated:
d_e(q_m) = −‖(I − M_e)(f_θ'(q_m) − μ_e)‖²,
where μ_e is the mean of the training set embedding feature f_θ'(T_r);
(8b2) The probability that sample q_m of the test set embedding feature f_θ'(T_e) belongs to class e is calculated using the Softmax function;
(8b3) The projection distance between the a-th basis P_a and the b-th basis P_b of the training set subspace is calculated,
where ‖P_a^T P_b‖_F² denotes the sum of squares of all elements of the matrix P_a^T P_b and ‖·‖_F denotes the Frobenius norm;
(8b4) The test set subspace loss l' is calculated from the results of (8b2) and (8b3),
where M' is the number of samples in the test set T_e and λ' is an experimentally adjustable parameter.
CN202310979029.8A 2023-08-04 2023-08-04 Small sample cross-domain hyperspectral image classification method based on contrast learning and subspace Pending CN116977744A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310979029.8A CN116977744A (en) 2023-08-04 2023-08-04 Small sample cross-domain hyperspectral image classification method based on contrast learning and subspace

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310979029.8A CN116977744A (en) 2023-08-04 2023-08-04 Small sample cross-domain hyperspectral image classification method based on contrast learning and subspace

Publications (1)

Publication Number Publication Date
CN116977744A true CN116977744A (en) 2023-10-31

Family

ID=88483003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310979029.8A Pending CN116977744A (en) 2023-08-04 2023-08-04 Small sample cross-domain hyperspectral image classification method based on contrast learning and subspace

Country Status (1)

Country Link
CN (1) CN116977744A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118097315A (en) * 2024-04-26 2024-05-28 中国科学技术大学 Small sample image classification method based on contrast learning

Similar Documents

Publication Publication Date Title
CN108388927B (en) Small sample polarization SAR terrain classification method based on deep convolution twin network
WO2022160771A1 (en) Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model
CN105138973B (en) The method and apparatus of face authentication
CN108846426B (en) Polarization SAR classification method based on deep bidirectional LSTM twin network
CN108280396B (en) Hyperspectral image classification method based on depth multi-feature active migration network
CN105608471A (en) Robust transductive label estimation and data classification method and system
CN111783865B (en) Hyperspectral classification method based on space spectrum neighborhood embedding and optimal similarity graph
CN111832414A (en) Animal counting method based on graph regular optical flow attention network
CN116977744A (en) Small sample cross-domain hyperspectral image classification method based on contrast learning and subspace
Lin et al. Deep transfer learning for hyperspectral image classification
Tanwar et al. Deep learning-based hybrid model for severity prediction of leaf smut sugarcane infection
CN115311502A (en) Remote sensing image small sample scene classification method based on multi-scale double-flow architecture
CN114937173A (en) Hyperspectral image rapid classification method based on dynamic graph convolution network
AU2015218184B2 (en) Processing hyperspectral or multispectral image data
Sehree et al. Olive trees cases classification based on deep convolutional neural network from unmanned aerial vehicle imagery
Sjahputera et al. Clustering of detected changes in high-resolution satellite imagery using a stabilized competitive agglomeration algorithm
Song et al. Multi-source remote sensing image classification based on two-channel densely connected convolutional networks.
CN113627240B (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model
Mei et al. Cascade residual capsule network for hyperspectral image classification
Farooque et al. Red-green-blue to normalized difference vegetation index translation: a robust and inexpensive approach for vegetation monitoring using machine vision and generative adversarial networks
Sreedevi et al. Development of weighted ensemble transfer learning for tomato leaf disease classification solving low resolution problems
Alzhanov et al. Crop classification using UAV multispectral images with gray-level co-occurrence matrix features
CN117079017A (en) Credible small sample image identification and classification method
Huang et al. Hyperspectral Image Classification via Cross-Domain Few-Shot Learning With Kernel Triplet Loss
CN113836996B (en) Hyperspectral remote sensing image part migration method and hyperspectral remote sensing image part migration system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination