CN115471739A - Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning - Google Patents
- Publication number
- CN115471739A (application CN202210927707.1A / CN202210927707A)
- Authority
- CN
- China
- Prior art keywords
- domain
- image
- target
- target domain
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/7753—Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention relates to a cross-domain remote sensing scene classification and retrieval method based on self-supervised contrast learning, which comprises the following steps. a) Acquire remote sensing images and construct input data. b) Construct a loss function based on self-supervised contrast learning that combines the known and unknown classes of the target domain images, construct a deep domain adaptive learning network, and train the deep domain adaptive learning network with the input data and the loss function. c) Classify the target domain images with the deep domain adaptive learning network; extract target image feature vectors of the target domain images to construct a feature database; extract the query image feature vector of a target domain query image; calculate the Euclidean distances between the query image feature vector and the target image feature vectors in the feature database; and select the required retrieval targets based on these distances. The cross-domain remote sensing scene classification and retrieval method based on self-supervised contrast learning retains good cross-domain classification and retrieval precision even when unknown classes exist in the target domain.
Description
Technical Field
The invention relates to the technical field of optical remote sensing image retrieval, in particular to a cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning.
Background
In recent years, advances in Earth observation technology have provided ever more high-resolution remote sensing images, bringing great opportunities to the remote sensing field and strongly promoting the application of remote sensing images in different domains. Remote sensing scene classification and retrieval are basic tasks in remote sensing image interpretation: they allow remote sensing image data to be understood and managed quickly and accurately, and they play an important role in fields such as environmental monitoring, land use and visual navigation.
Deep convolutional neural networks (CNNs) developed in recent years have strong feature-fitting capability and show great superiority in remote sensing scene classification and retrieval tasks. The general procedure is to first fine-tune a backbone network pre-trained on a general large-scale image data set (such as ImageNet) on a remote sensing image data set, and then extract the activation output of the network as the image feature representation for retrieval or classification.
However, most existing CNN-based methods are supervised and usually assume that the training set and the test set share the same data distribution. In practical applications, differences in imaging conditions such as sensors, shooting angles and shooting weather cause features of the same class to differ greatly across data distributions, a phenomenon known as domain shift. When domain shift exists between the training set and the test set, the model generalizes poorly to the test set, and re-labelling the test set is time-consuming, labour-intensive and impractical. In addition, most existing domain adaptation methods for remote sensing scene classification or retrieval are proposed for closed-set scenarios, that is, the target domain and the source domain share the same label space. In complex real scenes this assumption is easily violated: the classes of the source domain are often incomplete, the source domain cannot cover all classes, and the target domain may contain class samples that the source domain does not share.
In view of the above, it is necessary to design a cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning.
Disclosure of Invention
The invention aims to provide a cross-domain remote sensing scene classification and retrieval method based on self-supervised contrast learning that retains good classification and retrieval precision on the target domain even when the target domain contains unknown classes.
In order to solve the technical problem, the invention provides a cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning, which comprises the following steps:
a) Acquiring a remote sensing image, and dividing a source domain image and a target domain image of the remote sensing image to construct input data;
b) Constructing a loss function based on self-supervision contrast learning and combined with the known class and the unknown class of the target domain image, constructing a depth domain adaptive learning network, and training the depth domain adaptive learning network by using the input data and the loss function;
c) Classifying the target domain images by using the trained depth domain adaptive learning network, extracting target image feature vectors of the target domain images to construct a feature database, extracting query image feature vectors of the target domain query images, calculating Euclidean distances between the query image feature vectors and all the target image feature vectors in the feature database, arranging according to the Euclidean distances, and obtaining the required retrieval target according to a set Euclidean distance range.
Further, the step of constructing the input data comprises: extracting a number of images {1, 2, …, N} from the remote sensing image data set to construct the source domain D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s}, which comprises n_s labelled source domain images x_i^s with corresponding labels y_i^s ∈ {1, 2, …, C}, where {1, 2, …, C} is the label space of the labelled source domain images and C is their total number of categories; the target domain is D_t = {x_j^t}_{j=1}^{n_t}, which comprises n_t unlabelled target domain images x_j^t, wherein the label space of the target domain images is {1, 2, …, C+1}, with C+1 representing the unknown class of the unlabelled target domain images.
Further, the deep domain adaptive learning network comprises a plurality of feature coding networks f(·), a plurality of contrast learning networks g(·) and a plurality of classifiers c(·).
Further, the feature coding network f(·) is a deep residual network with the fully connected layer removed, and its average pooling layer is replaced by a bottleneck layer.
Further, the contrast learning network g(·) is a perceptron with a ReLU (rectified linear unit) activation function.
Further, the classifier c(·) is a fully connected network whose output dimension matches the number of classes of the target domain image.
Further, the constructing step of the loss function includes:
b11 Construct source domain classification loss: and carrying out supervised learning on the source domain image, and calculating the classification accuracy by adopting cross entropy loss:
wherein L is softmax In order to classify the function of the loss,a source domain annotated image representing the source domain imageTrue class distribution, functionA source domain weakly enhanced sample class probability distribution representing the classifier output,a collection of label exemplars representing an annotated image in a source domain;
b12 Construct the self-supervised contrast loss: constructing a target domain strong enhancement sample of the target domain imageAnd target domain weakly enhanced samplesTo calculate the contrast loss L ssl :
Wherein sim (-) is the similarity measure function, θ is the scaling factor, A ∈ {0,1} is an indication function for evaluating whether k equals j, B represents the number of samples selected by one training;
b13 Construct a known class identification penalty as:
where μ represents the proportion of samples within a training run that meet the selection requirements for a known class threshold, H (-) represents the cross entropy loss,for weakly enhancing samples from the target domainThe collected set of the target domain known pseudo labels obtained through screening, ind represents the target domain weak enhancement sampleThe category to which the known pseudo label belongs after screening, and ind epsilon {1,2, …, C },representing strongly enhanced samples of the target domainIs determined based on the predicted class probability distribution of (c),a collection of labeled exemplars representing strongly enhanced exemplars of the target domain;
b14 Construct unknown class identification loss: consistency classification loss L of unknown class identification loss as high-confidence unknown class sample unknown :
Wherein, the first and the second end of the pipe are connected with each other,for weakly enhancing samples from the target domainScreening the obtained collection of the unknown pseudo labels of the target domain,representing strongly enhanced samples of the target domainA predicted class probability distribution of (a);
b15) The constructed total loss function L is:
L = L_softmax + α·L_ssl + β·L_known + γ·L_unknown
where α, β and γ are parameters that balance the optimization objectives of the model.
Further, the target domain weakly enhanced sample x_j^{t,w} is obtained by randomly cropping and flipping the unlabelled target domain image x_j^t; the target domain strongly enhanced sample x_j^{t,s} is obtained from x_j^t by a random enhancement method; and the source domain weakly enhanced sample is obtained by randomly cropping and flipping the labelled source domain image x_i^s.
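A minimal sketch of the two enhancement pipelines on raw NumPy image arrays may help fix ideas: random crop-and-flip for the weak view, and brightness jitter plus cutout as a crude, assumed stand-in for the strong random enhancement, which the patent does not specify further.

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_augment(img):
    """Random horizontal flip + random crop with reflect padding.

    img: (H, W, C) float array in [0, 1]. A simplified stand-in for the
    patent's random crop-and-flip weak enhancement.
    """
    if rng.random() < 0.5:
        img = img[:, ::-1, :]                       # horizontal flip
    pad = 4
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    y = rng.integers(0, 2 * pad + 1)
    x = rng.integers(0, 2 * pad + 1)
    h, w, _ = img.shape
    return padded[y:y + h, x:x + w, :]

def strong_augment(img):
    """Weak enhancement followed by brightness jitter and cutout;
    a crude proxy (an assumption of this sketch) for RandAugment-style
    strong enhancement."""
    img = weak_augment(img).copy()
    img = np.clip(img * rng.uniform(0.6, 1.4), 0.0, 1.0)   # brightness jitter
    h, w, _ = img.shape
    cy, cx, s = rng.integers(0, h), rng.integers(0, w), 8  # cutout square
    img[max(0, cy - s // 2):cy + s // 2, max(0, cx - s // 2):cx + s // 2, :] = 0.0
    return img
```

Both functions preserve the image shape, so the two views of one image can be batched together for the contrast loss.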
Further, the training step of the deep domain adaptive learning network comprises:
b21 The source domain weakly enhanced sample and the target domain weakly enhanced sampleAnd target domain strongly enhanced samplesInputting into the feature coding network f (-) to respectively obtain the source domain featuresTarget domain weakly enhanced image featuresAnd strong enhancement of image features in the target domain
B22 Weakly enhancing the target domain image featuresAnd the target domain strongly enhances image featuresInputting the contrast learning network g (-) to obtain the embedded characteristics of the projected target domain weak enhanced imageAnd strong enhancement of image embedding characteristics in the target domainTo calculate the contrast loss L ssl ;
B23 Characterize the source domainThe target domain weakly enhances image featuresAnd the target domain strongly enhances image featuresInputting the classifier c (-) to respectively obtain the source domain weakly enhanced sample class probability distribution predicted by the classifierThe target domain weakly enhanced sample class probability distributionAnd the target domain strongly enhances the sample class probability distribution
B24 Weakly enhancing sample class probability distribution to the source domainBased on the classification loss function L softmax Calculating the classification loss of the source domain;
b25 Weakly enhancing sample class probability distribution to the target domainFirstly, finding the category of the maximum prediction probability, comparing the probability value of the category with a preset predefined threshold tau, abandoning the samples smaller than tau, reserving the samples larger than tau as pseudo label samples, and taking the category of the maximum prediction probability as a known hard label, wherein the screening formula is as follows:
wherein, the first and the second end of the pipe are connected with each other,to representA category in which the maximum prediction probability that satisfies a threshold condition is located;
b26 Using the target domain weakly enhanced samplesKnown pseudo-labelAs strong enhancement samples of the corresponding target domainCalculates the target domain strong enhancement samplesSaid known class of (1) identifies a loss L known ;
B27 Selecting the target domain weakly enhanced sample class probability distributionAnd taking the sample with lower confidence level as a candidate unknown sample, wherein the specific selection formula is as follows:
whereinFor the preliminarily screened candidate unknown samples, t l Selecting a threshold value for the candidate sample, selecting a sample predicted that the probability of the unknown class is higher than the set unknown class sample selection threshold value as an unknown class sample,
whereinIs the candidate sampleProbability of prediction as unknown class, t uk A threshold value is chosen for the unknown class sample,unknown pseudo-label for target domain;
b28 With the target domain unknown class pseudo-tagAs a strongly enhanced sample of the target domainCalculating a consistent classification loss L of the unknown class samples unknown And obtaining the total loss function L, and updating the parameters of the deep domain adaptive learning network.
Further, the step of obtaining the retrieval target is:
c21 Extracting the query image feature vector based on the trained feature coding network;
c22 Computing Euclidean distances between the query image feature vector and each target image feature vector in the feature database one by one;
c23 According to the Euclidean distance between the target image feature vector and the query image feature vector, sorting the target image feature vectors to obtain the retrieval target corresponding to the target image feature vectors.
According to the above technical scheme, the cross-domain remote sensing scene classification and retrieval method based on self-supervised contrast learning first constructs input data comprising source domain images and target domain images, where the source domain data are labelled and the target domain data are unlabelled, and applies the corresponding enhancements to the constructed input data. The enhanced source domain and target domain data are then fed into the corresponding feature coding networks, the outputs are contrasted with the inputs, and a loss function is constructed on the basis of self-supervised contrast learning combining the known and unknown classes of the target domain images. The network parameters of the feature coding network can therefore be adjusted on the basis of this loss function, reducing the influence of unknown class samples present in the target domain, so that the trained deep domain adaptive learning network performs better when classifying or retrieving data that contain unknown class samples.
Further advantages of the present invention, as well as the technical effects of preferred embodiments, are further described in the following detailed description.
Drawings
FIG. 1 is a flow chart of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 2 is a schematic diagram illustrating the principle of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 3 is a schematic diagram of a training process of a deep domain adaptive learning network in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 4 is a schematic diagram of a retrieval process in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 5 is the classification confusion matrix of the adversarial discriminative domain adaptation comparison method in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 6 is a classification confusion matrix of batch singular value constraints in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 7 is a classification confusion matrix of a depth domain adaptive network in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 8 is a classification confusion matrix for a back propagation open set domain adaptation in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to the present invention;
FIG. 9 is a classification confusion matrix of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning.
Detailed Description
The following describes in detail embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
As shown in fig. 1 and fig. 2, as an embodiment of the method for classifying and retrieving cross-domain remote sensing scenes based on self-supervised contrast learning provided by the present invention, the method includes the following steps:
a) Acquiring a remote sensing image, and dividing a source domain image and a target domain image of the remote sensing image to construct input data;
b) Constructing a loss function based on self-supervision contrast learning and combining the known class and the unknown class of the target domain image, constructing a deep domain adaptive learning network, and training the deep domain adaptive learning network by using the input data and the loss function;
c) Classifying the target domain images by using a trained depth domain adaptive learning network, extracting target image feature vectors of the target domain images to construct a feature database, extracting query image feature vectors of the target domain query images, calculating Euclidean distances between the query image feature vectors and all target image feature vectors in the feature database, arranging according to the Euclidean distances, and obtaining a required retrieval target according to a set Euclidean distance range.
Specifically, in an embodiment of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning provided by the present invention, the input data construction step includes: extracting a number of images {1, 2, …, N} from the remote sensing image data set to construct the source domain D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s}, which contains n_s labelled source domain images x_i^s with corresponding labels y_i^s ∈ {1, 2, …, C}, where {1, 2, …, C} is the label space of the labelled source domain images and C is their total number of categories; the target domain is D_t = {x_j^t}_{j=1}^{n_t}, which contains n_t unlabelled target domain images x_j^t whose label space is {1, 2, …, C+1}, with C+1 representing the unknown class of the unlabelled target domain images.
Further, in an embodiment of the cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning provided by the present invention, as shown in fig. 3 and fig. 4, the deep domain adaptive learning network includes a plurality of feature coding networks f(·), a plurality of contrast learning networks g(·) and a plurality of classifiers c(·). The feature coding network is a deep residual network with the fully connected layer removed; its last average pooling layer is replaced by a bottleneck layer, and it outputs 256-dimensional feature vectors. The contrast learning network g(·) is a perceptron with a ReLU (rectified linear unit) activation function. The classifier c(·) is a fully connected network whose input is a 256-dimensional feature vector and whose output dimension matches the number of classes of the target domain image (namely, a probability distribution over the {1, 2, …, C+1} classes).
Further, in an embodiment of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning provided by the present invention, the step of constructing the loss function includes:
b11) Construct the source domain classification loss: because the source domain images are semantically labelled, supervised learning can be carried out on them, computing the classification loss with cross entropy:

L_softmax = -(1/|B_s|) Σ_{i∈B_s} y_i^s · log p_i^{s,w}

wherein L_softmax is the classification loss function, y_i^s is the true class distribution of the labelled source domain image x_i^s, p_i^{s,w} = c(f(x_i^{s,w})) is the source domain weakly enhanced sample class probability distribution output by the classifier, and B_s is the collection of labels of the source domain annotated images;
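As an illustration of the cross-entropy term above, the following minimal NumPy sketch computes the mean cross-entropy of a batch of predicted class distributions against hard integer labels; the function name and the integer-label interface are choices of this sketch, not part of the patent.

```python
import numpy as np

def cross_entropy_loss(probs, labels):
    """Mean cross-entropy over a batch.

    probs:  (B, C) predicted class probabilities (rows sum to 1).
    labels: (B,) integer ground-truth classes.
    """
    eps = 1e-12  # guard against log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))

# Illustrative data: perfectly confident, correct predictions.
probs = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
labels = np.array([0, 1])
```

A perfectly confident correct prediction gives (near-)zero loss, while a uniform prediction over C classes gives log C.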
b12 Construct an unsupervised contrast loss: self-supervised contrast learning is to learn a representation by maximizing information between different views of the data, and in particular, by encouraging two views from the same image of the target domain (i.e., a strongly enhanced view and a weakly enhanced view) to be similar, and two views from different images to be dissimilar, to learn more discriminative image features, and thus, a strongly enhanced sample of the target domain that can construct an image of the target domainAnd target domain weakly enhanced samplesTo calculate the contrast loss L ssl :
Wherein sim (-) is the similarity measure function, θ is the scaling factor, A ∈ {0,1} is an indication function for evaluating whether k equals j, B represents the number of samples selected by one training; in particular, the target domain weakly enhances the samplesIs formed by the image of the unmarked target domainObtaining the product by random cutting and overturning; target domain strongly enhanced samplesIs formed by the image of the target domain without markingObtaining by using a random enhancement method;
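The contrast loss above has the form of the widely used NT-Xent (SimCLR-style) objective; under that assumption, with cosine similarity as sim(·) and θ as the temperature, a NumPy sketch might look as follows (function and variable names are illustrative):

```python
import numpy as np

def nt_xent_loss(z_weak, z_strong, theta=0.5):
    """SimCLR-style contrastive loss between two enhanced views.

    z_weak, z_strong: (B, D) embedding matrices for the weak and strong
    views of the same B images. theta plays the role of the scaling
    factor; sim(.) is cosine similarity here.
    """
    b = len(z_weak)
    z = np.concatenate([z_weak, z_strong], axis=0)        # (2B, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)      # unit-normalize
    sim = z @ z.T / theta                                  # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                         # indicator A: drop k == j
    pos = np.concatenate([np.arange(b, 2 * b), np.arange(b)])  # index of the positive pair
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    n = 2 * b
    return float(-np.mean(log_prob[np.arange(n), pos]))
```

When the two views of each image coincide, the positives dominate the denominator and the loss is small; scrambling the pairing raises it, which is the behaviour the objective is meant to enforce.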
b13 Construct the known class identification loss as:
wherein, mu represents the sample proportion meeting the selection requirement of the threshold value of the known class in one training, H (-) represents the cross entropy loss,for weakly enhancing samples from the target domainThe collected set of the target domain known pseudo labels is obtained through screening, and ind represents a target domain weak enhancement sampleThe category to which the pseudo label belongs after being screened by the known class, and ind belongs to {1,2, …, C },representing strongly enhanced samples of a target domainThe probability distribution of the prediction classes of (a),a collection of labeled exemplars representing strongly enhanced exemplars of the target domain;
b14 Constructing unknown class identification loss: consistency classification loss L for unknown class identification loss as high confidence unknown class samples unknown :
Wherein the content of the first and second substances,for weakly enhancing samples from the target domainScreening the obtained collection of the unknown pseudo labels of the target domain,representing strongly enhanced samples of a target domainA predicted class probability distribution of (a);
b15) The constructed total loss function L is:
L = L_softmax + α·L_ssl + β·L_known + γ·L_unknown
where α, β and γ are parameters that balance the optimization objectives of the model.
Further, in an embodiment of the cross-domain remote sensing scene classification and retrieval method based on the self-supervised contrast learning provided by the present invention, the training step of the deep-domain adaptive learning network includes:
b21) The feature coding network f(·) can be set to three; the source domain weakly enhanced samples x_i^{s,w}, the target domain weakly enhanced samples x_j^{t,w} and the target domain strongly enhanced samples x_j^{t,s} are respectively input into the corresponding feature coding networks f(·) to obtain the source domain features f_i^s, the target domain weakly enhanced image features f_j^{t,w} and the target domain strongly enhanced image features f_j^{t,s}, wherein the source domain weakly enhanced sample is obtained by randomly cropping and flipping the labelled source domain image x_i^s;
b22) The contrast learning network g(·) can be set to two; the target domain weakly enhanced image features f_j^{t,w} and the target domain strongly enhanced image features f_j^{t,s} are respectively input into the corresponding contrast learning networks g(·) to obtain the projected target domain weakly enhanced image embedded features z_j^{t,w} and strongly enhanced image embedded features z_j^{t,s}, from which the contrast loss L_ssl is calculated;
b23) The classifier c(·) can be set to three; the source domain features f_i^s, the target domain weakly enhanced image features f_j^{t,w} and the target domain strongly enhanced image features f_j^{t,s} are respectively input into the corresponding classifiers c(·) to obtain the classifier-predicted source domain weakly enhanced sample class probability distribution p_i^{s,w}, target domain weakly enhanced sample class probability distribution p_j^{t,w} and target domain strongly enhanced sample class probability distribution q_j^{t,s};
b24) The source domain classification loss is calculated from the source domain weakly enhanced sample class probability distribution p_i^{s,w} based on the classification loss function L_softmax;
b25) For the target domain weakly enhanced sample class probability distribution p_j^{t,w}, first find the category with the maximum prediction probability, then compare that probability value with a preset predefined threshold τ; samples below τ are discarded and samples above τ are retained as pseudo label samples, with the category of the maximum prediction probability taken as the known-class hard label; the screening formula is:

ŷ_j^t = argmax_c p_j^{t,w}(c), retained if max_c p_j^{t,w}(c) ≥ τ

wherein ŷ_j^t represents the category in which the maximum prediction probability of p_j^{t,w} satisfying the threshold condition is located;
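The threshold screening of step b25) can be sketched as follows (NumPy, with illustrative names; τ is passed as `tau`):

```python
import numpy as np

def screen_known_pseudo_labels(probs_weak, tau=0.95):
    """Keep weakly enhanced target samples whose top class probability
    meets the threshold tau; return their indices and hard pseudo-labels.

    probs_weak: (B, C+1) predicted class probability distributions.
    """
    top = probs_weak.max(axis=1)
    keep = np.where(top >= tau)[0]          # samples passing the threshold
    pseudo = probs_weak.argmax(axis=1)[keep]  # class of the maximum probability
    return keep, pseudo
```

The retained pseudo-labels are then applied to the corresponding strongly enhanced samples, as step b26) describes.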
b26 Using target domain weakly enhanced samplesKnown pseudo-labelStrongly enhanced samples as corresponding target domainsTo calculate a target domain strong enhancement sampleIs known to identify the loss L known ;
B27 Selects a target domain weakly enhanced sample class probability distributionAnd taking the sample with lower confidence level as a candidate unknown sample, wherein the specific selection formula is as follows:
whereinFor the preliminarily screened candidate unknown samples, t l Selecting a threshold value for the candidate sample, selecting a sample predicted that the probability of the unknown class is higher than the set unknown class sample selection threshold value as an unknown class sample,
whereinAs candidate samplesProbability of prediction as unknown class, t uk A threshold value is selected for the unknown class of samples,unknown pseudo-label for target domain;
b28 With target domain unknown pseudo-tagsStrongly enhanced samples as target domainsComputing a consistent classification loss L of the unknown class sample unknown And obtaining the total loss function L and updating the parameters of the depth domain adaptive learning network.
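The known-class pseudo-label screening of step b25) can be sketched as follows. This is an illustrative NumPy example; the function name and toy batch are hypothetical, not part of the patented implementation:

```python
import numpy as np

def screen_pseudo_labels(probs_weak, tau=0.9):
    """Step b25: keep only weakly enhanced target samples whose maximum
    class probability exceeds the threshold tau; return their indices
    and the hard pseudo-labels (the class of the maximum probability)."""
    max_prob = probs_weak.max(axis=1)        # confidence of each sample
    hard_label = probs_weak.argmax(axis=1)   # class of maximum probability
    keep = max_prob > tau                    # discard samples below tau
    return np.where(keep)[0], hard_label[keep]

# toy batch: three target domain samples, four known classes
p_w = np.array([[0.97, 0.01, 0.01, 0.01],
                [0.40, 0.30, 0.20, 0.10],
                [0.02, 0.96, 0.01, 0.01]])
idx, labels = screen_pseudo_labels(p_w, tau=0.9)
# samples 0 and 2 pass the threshold, with pseudo-labels 0 and 1
```

In step b26) these hard labels would then be paired with the predictions on the corresponding strongly enhanced samples to compute the cross-entropy loss L_known.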
Further, in an embodiment of the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning provided by the invention, the steps of acquiring the retrieval target are as follows:
c21 Extracting a query image feature vector based on the trained feature coding network;
c22 Calculating Euclidean distances between the feature vectors of the query image and the feature vectors of each target image in the feature database one by one;
c23 According to the Euclidean distance between the target image feature vector and the query image feature vector, sorting the target image feature vectors to obtain the retrieval target corresponding to the target image feature vectors.
The construction of the input data and the training of the depth domain adaptive learning network are realized based on the PyTorch library of the Python language. In addition, simulation experiments with domain adaptation methods such as ADDA (Adversarial Discriminative Domain Adaptation), BSP (Batch Spectral Penalization), DAN (Deep Adaptation Network) and OSBP (Open Set Back-Propagation) were carried out for comparison with the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning of the invention. The invention adopts the overall classification accuracy and the classification confusion matrix to evaluate the classification effect, and adopts the Average Normalized Modified Retrieval Rank (ANMRR), the mean Average Precision (mAP) and PK (retrieval precision of the top K images) to evaluate the retrieval effect: the higher the mAP and PK values, the better the retrieval performance; the smaller the ANMRR value, the better the retrieval performance. The comparison results are shown in Table 1:
Method | Classification accuracy | ANMRR | mAP | P5 | P10 | P20 | P50 | P100
---|---|---|---|---|---|---|---|---
ADDA | 0.602 | 0.2872 | 0.5845 | 0.7770 | 0.7540 | 0.7215 | 0.6546 | 0.5280
BSP | 0.616 | 0.2800 | 0.5928 | 0.8070 | 0.7675 | 0.7238 | 0.6498 | 0.5324
DAN | 0.6 | 0.2622 | 0.5997 | 0.7930 | 0.7695 | 0.7375 | 0.6658 | 0.5503
OSBP | 0.6563 | 0.2725 | 0.5921 | 0.7260 | 0.7000 | 0.6880 | 0.6365 | 0.5403
The invention | 0.8063 | 0.2222 | 0.6777 | 0.8800 | 0.8635 | 0.8318 | 0.7630 | 0.6103

TABLE 1
The results in Table 1 show that the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning achieves the highest accuracy. Compared with the comparison methods, the classification accuracy of the method is improved by 15% to 20.63%, and the retrieval accuracy also comprehensively exceeds that of the comparison methods; specifically, the mean average precision of the method is improved by at least 7.8%, and its P5-P100 and ANMRR values are all superior to those of the comparison methods. In addition, Figs. 5 to 9 show the classification confusion matrices of the different methods and of the invention, in which the values on the diagonal represent the probability of a class being correctly classified, and the off-diagonal values represent the probability of it being misclassified into other classes. The results show that the method of the invention effectively improves the classification accuracy of the target domain, in particular greatly improving the classification accuracy of the unknown class of the target domain, while reducing the confusion between the unknown class and the known classes. In conclusion, the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning provided by the invention can effectively improve the cross-domain classification and retrieval effect under conditions of data distribution difference and inconsistent class spaces.
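Assuming the standard definition of PK used in Table 1 (the fraction of relevant images among the top K retrieved), the metric can be computed as follows; the example ids are hypothetical (ANMRR, which follows the MPEG-7 definition, is omitted here):

```python
def precision_at_k(relevant_ids, ranked_ids, k):
    """PK: fraction of the top-k retrieved images that are relevant
    to the query (e.g. share its scene class)."""
    return sum(1 for i in ranked_ids[:k] if i in relevant_ids) / k

relevant = {1, 3, 5, 7}          # ground-truth relevant image ids
ranked = [1, 2, 3, 4, 5, 6]      # retrieval order returned by the system
p5 = precision_at_k(relevant, ranked, 5)   # 3 of the top 5 are relevant -> 0.6
```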
According to the technical scheme, in the cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning, input data are first constructed, comprising source domain image data and target domain image data, where the source domain image data are labeled, the target domain image data are unlabeled, and the constructed input data are correspondingly enhanced. The enhanced source domain image data and target domain image data are then input into the corresponding feature coding network, the output result is compared with the input data, and a loss function is constructed on the basis of self-supervision contrast learning in combination with the known classes and the unknown class of the target domain image. The network parameters of the feature coding network can thus be adjusted on the basis of the loss function, the influence of the unknown-class samples present in the target domain on the feature coding network can be reduced, and the trained depth domain adaptive learning network achieves a better effect when classifying or retrieving data containing unknown-class samples.
The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited thereto. Within the scope of the technical idea of the invention, numerous simple modifications can be made to the technical solution of the invention, including combinations of the specific features in any suitable way, and the invention will not be further described in relation to the various possible combinations in order to avoid unnecessary repetition. Such simple modifications and combinations should also be considered as disclosed in the present invention, and all such modifications and combinations are intended to be included within the scope of the present invention.
Claims (10)
1. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning is characterized by comprising the following steps of:
a) Acquiring a remote sensing image, and dividing a source domain image and a target domain image of the remote sensing image to construct input data;
b) Constructing a loss function based on self-supervision contrast learning and combined with a known sample and an unknown sample of the target domain image, constructing a depth domain adaptive learning network, and training the depth domain adaptive learning network by using the input data and the loss function;
c) Classifying the target domain images by using the trained depth domain adaptive learning network, extracting target image feature vectors of the target domain images to construct a feature database, extracting query image feature vectors of the target domain query images, calculating Euclidean distances between the query image feature vectors and all the target image feature vectors in the feature database, arranging according to the Euclidean distances, and obtaining the required retrieval target according to a set Euclidean distance range.
2. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning according to claim 1, characterized in that the construction steps of the input data comprise: extracting a plurality of images {1, 2, ..., N} from the data set of the remote sensing image, and constructing the source domain image D_s = {(x_i^s, y_i^s)}, i = 1, ..., n_s, wherein the source domain image comprises n_s labeled source domain images x_i^s, and y_i^s ∈ Y_s represents the label corresponding to the labeled source domain image x_i^s, wherein Y_s = {1, 2, ..., C} represents the label space of the labeled source domain images, and C represents the total number of classes of the labeled source domain images; the target domain image is D_t = {x_j^t}, j = 1, ..., n_t, wherein the target domain image comprises n_t unlabeled target domain images x_j^t, and the label space of the target domain image x_j^t is {1, 2, ..., C+1}, wherein C+1 denotes the unknown class of the unlabeled target domain images.
3. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision comparison learning according to claim 2, characterized in that the deep domain adaptive learning network comprises a plurality of feature coding networks f (-) and a plurality of comparison learning networks g (-) and a plurality of classifiers c (-).
4. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning according to claim 3, characterized in that the feature coding network f (-) is a depth residual network with a full connection layer removed, and an average pooling layer of the depth residual network is replaced by a bottleneck layer.
5. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision comparison learning according to claim 4, characterized in that the comparison learning network g (-) is a perceptron with a ReLU activation function.
6. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning of claim 5 is characterized in that the classifier c (-) is a full-connection network, and the output dimension of the classifier c (-) is consistent with the category number of the target domain images.
7. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning according to claim 6, characterized in that the construction step of the loss function comprises:
b11 Construct source domain classification loss: and carrying out supervised learning on the source domain image, and calculating the classification accuracy by adopting cross entropy loss:
wherein L is softmax In order to classify the function of the loss,a source domain annotated image representing the source domain imageTrue class distribution, functionA source domain weakly enhanced sample class probability distribution representing the classifier output,a collection of labeled exemplars representing annotated images in the source domain;
b12 Construct an unsupervised contrast loss: constructing a target domain strong enhancement sample of the target domain imageAnd target domain weakly enhanced samplesTo calculate the contrast loss L ssl :
Wherein sim (-) is the similarity measure function, θ is the scaling factor, A ∈ {0,1} is an indication function for evaluating whether k equals j, B represents the number of samples selected by one training;
b13 Construct a known class identification penalty as:
where μ represents the proportion of samples within a training run that meet the selection requirements for a known class threshold, H (-) represents the cross entropy loss,for weakly enhancing samples from the target domainThe collected set of the target domain known pseudo labels obtained through screening, ind represents the target domain weak enhancement sampleThe class to which the known pseudo-label belongs after being screened, and ind ∈ {1,2.Representing strongly enhanced samples of the target domainIs determined based on the predicted class probability distribution of (c),a collection of labeled exemplars representing strongly enhanced exemplars of the target domain;
b14 Constructing unknown class identification loss: consistency classification loss L for unknown class identification loss as high confidence unknown class samples unknown :
Wherein the content of the first and second substances,for weakly enhancing samples from the target domainScreening the obtained collection of the unknown pseudo labels of the target domain,representing strongly enhanced samples of the target domainA predicted class probability distribution of (a);
b15 ) the constructed total loss function L is:
L=L softmax +αL ssl +βL known +γL unknown
where α, β and γ are parameters that balance the optimization objectives of the model.
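One common form of the contrast loss in step b12) is the NT-Xent loss; the sketch below is an illustrative NumPy implementation consistent with the symbols sim(·), θ, A and B defined above, not necessarily the exact patented formula, and the loss weights at the end are placeholder values:

```python
import numpy as np

def nt_xent(z_weak, z_strong, theta=0.1):
    """NT-Xent-style contrastive loss over B weak/strong embedding pairs:
    for each weak embedding z_j^w, the matching strong embedding z_j^st is
    the positive, and every other embedding in the 2B batch is a negative
    (the indicator A keeps only k != j), cf. step b12."""
    z = np.concatenate([z_weak, z_strong])            # 2B embeddings
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit norm -> dot = cosine
    sim = z @ z.T / theta                             # sim(., .) / theta
    B = len(z_weak)
    loss = 0.0
    for j in range(B):
        pos = sim[j, j + B]                  # positive pair: weak_j vs strong_j
        mask = np.ones(2 * B, dtype=bool)
        mask[j] = False                      # indicator A: exclude k == j
        loss += -pos + np.log(np.exp(sim[j, mask]).sum())
    return loss / B

# toy batch of B=4 pairs with 8-dimensional embeddings
rng = np.random.default_rng(0)
z_w = rng.normal(size=(4, 8))
z_st = rng.normal(size=(4, 8))
L_ssl = nt_xent(z_w, z_st)

# total loss of step b15 with illustrative weights and placeholder values
L_softmax, L_known, L_unknown = 0.9, 0.4, 0.2
L_total = L_softmax + 0.5 * L_ssl + 1.0 * L_known + 1.0 * L_unknown
```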
8. The cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to claim 7, characterized in that the target domain weakly enhanced samples x_j^w are obtained by randomly cropping and flipping the unlabeled target domain image x_j^t; the target domain strongly enhanced samples x_j^st are obtained by applying a random enhancement method to the unlabeled target domain image x_j^t; and the source domain weakly enhanced samples are obtained by randomly cropping and flipping the labeled source domain image x_i^s.
9. The cross-domain remote sensing scene classification and retrieval method based on the self-supervision contrast learning according to claim 8, wherein the training step of the deep domain adaptive learning network comprises:
b21 The source domain weakly enhanced samples and the target domain weakly enhanced samplesAnd target domain strongly enhanced samplesInputting the characteristics into the characteristic coding network f (-) to respectively obtain the source domain characteristics f i s Target domain weakly enhanced image features f j w And strong enhancement of image features in the target domain
B22 Weakly enhancing the target domain image featuresAnd the target domain strongly enhances image featuresInputting the contrast learning network g (-) to obtain the embedded characteristics of the projected target domain weak enhancement imageAnd strong enhancement of image embedding characteristics in the target domainTo calculate the contrast loss Lssl;
b23 Characterize the source domainThe target domain weakly enhances image featuresAnd the target domain strongly enhances image featuresInputting the classifier c (-) to respectively obtain the source domain weakly enhanced sample class probability distribution predicted by the classifierThe target domain weakly enhanced sample class probability distributionAnd the target domain strongly enhances the sample class probability distribution
B24 Weakly enhancing sample class probability distribution to the source domainBased on the classification loss function L softmax Calculating the classification loss of the source domain;
b25 Weakly enhancing sample class probability distribution to the target domainFirstly, the category where the maximum prediction probability is located is found, the probability value of the category is compared with a preset predefined threshold value sigma, and the category which is smaller than tau is abandonedAnd (3) reserving samples larger than tau as pseudo label samples, and taking the class where the maximum prediction probability is as a known class hard label, wherein the screening formula is as follows:
wherein the content of the first and second substances,to representA category in which the maximum prediction probability that satisfies a threshold condition is located;
b26 Using the target domain weakly enhanced samplesKnown pseudo-labelStrongly enhancing samples as the corresponding target domainCalculates the target domain strong enhancement samplesSaid known class of (1) identifies a loss L known ;
B27 Selecting the target domain weakly enhanced sample class probability distributionAnd taking the sample with lower confidence level as a candidate unknown sample, wherein the specific selection formula is as follows:
whereinFor the preliminarily screened candidate unknown samples, t l Selecting a threshold value for the candidate sample, selecting a sample predicted that the probability of the unknown class is higher than the set unknown class sample selection threshold value as an unknown class sample,
whereinIs the candidate sampleProbability of prediction as unknown class, t uk A threshold value is chosen for the unknown class sample,unknown pseudo-label for target domain;
b28 With the target domain unknown class pseudo-tagAs a strongly enhanced sample of the target domainComputing a consistent classification loss L of the unknown class samples unknown And obtaining the total loss function L, and updating the parameters of the depth domain adaptive learning network by using a gradient descent algorithm.
10. The cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning according to claim 9, characterized in that the step of obtaining the retrieval target is:
c21 Extracting the query image feature vector based on the trained feature coding network;
c22 Computing Euclidean distances between the query image feature vector and each target image feature vector in the feature database one by one;
c23 According to the Euclidean distance between the target image feature vector and the query image feature vector, sorting the target image feature vectors to obtain the retrieval target corresponding to the target image feature vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210927707.1A CN115471739A (en) | 2022-08-03 | 2022-08-03 | Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115471739A true CN115471739A (en) | 2022-12-13 |
Family
ID=84368251
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210927707.1A Pending CN115471739A (en) | 2022-08-03 | 2022-08-03 | Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115471739A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116524302A (en) * | 2023-05-05 | 2023-08-01 | 广州市智慧城市投资运营有限公司 | Training method, device and storage medium for scene recognition model |
CN116524302B (en) * | 2023-05-05 | 2024-01-26 | 广州市智慧城市投资运营有限公司 | Training method, device and storage medium for scene recognition model |
CN116543269A (en) * | 2023-07-07 | 2023-08-04 | 江西师范大学 | Cross-domain small sample fine granularity image recognition method based on self-supervision and model thereof |
CN116543269B (en) * | 2023-07-07 | 2023-09-05 | 江西师范大学 | Cross-domain small sample fine granularity image recognition method based on self-supervision and model thereof |
CN116740578A (en) * | 2023-08-14 | 2023-09-12 | 北京数慧时空信息技术有限公司 | Remote sensing image recommendation method based on user selection |
CN116740578B (en) * | 2023-08-14 | 2023-10-27 | 北京数慧时空信息技术有限公司 | Remote sensing image recommendation method based on user selection |
CN117456309A (en) * | 2023-12-20 | 2024-01-26 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Cross-domain target identification method based on intermediate domain guidance and metric learning constraint |
CN117456309B (en) * | 2023-12-20 | 2024-03-15 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Cross-domain target identification method based on intermediate domain guidance and metric learning constraint |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111191732B (en) | Target detection method based on full-automatic learning | |
CN107679250B (en) | Multi-task layered image retrieval method based on deep self-coding convolutional neural network | |
CN113378632B (en) | Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method | |
CN113190699B (en) | Remote sensing image retrieval method and device based on category-level semantic hash | |
CN107133569B (en) | Monitoring video multi-granularity labeling method based on generalized multi-label learning | |
CN115471739A (en) | Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning | |
CN110717534B (en) | Target classification and positioning method based on network supervision | |
CN110909820A (en) | Image classification method and system based on self-supervision learning | |
CN108108657A (en) | A kind of amendment local sensitivity Hash vehicle retrieval method based on multitask deep learning | |
CN109871875B (en) | Building change detection method based on deep learning | |
CN108052966A (en) | Remote sensing images scene based on convolutional neural networks automatically extracts and sorting technique | |
CN110852107B (en) | Relation extraction method, device and storage medium | |
CN115934990B (en) | Remote sensing image recommendation method based on content understanding | |
CN112132014B (en) | Target re-identification method and system based on non-supervised pyramid similarity learning | |
CN114358188A (en) | Feature extraction model processing method, feature extraction model processing device, sample retrieval method, sample retrieval device and computer equipment | |
CN113032613B (en) | Three-dimensional model retrieval method based on interactive attention convolution neural network | |
CN115292532B (en) | Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning | |
CN114897085A (en) | Clustering method based on closed subgraph link prediction and computer equipment | |
Zhang et al. | An efficient class-constrained DBSCAN approach for large-scale point cloud clustering | |
CN115965867A (en) | Remote sensing image earth surface coverage classification method based on pseudo label and category dictionary learning | |
CN115393713A (en) | Scene understanding method based on plot perception dynamic memory | |
Sari et al. | Parking Lots Detection in Static Image Using Support Vector Machine Based on Genetic Algorithm. | |
CN114882376B (en) | Convolutional neural network remote sensing image target detection method based on optimal anchor point scale | |
CN116932487B (en) | Quantized data analysis method and system based on data paragraph division | |
Gaynor | Unsupervised Context Distillation from Weakly Supervised Data to Augment Video Question Answering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||