CN115292532A - Remote sensing image domain adaptive retrieval method based on pseudo label consistency learning - Google Patents

Remote sensing image domain adaptive retrieval method based on pseudo label consistency learning

Info

Publication number: CN115292532A
Authority: CN (China)
Prior art keywords: remote sensing, samples, target domain, sensing image, sample
Legal status: Granted; Active
Application number: CN202210729817.7A
Other languages: Chinese (zh)
Other versions: CN115292532B (granted publication)
Inventors: 侯东阳, 王思远, 田雪晴
Current and original assignee: Central South University
Application filed by Central South University
Priority: CN202210729817.7A

Classifications

    • G06F16/532 — Information retrieval of still image data; query formulation, e.g. graphical querying
    • G06F16/55 — Information retrieval of still image data; clustering; classification
    • G06F16/583 — Information retrieval of still image data; retrieval characterised by metadata automatically derived from the content
    • G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06V10/40 — Image or video recognition or understanding; extraction of image or video features
    • G06V10/765 — Image or video recognition using classification, e.g. of video objects, using rules for classification or partitioning the feature space
    • G06V10/82 — Image or video recognition using pattern recognition or machine learning, using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a remote sensing image domain adaptive retrieval method based on pseudo label consistency learning, which comprises the following steps: a) acquiring remote sensing images; b) constructing the input data, a triple convolutional neural network, and a loss function; c) optimally training the triple convolutional neural network on the input data with the loss function, using the trained network to extract feature vectors from the remote sensing images in the target domain to form a feature library, extracting the feature vector of the user's query, and comparing the query feature vector with the feature vectors in the library to obtain the remote sensing images within a set similarity ranking. The remote sensing image domain adaptive retrieval method based on pseudo label consistency learning is little affected by the distribution difference between the target domain and the source domain and achieves a good retrieval effect.

Description

Remote sensing image domain adaptive retrieval method based on pseudo label consistency learning
Technical Field
The invention relates to the technical field of optical remote sensing image retrieval, in particular to a remote sensing image domain adaptive retrieval method based on pseudo label consistency learning.
Background
In recent years, advances in Earth observation technology have made remote sensing images much easier to acquire, and massive remote sensing image data create favorable conditions for applications such as land cover classification, disaster assessment, environmental monitoring, and urban planning.
However, efficiently finding a target or scene of interest within this ever-growing body of remote sensing image data has become a difficult problem. Remote sensing image retrieval has therefore received increasing attention as a key technology for mining effective information from large-scale remote sensing data.
At present, retrieval models based on deep neural networks achieve the most competitive results in remote sensing image retrieval. Because deep learning is a data-driven algorithm, these models must be trained on large amounts of labeled data to achieve good retrieval results. However, the explosive growth of remote sensing imagery poses a serious challenge to data labeling: it requires substantial manpower and material resources, and labeling all images is impractical. How to exploit existing labeled remote sensing images to improve the retrieval precision of a model on unlabeled data is therefore a key problem to be solved. Directly transferring a trained retrieval model to an unlabeled dataset gives unsatisfactory results because of differences in sensors, shooting angles, shooting weather, and seasons between datasets. In the prior art, a classifier is typically learned on labeled source domain data and applied to the target domain through feature alignment. However, constrained by the distribution difference between the target domain and the source domain, the prior class information of the source domain has a limited effect on the target domain decision boundary, so the decision boundary learned on the source domain may fail to separate the target domain classes.
In view of the above, a remote sensing image domain adaptive retrieval method based on pseudo label consistency learning needs to be designed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a remote sensing image domain adaptive retrieval method based on pseudo label consistency learning, which is little influenced by the distribution difference of a target domain and a source domain and has good retrieval effect.
In order to solve the technical problem, the invention provides a remote sensing image domain adaptive retrieval method based on pseudo label consistency learning, which comprises the following steps:
a) Acquiring a remote sensing image;
b) Constructing input data, a triple convolution neural network and a loss function;
c) Optimally training the triple convolutional neural network on the input data with the loss function, extracting the remote sensing image feature vectors of the remote sensing images in the target domain with the trained triple convolutional neural network to form a feature library, extracting the query feature vector for the user's query, and comparing the query feature vector with the remote sensing image feature vectors in the feature library to obtain the remote sensing image feature vectors within a set similarity ranking.
Further, the step of constructing the input data comprises:
b11 Constructed to contain n s Source domain data of individual samples
Figure BDA0003712713300000021
And comprises n t Target domain data of individual samples
Figure BDA0003712713300000022
Wherein the content of the first and second substances,
Figure BDA0003712713300000023
indicating that the source domain data has a labeled sample,
Figure BDA0003712713300000024
representing the target domain data without annotated sample,
Figure BDA0003712713300000025
representing annotated samples with said source domain data
Figure BDA0003712713300000026
A corresponding label, and
Figure BDA0003712713300000027
c is the number of image categories;
b12 Unlabeled sample for the target domain data
Figure BDA0003712713300000028
Obtaining target domain weak enhancement samples by using inversion and shift data enhancement transformation
Figure BDA0003712713300000029
And the target domain data is not marked with samples
Figure BDA00037127133000000210
Method for generating severely distorted target domain strongly enhanced samples by utilizing random enhancement method
Figure BDA00037127133000000211
B13 Has labeled samples on the source domain data
Figure BDA00037127133000000212
And obtaining source domain weakly enhanced samples by using the flipping and shifting data enhancement transformation.
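The weak and strong enhancement transforms of steps b12) and b13) can be sketched as follows. This is a minimal NumPy sketch: the flip and small shift follow the text, while the concrete strong-enhancement operations (random erasing plus heavy noise) are illustrative stand-ins for the unspecified random enhancement method.

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_enhance(img, max_shift=4):
    """Weak enhancement: random horizontal flip plus a small pixel shift."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                        # horizontal flip
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(out, (dy, dx), axis=(0, 1))     # translation by (dy, dx)
    return out

def strong_enhance(img):
    """Strong enhancement stand-in: a weak transform followed by severe
    distortions (random erasing and heavy additive noise)."""
    out = weak_enhance(img).astype(float)
    h, w = out.shape[:2]
    y, x = rng.integers(0, h // 2), rng.integers(0, w // 2)
    out[y:y + h // 4, x:x + w // 4] = 0.0         # random erasing
    out += rng.normal(0.0, 25.0, size=out.shape)  # heavy noise
    return np.clip(out, 0, 255)

img = rng.integers(0, 256, size=(64, 64, 3))      # one unlabeled target image
xw, xstr = weak_enhance(img), strong_enhance(img)
```

The weakly enhanced view preserves the image content almost intact, which is why its predictions are later trusted as pseudo labels, while the strongly enhanced view is deliberately distorted.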
Further, the triple convolution neural network comprises a feature extraction part and a classification part, wherein the feature extraction part comprises a plurality of feature extraction networks, the structures and the parameters of the feature extraction networks are the same, the classification part comprises a plurality of classifiers, and the structures and the parameters of the classifiers are the same.
Further, the feature extraction network consists of a convolutional neural network pre-trained on the ImageNet dataset.
Further, the classifier is a layer of fully connected network for predicting the likelihood that the input data belongs to different classes.
Further, the output dimension of the classifier is consistent with the number of classes of the input data.
Further, the constructing step of the loss function includes:
b31 Constructing a classification loss, performing supervised learning on the source domain data, and constructing the classification loss of the source domain data based on cross entropy loss:
Figure BDA0003712713300000031
wherein L is CE For the classification loss function, p (x) s ) Representing annotated samples of said source domain data
Figure BDA0003712713300000032
The function p (|) represents the probability distribution predicted by the classifier, x s Labeling samples for the source domain data
Figure BDA0003712713300000033
The collection of (a) and (b),
Figure BDA0003712713300000034
representing annotated samples of said source domain data
Figure BDA0003712713300000035
Is the probability of a different class;
b32 Constructing migration loss L based on similarity of different feature distributions measured by maximum mean difference MMD
Figure BDA0003712713300000036
Wherein i represents the ith labeled sample of the source domain data
Figure BDA0003712713300000037
n s Representing annotated samples of said source domain data
Figure BDA0003712713300000038
Total number of (f) i s Indicating the ith labeled sample of the source domain data
Figure BDA0003712713300000039
Is characterized by n t Represents the total number of samples of the target domain data, j represents the jth target domain strong enhancement sample, phi is a mapping function, and samples of the source domain data are projected to a high-dimensional Hilbert space
Figure BDA0003712713300000041
In the Hilbert space
Figure BDA0003712713300000042
Calculating a sample mean of the source domain data and a sample mean of the target domain as a measure of domain difference;
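The migration loss above can be sketched in NumPy. One minimal sketch, assuming an RBF kernel (a common choice; the patent does not fix the kernel): the Hilbert-space mapping $\phi$ is realised implicitly, since the squared norm of the mean-embedding difference expands into three kernel means.

```python
import numpy as np

def mmd2_rbf(fs, ft, gamma=1.0):
    """Squared MMD between source features fs (n_s, d) and target features
    ft (n_t, d); phi is implicit in the RBF kernel
    k(a, b) = exp(-gamma * ||a - b||^2)."""
    def kernel(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    # ||mean phi(fs) - mean phi(ft)||^2 expands into three kernel means
    return kernel(fs, fs).mean() - 2 * kernel(fs, ft).mean() + kernel(ft, ft).mean()

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(40, 8))
tgt_near = rng.normal(0.0, 1.0, size=(40, 8))   # same distribution as src
tgt_far = rng.normal(3.0, 1.0, size=(40, 8))    # shifted distribution
assert mmd2_rbf(src, tgt_far) > mmd2_rbf(src, tgt_near)
```

Minimising this quantity pulls the source and target feature distributions together, which is the role $L_{MMD}$ plays during training.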
b33 Constructing pseudo label consistency loss to obtain pseudo label classification loss L with consistency regular enhancement PCE
Figure BDA0003712713300000043
Wherein B represents the number of samples selected by one training, mu is the proportion of samples meeting the selection requirement of a set threshold value in the samples selected by one training, H function represents the cross entropy loss of two probability distributions, and x w Represent the target domain weakly enhanced samples
Figure BDA0003712713300000044
The collection of (a) and (b),
Figure BDA0003712713300000045
after screening for pseudo tags x w Probabilities of being predicted as different classes;
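This FixMatch-style loss can be sketched directly: pseudo labels come from the weakly enhanced view, and only samples whose maximum class probability passes the threshold supervise the strongly enhanced view. Dividing by the number of retained samples equals dividing by $\mu B$, since $\mu$ is the retained proportion. The logit arrays here are hypothetical stand-ins for classifier outputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label_consistency_loss(logits_weak, logits_strong, tau=0.95):
    """L_PCE sketch: pseudo labels from the weakly enhanced view supervise
    the strongly enhanced view; only samples whose maximum class
    probability exceeds tau contribute."""
    q_w = softmax(logits_weak)
    mask = q_w.max(axis=1) > tau               # threshold screening
    if not mask.any():
        return 0.0, 0.0
    pseudo = q_w.argmax(axis=1)                # hard pseudo labels
    q_s = softmax(logits_strong)
    ce = -np.log(q_s[np.arange(len(pseudo)), pseudo] + 1e-12)
    mu = mask.mean()                           # proportion passing the threshold
    return (ce * mask).sum() / mask.sum(), mu  # mask.sum() == mu * B

rng = np.random.default_rng(1)
lw = rng.normal(size=(16, 5)) * 5.0            # confident weak-view logits
ls = lw + rng.normal(size=(16, 5)) * 0.5       # perturbed strong-view logits
loss, mu = pseudo_label_consistency_loss(lw, ls)
```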
b34 Constructing minimum class confusion loss, determining the weight of a sample by using the value of the class probability distribution entropy of the target domain data, calculating a class confusion matrix according to the weighted sample of the target domain data, and combining the minimum class confusion loss to maximize the inter-class difference of the target domain data; entropy of the probability distribution
Figure BDA0003712713300000046
Comprises the following steps:
Figure BDA0003712713300000047
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003712713300000048
represents the jth sample in the target domain data,
Figure BDA0003712713300000049
representing the probability that the jth sample in the target domain data belongs to the c class, wherein the weight of the sample of the target domain data is defined as:
Figure BDA00037127133000000410
wherein, W j Representing a weight of a jth sample in said target domain data for scaling the weight, W j The corresponding diagonal matrix is W, and the confusion matrix M is defined based on the diagonal matrix W cc′ Comprises the following steps:
Figure BDA0003712713300000051
wherein the content of the first and second substances,
Figure BDA0003712713300000052
representing the probability of all samples in the samples selected by one training belonging to the c-th class, and minimizing class confusion loss L MCC Is defined as follows:
Figure BDA0003712713300000053
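The whole b34) pipeline (entropy, certainty weights, weighted confusion matrix, off-diagonal mean) fits in a few lines of NumPy. This is a sketch reconstructed from the description; the logit inputs are hypothetical classifier outputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def min_class_confusion_loss(logits_t):
    """L_MCC sketch: entropy-based sample weights, a weighted class
    confusion matrix, then the mean of its row-normalised off-diagonal
    entries."""
    Y = softmax(logits_t)                      # (B, C) class probabilities
    B, C = Y.shape
    H = -(Y * np.log(Y + 1e-12)).sum(axis=1)   # per-sample entropy
    W = B * (1 + np.exp(-H)) / (1 + np.exp(-H)).sum()   # certainty weights
    M = Y.T @ (W[:, None] * Y)                 # class confusion matrix (C, C)
    M = M / M.sum(axis=1, keepdims=True)       # row normalisation
    off_diag = M.sum() - np.trace(M)           # cross-class confusion mass
    return off_diag / C

rng = np.random.default_rng(2)
confident = rng.normal(size=(32, 6)) * 8.0     # near one-hot predictions
uncertain = rng.normal(size=(32, 6)) * 0.1     # near-uniform predictions
assert min_class_confusion_loss(confident) < min_class_confusion_loss(uncertain)
```

Confident, well-separated predictions concentrate the confusion matrix on its diagonal, so minimising the off-diagonal mass pushes the target domain classes apart.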
b35 The loss function constructed is:
L=L CE +L MMD +αL PCE +βL MCC
wherein, alpha and beta are parameters for balancing the optimization target of the triple convolutional neural network.
Further, the step of training the triplet convolutional neural network by the input data comprises:
c11 Weakly enhancing the target domain samples
Figure BDA0003712713300000054
The target domain strongly enhanced sample
Figure BDA0003712713300000055
And the source domain data has labeled samples
Figure BDA0003712713300000056
Respectively inputting the data into the corresponding feature extraction networks to obtain the target domain weakly enhanced sample features
Figure BDA0003712713300000057
Target domain strongly enhanced sample features
Figure BDA0003712713300000058
With source domain data having labeled sample features
Figure BDA0003712713300000059
C12 Strongly enhancing sample features with the target domain
Figure BDA00037127133000000510
With source domain data having labeled sample features
Figure BDA00037127133000000511
Performing distribution difference measurement to calculate the migration loss L MMD
C13 Weakly enhancing the target domain with sample features
Figure BDA00037127133000000512
The target domain strongly enhances sample features
Figure BDA00037127133000000513
And the source domain has labeled sample features
Figure BDA00037127133000000514
Inputting the classifier to convert into a target domain weakly enhanced sample conditional probability distribution
Figure BDA00037127133000000515
Target domain strongly enhanced sample conditional probability distribution
Figure BDA00037127133000000516
Conditional probability distribution of labeled samples with source domain data
Figure BDA00037127133000000517
C14 Conditional probability distribution of labeled samples to the source domain data
Figure BDA00037127133000000518
Based on the classification loss function L CE Calculating the classification loss of the source domain data;
c15 Weakly enhancing sample conditional probability distribution to the target domain
Figure BDA00037127133000000519
And (3) reserving the class label with the maximum class probability higher than the set probability distribution threshold value tau as a pseudo label:
Figure BDA00037127133000000520
wherein the content of the first and second substances,
Figure BDA0003712713300000061
representing a strongly enhanced sample conditional probability distribution of the target domain
Figure BDA0003712713300000062
The category where the maximum probability of the set threshold value screening condition is met;
c16 Pseudo-label generated with the target domain weakly enhanced samples
Figure BDA0003712713300000063
Adopting the pseudo label classification loss L as supervision information of corresponding target domain strong enhancement samples PCE Calculating the consistency loss of the pseudo label;
c17 For the target domain weakly enhanced sample conditional probability distribution, utilizing the minimized class confusion loss L MCC Calculating class confusion loss;
c18 Computing total loss of training, and adjusting network parameters of the feature extraction network by using a gradient descent algorithm.
Further, the remote sensing image feature vector is obtained through the trained feature extraction network.
Further, the step of retrieving the remote sensing images of the target domain comprises:
c21) Extracting the query feature vector of the query image with the trained feature extraction network;
c22) Calculating the Euclidean distance between the query feature vector and each remote sensing image feature vector one by one;
c23) Sorting the remote sensing image feature vectors in ascending order of Euclidean distance, and taking the feature vectors ranked within a set order as high-similarity images.
According to the technical scheme, in the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning, the input data are constructed first and comprise source domain data and target domain data, where the source domain data are labeled and the target domain data are unlabeled, and the constructed input data are correspondingly enhanced. The enhanced source domain data and target domain data are then input into the corresponding feature extraction networks and classifiers, the output results are compared with the input data, and the pseudo tag consistency loss is established; a loss function is then built on the pseudo tag consistency loss, and the network parameters of the feature extraction networks are adjusted based on this loss function. In this way, the influence of the distribution difference between the target domain and the source domain on the feature extraction networks is reduced, and the trained triple convolutional neural network achieves better retrieval accuracy and a better retrieval effect when retrieving unlabeled samples.
Further advantages of the present invention, as well as the technical effects of preferred embodiments, are further described in the following detailed description.
Drawings
FIG. 1 is a flow chart of the remote sensing image domain adaptive retrieval method based on pseudo label consistency learning of the invention;
FIG. 2 is a schematic diagram illustrating the principle of the remote sensing image domain adaptive retrieval method based on pseudo label consistency learning according to the invention;
FIG. 3 is a schematic diagram of a training process of a triple convolution neural network in the remote sensing image domain adaptation retrieval method based on pseudo label consistency learning according to the invention;
FIG. 4 is a schematic diagram of a retrieval process in the remote sensing image domain adaptation retrieval method based on pseudo label consistency learning.
Detailed Description
The following describes in detail embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
As shown in fig. 1 and fig. 2, as an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, the method includes the following steps:
a) Acquiring a remote sensing image;
b) Constructing input data, a triple convolution neural network and a loss function;
c) Optimally training the triple convolutional neural network on the input data with the loss function, extracting the remote sensing image feature vectors of the remote sensing images in the target domain with the trained triple convolutional neural network to form a feature library, extracting the query feature vector for the user's query, and comparing the query feature vector with the remote sensing image feature vectors in the feature library to obtain the remote sensing image feature vectors within the set similarity ranking.
Specifically, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, the input data construction step includes:
b11 Construction of a structure containing n s Source domain data of individual samples
Figure BDA0003712713300000081
And comprises n t Target domain data of individual samples
Figure BDA0003712713300000082
Wherein the content of the first and second substances,
Figure BDA0003712713300000083
indicating that the source domain data has a labeled sample,
Figure BDA0003712713300000084
representing the target domain data without annotated sample,
Figure BDA0003712713300000085
representing annotated samples with source domain data
Figure BDA0003712713300000086
A corresponding label, and
Figure BDA0003712713300000087
c is the number of image categories;
b12 Unlabeled sample for target domain data
Figure BDA0003712713300000088
Obtaining target domain weak enhancement samples by using inversion and shift data enhancement transformation
Figure BDA0003712713300000089
And does not mark the target domain data
Figure BDA00037127133000000810
Method for generating severely distorted target domain strongly enhanced samples by utilizing random enhancement method
Figure BDA00037127133000000811
B13 Has annotated samples to the source domain data
Figure BDA00037127133000000812
And enhancing the transformation by using the flip and shift data to obtain the source domain weak sample.
Further, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, as shown in fig. 3, the triple convolutional neural network includes a feature extraction part and a classification part. The feature extraction part includes a plurality of feature extraction networks (for example, the data samples to be processed are of three kinds: target domain weakly enhanced samples $x_j^{t,w}$, target domain strongly enhanced samples $x_j^{t,str}$, and labeled source domain samples $x_i^s$; accordingly, three feature extraction networks are set so that each feature extraction network corresponds to one kind of data sample), and the structure and parameters of each feature extraction network are the same. The classification part includes a plurality of classifiers (likewise three, one for each kind of data sample), and the structure and parameters of each classifier are the same. The feature extraction network consists of a convolutional neural network pre-trained on the ImageNet dataset ({conv1, conv2_x, conv3_x, conv4_x, conv5_x}) followed by a bottleneck layer, so the structure of the feature extraction network is {conv1, conv2_x, conv3_x, conv4_x, conv5_x, bottleneck layer}, and its output feature size is 256 dimensions. The classifier is a single-layer fully connected network used to predict the likelihood that the input data belongs to different categories, and the output dimension of the classifier is consistent with the number of categories of the input data.
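Because the three branches share structure and parameters, a single set of weights can process all three input streams. A minimal sketch of this weight-sharing design, with dense layers standing in for the pre-trained convolutional backbone and hypothetical input dimensions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Shared parameters: one backbone+bottleneck and one FC classifier serve
# all three input streams, mirroring the identical triple branches.
d_in, d_feat, n_classes = 512, 256, 10
W_backbone = rng.normal(0, 0.05, size=(d_in, d_feat))   # backbone + bottleneck
W_cls = rng.normal(0, 0.05, size=(d_feat, n_classes))   # one FC classifier layer

def extract(x):
    return np.maximum(x @ W_backbone, 0.0)    # 256-dim feature (ReLU)

def classify(f):
    z = f @ W_cls
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)   # class probabilities

x_weak = rng.normal(size=(8, d_in))           # target weakly enhanced batch
x_strong = rng.normal(size=(8, d_in))         # target strongly enhanced batch
x_src = rng.normal(size=(8, d_in))            # labeled source batch
feats = [extract(x) for x in (x_weak, x_strong, x_src)]
probs = [classify(f) for f in feats]
```

Sharing the weights guarantees that the three views are embedded in the same feature space, which is what makes their features and predictions comparable in the losses.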
Further, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, the step of constructing the loss function includes:
b31 For the source domain data, because the source domain data has corresponding labeling information, supervised learning can be carried out on the source domain data so as to ensure that the source domain data has labeled samples
Figure BDA0003712713300000091
Can be accurately identified, and particularly, classification loss of source domain data can be constructed based on cross entropy loss:
Figure BDA0003712713300000092
wherein L is CE For the classification loss function, p (x) s ) Representing annotated samples of source domain data
Figure BDA0003712713300000093
True probability distribution of (2), x s Annotating source domain data with samples
Figure BDA0003712713300000094
The collection of (a) and (b),
Figure BDA0003712713300000095
representing annotated samples of source domain data
Figure BDA0003712713300000096
Is the probability of a different class;
b32 Constructing migration loss L based on similarity of different feature distributions measured by maximum mean difference MMD
Figure BDA0003712713300000097
Wherein i represents the labeled sample of the ith source domain data
Figure BDA0003712713300000098
n s Annotated samples representing source domain data
Figure BDA0003712713300000099
Total number of (f) i s Indicating labeled sample of ith source domain data
Figure BDA00037127133000000910
Is characterized by n t Represents the total number of samples of the target domain data, j represents the jth target domain strong enhancement sample, phi is a mapping function, and projects the samples of the source domain data to a high-dimensional Hilbert space
Figure BDA00037127133000000911
In Hilbert space
Figure BDA00037127133000000912
Calculating a sample mean value of the source domain data and a sample mean value of the target domain to be used as a measure of the domain difference;
b33 Constructing pseudo-label consistency loss for pseudo-label consistency learning that constrains target domain samples, weakly enhancing samples using target domain
Figure BDA00037127133000000913
The generated pseudo label is used as a corresponding target domain strong enhancement sample
Figure BDA00037127133000000914
The cross entropy loss is calculated by the supervision information to obtain the consistency regular enhanced pseudo label classification loss L PCE
Figure BDA00037127133000000915
Wherein B represents the number of samples selected by one training, mu is the proportion of samples meeting the selection requirement of a set threshold in the samples selected by one training, H function represents the cross entropy loss of two probability distributions, and x w Representing target domain weakly enhanced samples
Figure BDA0003712713300000101
The collection of (a) and (b),
Figure BDA0003712713300000102
after screening for pseudo labels x w Probabilities of being predicted as different classes;
b34 Constructing minimum class confusion loss, determining the weight of a sample of the target domain data by using the value of the class probability distribution entropy of the target domain data, calculating a class confusion matrix according to the weighted sample of the target domain data, and combining the minimum class confusion loss to maximize the inter-class difference of the target domain data; utensil for cleaning buttockEntropy of body-to-ground, probability distribution
Figure BDA0003712713300000103
Comprises the following steps:
Figure BDA0003712713300000104
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003712713300000105
represents the jth sample in the target domain data,
Figure BDA0003712713300000106
representing the probability that the jth sample in the target domain data belongs to the c class, and the weight of the sample of the target domain data is defined as:
Figure BDA0003712713300000107
wherein, W j Representing the weight of the jth sample in the target domain data, for scaling the weight, W j The corresponding diagonal matrix is W, and a confusion-like matrix M is defined based on the diagonal matrix W cc′ Comprises the following steps:
Figure BDA0003712713300000108
wherein the content of the first and second substances,
Figure BDA0003712713300000109
representing the probability that all samples in the samples selected by one training belong to the c-th class, and minimizing the class confusion loss L MCC Is defined as:
Figure BDA00037127133000001010
b35 Constructed loss function is:
L=L CE +L MMD +αL PCE +βL MCC
wherein α and β are parameters that balance the optimization targets of the triplet convolutional neural network.
Further, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, the step of training the triplet convolutional neural network with input data includes:
c11) Inputting the target domain weakly enhanced samples x^{t,w}, the target domain strongly enhanced samples x^{t,s} and the labeled samples x^s of the source domain data into the corresponding feature extraction networks respectively, to obtain the target domain weakly enhanced sample features f^{t,w}, the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s;
c12) Performing a distribution difference measurement between the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s to calculate the migration loss L_MMD;
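The migration loss of step c12) can be sketched with a kernel MMD; since the mapping φ is specified only as a projection into a Hilbert space, the Gaussian kernel and its single bandwidth sigma here are assumptions:

```python
import torch

def mmd_loss(feat_s, feat_t, sigma=1.0):
    """Squared maximum mean discrepancy between source and target features."""
    def gaussian_kernel(a, b):
        dist_sq = torch.cdist(a, b) ** 2           # pairwise squared distances
        return torch.exp(-dist_sq / (2 * sigma ** 2))
    k_ss = gaussian_kernel(feat_s, feat_s).mean()  # source-source term
    k_tt = gaussian_kernel(feat_t, feat_t).mean()  # target-target term
    k_st = gaussian_kernel(feat_s, feat_t).mean()  # cross term
    return k_ss + k_tt - 2.0 * k_st
```

The value is zero when the two feature batches are identical and grows as their distributions drift apart, so minimizing it pulls the two domains together in feature space.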
c13) Inputting the target domain weakly enhanced sample features f^{t,w}, the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s into the corresponding classifiers, to convert them into the target domain weakly enhanced sample conditional probability distribution q(y|x^{t,w}), the target domain strongly enhanced sample conditional probability distribution q(y|x^{t,s}) and the labeled source domain sample conditional probability distribution p(y|x^s);
c14) Calculating the classification loss of the source domain data from the labeled source domain sample conditional probability distribution p(y|x^s) based on the classification loss function L_CE;
c15) For the target domain weakly enhanced sample conditional probability distribution q(y|x^{t,w}), retaining as the pseudo label the class label whose maximum class probability is higher than the set probability distribution threshold τ:

p̂_b = argmax_c q(y = c|x_b^w), retained only if max_c q(y = c|x_b^w) > τ

wherein p̂_b represents the class at which the target domain weakly enhanced sample conditional probability distribution q(y|x_b^w) attains the maximum probability that satisfies the set threshold screening condition;
c16) Using the pseudo labels p̂_b generated from the target domain weakly enhanced samples as the supervision information for the corresponding target domain strongly enhanced samples, and calculating the pseudo label consistency loss with the pseudo label classification loss L_PCE;
c17) For the target domain weakly enhanced sample conditional probability distribution, calculating the class confusion loss with the minimum class confusion loss L_MCC;
c18) Computing the total training loss, and adjusting the network parameters of the feature extraction network using a gradient descent algorithm.
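Step c18) can be sketched as a single optimization update; `train_step` is a hypothetical helper that receives the four already-computed loss tensors, and the use of an SGD optimizer is an assumption:

```python
import torch

def train_step(optimizer, losses, alpha=1.0, beta=1.0):
    """One gradient descent update on the total loss
    L = L_CE + L_MMD + alpha * L_PCE + beta * L_MCC."""
    l_ce, l_mmd, l_pce, l_mcc = losses
    total = l_ce + l_mmd + alpha * l_pce + beta * l_mcc
    optimizer.zero_grad()
    total.backward()   # backpropagate through the feature extraction networks
    optimizer.step()   # adjust the network parameters
    return total.item()
```

The weights α and β let the pseudo label consistency and class confusion terms be rebalanced against the supervised and migration losses.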
After the training of the feature extraction network is completed, the feature vectors of the remote sensing images are extracted by the feature extraction network trained through the above steps.
Further, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, as shown in fig. 4, the step of obtaining the remote sensing image of the target domain is as follows:
c21) Extracting the query feature vector with the trained feature extraction network;
c22) Calculating the Euclidean distance between the query feature vector and the feature vector of each remote sensing image one by one;
c23) Sorting the remote sensing image feature vectors in ascending order of Euclidean distance, and taking the remote sensing image feature vectors ranked within a set order (for example the first K, where the value of K can be specified manually, e.g. the first 4) as the high-similarity images.
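Steps c21) to c23) amount to a nearest-neighbour search over the feature library; `retrieve_top_k` is a hypothetical helper name, not the patent's code:

```python
import torch

def retrieve_top_k(query_vec, feature_library, k=4):
    """Rank library feature vectors by Euclidean distance to the query and
    return the indices of the k most similar remote sensing images."""
    dists = torch.cdist(query_vec.unsqueeze(0), feature_library).squeeze(0)
    return torch.argsort(dists)[:k]  # ascending distance = descending similarity
```

For a large feature library the same ranking can be served by an approximate nearest-neighbour index, but the exhaustive distance computation above matches the one-by-one comparison described in c22).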
Further, in an embodiment of the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning provided by the present invention, the construction of the input data and the training of the triplet convolutional neural network are implemented with the PyTorch library of the Python language. In addition, simulation experiments on retrieval systems such as ADDA (Adversarial Discriminative Domain Adaptation), AFN (Adaptive Feature Norm), BSP (Batch Spectral Penalization), CDAN (Conditional Adversarial Domain Adaptation) and DAN (Deep Adaptation Network) were also performed for comparison with the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning of the present invention. The invention adopts the Average Normalized Modified Retrieval Rank (ANMRR), the mean average precision (mAP) and P@K (the retrieval precision over the first K returned images) to evaluate the results; higher mAP and P@K values indicate better retrieval performance, and a smaller ANMRR value is better. The comparison results are shown in Table 1 below:
[Table 1 is rendered as an image in the original document; it reports the ANMRR, mAP and P5–P100 values of ADDA, AFN, BSP, CDAN, DAN and the proposed method.]

TABLE 1
The results in Table 1 show that the remote sensing image domain adaptive retrieval method based on pseudo label consistency learning obtains the highest retrieval precision: compared with the comparison methods, its mean average precision mAP is improved by 20.04% to 28.96%, and its average normalized modified retrieval rank ANMRR is also better (i.e., smaller). In addition, the retrieval precision of the method from P5 to P100 is superior to that of the comparison methods. In conclusion, the remote sensing image domain adaptive retrieval method based on pseudo label consistency learning can improve the retrieval capability for query images of the target domain in the case where the target domain is unlabeled.
According to the technical scheme, in the remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning, the input data are first constructed, comprising source domain data and target domain data, wherein the source domain data are labeled and the target domain data are unlabeled, and the constructed input data are correspondingly enhanced. Then, the enhanced source domain data and target domain data are input into the corresponding feature extraction networks and classifiers, the output results are compared with the input data, and the pseudo label consistency loss is established; a loss function is thereby obtained based on the pseudo label consistency loss, and the network parameters of the feature extraction networks can be adjusted based on this loss function. In this way, the influence of the distribution difference between the target domain and the source domain on the feature extraction network can be reduced, so that the trained triplet convolutional neural network has better retrieval accuracy and a better retrieval effect when retrieving unlabeled samples.
The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited thereto. Within the scope of the technical idea of the invention, numerous simple modifications can be made to the technical solution of the invention, including any suitable combination of specific features, and in order to avoid unnecessary repetition, the invention will not be described in detail in relation to the various possible combinations. Such simple modifications and combinations should be considered within the scope of the present disclosure as well.

Claims (10)

1. The remote sensing image domain adaptive retrieval method based on pseudo label consistency learning is characterized by comprising the following steps of:
a) Acquiring a remote sensing image;
b) Constructing input data, a triple convolution neural network and a loss function;
c) And performing optimization training on the triple convolutional neural network by using the input data and combining the loss function, extracting the remote sensing image feature vector of the remote sensing image in the target domain by using the trained triple convolutional neural network to form a feature library vector, extracting a query feature vector queried by a user, and comparing the query feature vector with the remote sensing image feature vector in the feature library vector to obtain the remote sensing image feature vector in a set similarity rank.
2. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 1, wherein the construction step of the input data comprises the following steps:
b11) Constructing source domain data D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s} containing n_s samples, and target domain data D_t = {x_j^t}_{j=1}^{n_t} containing n_t samples, wherein x_i^s denotes a labeled sample of the source domain data, x_j^t denotes an unlabeled sample of the target domain data, y_i^s denotes the label corresponding to the labeled sample x_i^s of the source domain data, y_i^s ∈ {1, …, C}, and C is the number of image categories;
b12) For the unlabeled samples x_j^t of the target domain data, obtaining target domain weakly enhanced samples x_j^{t,w} by using flipping and shifting data enhancement transformations, and generating severely distorted target domain strongly enhanced samples x_j^{t,s} from the unlabeled samples x_j^t of the target domain data by using a random enhancement method;
b13) Obtaining source domain weakly enhanced samples from the labeled samples x_i^s of the source domain data by using the flipping and shifting data enhancement transformations.
3. The remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning of claim 2, wherein the triple convolution neural network comprises a feature extraction part and a classification part, wherein the feature extraction part comprises a plurality of feature extraction networks, the structures and the parameters of the feature extraction networks are the same, the classification part comprises a plurality of classifiers, and the structures and the parameters of the classifiers are the same.
4. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 3, wherein the feature extraction network is composed of a convolutional neural network pre-trained on the ImageNet data set.
5. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 3, wherein the classifier is a single-layer fully connected network for predicting the likelihood that the input data belongs to different categories.
6. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 5, wherein the output dimension of the classifier is consistent with the number of categories of the input data.
7. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 6, wherein the construction step of the loss function comprises the following steps:
b31) Constructing a classification loss: performing supervised learning on the source domain data, and constructing the classification loss of the source domain data based on the cross-entropy loss:

L_CE = −(1/n_s) Σ_{i=1}^{n_s} Σ_{c=1}^{C} y_{ic}^s log p(y = c|x_i^s)

wherein L_CE is the classification loss function, the function p(·|·) represents the probability distribution predicted by the classifier, x^s is the collection of the labeled samples x_i^s of the source domain data, y_{ic}^s is the one-hot label of the ith labeled sample, and p(y = c|x_i^s) represents the probability that the labeled sample x_i^s of the source domain data belongs to the cth class;
b32) Constructing the migration loss L_MMD based on the maximum mean discrepancy, which measures the similarity of different feature distributions:

L_MMD = ‖ (1/n_s) Σ_{i=1}^{n_s} φ(f_i^s) − (1/n_t) Σ_{j=1}^{n_t} φ(f_j^{t,s}) ‖_H^2

wherein i indexes the ith labeled sample x_i^s of the source domain data, n_s represents the total number of labeled samples of the source domain data, f_i^s represents the feature of the ith labeled sample of the source domain data, n_t represents the total number of samples of the target domain data, j indexes the jth target domain strongly enhanced sample, and φ is a mapping function that projects the samples of the source domain data into a high-dimensional Hilbert space H; in the Hilbert space H, the sample mean of the source domain data and the sample mean of the target domain are calculated as the measure of the domain difference;
b33) Constructing the pseudo label consistency loss to obtain the consistency-regularization-enhanced pseudo label classification loss L_PCE:

L_PCE = (1/(μB)) Σ_{b=1}^{μB} H(p̂(y|x_b^w), q(y|x_b^s))

wherein B represents the number of samples selected in one training iteration, μ is the proportion of those samples that satisfy the set threshold screening requirement, the function H(·,·) represents the cross-entropy between two probability distributions, x^w represents the collection of the target domain weakly enhanced samples, p̂(y|x_b^w) represents, after the pseudo label screening, the probabilities of x_b^w being predicted as the different classes, and q(y|x_b^s) is the conditional probability distribution of the corresponding strongly enhanced sample;
b34) Constructing the minimum class confusion loss: the weight of each sample is determined from the value of the class probability distribution entropy of the target domain data, a class confusion matrix is calculated from the weighted samples of the target domain data, and the minimum class confusion loss is combined to maximize the inter-class difference of the target domain data; the probability distribution entropy H(ŷ_j) is:

H(ŷ_j) = −Σ_{c=1}^{C} ŷ_{jc} log ŷ_{jc}
wherein x_j^t represents the jth sample in the target domain data, and ŷ_{jc} represents the probability that the jth sample in the target domain data belongs to the cth class; the weight of the samples of the target domain data is defined as:

W_j = B(1 + e^{−H(ŷ_j)}) / Σ_{j′=1}^{B} (1 + e^{−H(ŷ_{j′})})
wherein W_j represents the weight of the jth sample in the target domain data and scales that sample's contribution; the W_j form the diagonal matrix W, and the class confusion matrix M_{cc′} is defined based on the diagonal matrix W as:

M_{cc′} = ŷ_{·c}ᵀ W ŷ_{·c′}
wherein ŷ_{·c} represents the probabilities that all of the samples selected in one training iteration belong to the cth class, and the minimum class confusion loss L_MCC is defined as:

L_MCC = (1/C) Σ_{c=1}^{C} Σ_{c′≠c} M̃_{cc′}

wherein M̃_{cc′} = M_{cc′} / Σ_{c″=1}^{C} M_{cc″} is the row-normalized class confusion matrix;
b35) The constructed loss function is:

L = L_CE + L_MMD + αL_PCE + βL_MCC
wherein α and β are parameters that balance the optimization targets of the triplet convolutional neural network.
8. The remote sensing image domain adaptation retrieval method based on pseudo-label consistency learning of claim 7, wherein the step of training the triplet convolutional neural network by the input data comprises:
c11) Inputting the target domain weakly enhanced samples x^{t,w}, the target domain strongly enhanced samples x^{t,s} and the labeled samples x^s of the source domain data into the corresponding feature extraction networks respectively, to obtain the target domain weakly enhanced sample features f^{t,w}, the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s;
c12) Performing a distribution difference measurement between the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s to calculate the migration loss L_MMD;
c13) Inputting the target domain weakly enhanced sample features f^{t,w}, the target domain strongly enhanced sample features f^{t,s} and the labeled source domain sample features f^s into the classifiers, to convert them into the target domain weakly enhanced sample conditional probability distribution q(y|x^{t,w}), the target domain strongly enhanced sample conditional probability distribution q(y|x^{t,s}) and the labeled source domain sample conditional probability distribution p(y|x^s);
c14) Calculating the classification loss of the source domain data from the labeled source domain sample conditional probability distribution p(y|x^s) based on the classification loss function L_CE;
c15) For the target domain weakly enhanced sample conditional probability distribution q(y|x^{t,w}), retaining as the pseudo label the class label whose maximum class probability is higher than the set probability distribution threshold τ:

p̂_b = argmax_c q(y = c|x_b^w), retained only if max_c q(y = c|x_b^w) > τ

wherein p̂_b represents the class at which the target domain weakly enhanced sample conditional probability distribution attains the maximum probability that satisfies the set threshold screening condition;
c16) Using the pseudo labels p̂_b generated from the target domain weakly enhanced samples as the supervision information for the corresponding target domain strongly enhanced samples, and calculating the pseudo label consistency loss with the pseudo label classification loss L_PCE;
c17) For the target domain weakly enhanced sample conditional probability distribution, calculating the class confusion loss with the minimum class confusion loss L_MCC;
c18) Computing the total training loss, and adjusting the network parameters of the feature extraction network using a gradient descent algorithm.
9. The remote sensing image domain adaptive retrieval method based on pseudo-label consistency learning of claim 8, wherein the remote sensing image feature vector is obtained through the trained feature extraction network.
10. The remote sensing image domain adaptive retrieval method based on pseudo label consistency learning of claim 9 is characterized in that the step of obtaining the remote sensing image of the target domain is as follows:
c21) Extracting the query feature vector with the trained feature extraction network;
c22) Calculating the Euclidean distance between the query feature vector and each remote sensing image feature vector one by one;
c23) Sorting the remote sensing image feature vectors in ascending order of the Euclidean distances, and taking the remote sensing image feature vectors ranked within a set order as the high-similarity images.
CN202210729817.7A 2022-06-24 2022-06-24 Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning Active CN115292532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210729817.7A CN115292532B (en) 2022-06-24 2022-06-24 Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210729817.7A CN115292532B (en) 2022-06-24 2022-06-24 Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning

Publications (2)

Publication Number Publication Date
CN115292532A true CN115292532A (en) 2022-11-04
CN115292532B CN115292532B (en) 2024-03-15

Family

ID=83821188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210729817.7A Active CN115292532B (en) 2022-06-24 2022-06-24 Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning

Country Status (1)

Country Link
CN (1) CN115292532B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188995A (en) * 2023-04-13 2023-05-30 国家基础地理信息中心 Remote sensing image feature extraction model training method, retrieval method and device
CN117456312A (en) * 2023-12-22 2024-01-26 华侨大学 Simulation anti-fouling pseudo tag enhancement method for unsupervised image retrieval

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106980641A (en) * 2017-02-09 2017-07-25 上海交通大学 The quick picture retrieval system of unsupervised Hash and method based on convolutional neural networks
CN112131967A (en) * 2020-09-01 2020-12-25 河海大学 Remote sensing scene classification method based on multi-classifier anti-transfer learning
CN112699892A (en) * 2021-01-08 2021-04-23 北京工业大学 Unsupervised field self-adaptive semantic segmentation method
CN113190699A (en) * 2021-05-14 2021-07-30 华中科技大学 Remote sensing image retrieval method and device based on category-level semantic hash
US20210390686A1 (en) * 2020-06-15 2021-12-16 Dalian University Of Technology Unsupervised content-preserved domain adaptation method for multiple ct lung texture recognition
CN113889228A (en) * 2021-09-22 2022-01-04 武汉理工大学 Semantic enhanced Hash medical image retrieval method based on mixed attention
CN114549909A (en) * 2022-03-03 2022-05-27 重庆邮电大学 Pseudo label remote sensing image scene classification method based on self-adaptive threshold

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106980641A (en) * 2017-02-09 2017-07-25 上海交通大学 The quick picture retrieval system of unsupervised Hash and method based on convolutional neural networks
US20210390686A1 (en) * 2020-06-15 2021-12-16 Dalian University Of Technology Unsupervised content-preserved domain adaptation method for multiple ct lung texture recognition
CN112131967A (en) * 2020-09-01 2020-12-25 河海大学 Remote sensing scene classification method based on multi-classifier anti-transfer learning
CN112699892A (en) * 2021-01-08 2021-04-23 北京工业大学 Unsupervised field self-adaptive semantic segmentation method
CN113190699A (en) * 2021-05-14 2021-07-30 华中科技大学 Remote sensing image retrieval method and device based on category-level semantic hash
CN113889228A (en) * 2021-09-22 2022-01-04 武汉理工大学 Semantic enhanced Hash medical image retrieval method based on mixed attention
CN114549909A (en) * 2022-03-03 2022-05-27 重庆邮电大学 Pseudo label remote sensing image scene classification method based on self-adaptive threshold

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188995A (en) * 2023-04-13 2023-05-30 国家基础地理信息中心 Remote sensing image feature extraction model training method, retrieval method and device
CN116188995B (en) * 2023-04-13 2023-08-15 国家基础地理信息中心 Remote sensing image feature extraction model training method, retrieval method and device
CN117456312A (en) * 2023-12-22 2024-01-26 华侨大学 Simulation anti-fouling pseudo tag enhancement method for unsupervised image retrieval
CN117456312B (en) * 2023-12-22 2024-03-12 华侨大学 Simulation anti-fouling pseudo tag enhancement method for unsupervised image retrieval

Also Published As

Publication number Publication date
CN115292532B (en) 2024-03-15

Similar Documents

Publication Publication Date Title
CN111191732B (en) Target detection method based on full-automatic learning
CN110598029B (en) Fine-grained image classification method based on attention transfer mechanism
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN106909924B (en) Remote sensing image rapid retrieval method based on depth significance
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN113190699B (en) Remote sensing image retrieval method and device based on category-level semantic hash
Lin et al. RSCM: Region selection and concurrency model for multi-class weather recognition
CN110909820A (en) Image classification method and system based on self-supervision learning
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN115292532A (en) Remote sensing image domain adaptive retrieval method based on pseudo label consistency learning
CN112668579A (en) Weak supervision semantic segmentation method based on self-adaptive affinity and class distribution
CN109948742A (en) Handwritten form picture classification method based on quantum nerve network
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN115471739A (en) Cross-domain remote sensing scene classification and retrieval method based on self-supervision contrast learning
CN112434628A (en) Small sample polarization SAR image classification method based on active learning and collaborative representation
CN113806580B (en) Cross-modal hash retrieval method based on hierarchical semantic structure
CN112597324A (en) Image hash index construction method, system and equipment based on correlation filtering
CN114863091A (en) Target detection training method based on pseudo label
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN114579794A (en) Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion
CN114357221A (en) Self-supervision active learning method based on image classification
CN114329031A (en) Fine-grained bird image retrieval method based on graph neural network and deep hash
CN113065520A (en) Multi-modal data-oriented remote sensing image classification method
WO2024082374A1 (en) Few-shot radar target recognition method based on hierarchical meta transfer
CN111079840A (en) Complete image semantic annotation method based on convolutional neural network and concept lattice

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant