CN114998602A - Domain adaptive learning method and system based on low confidence sample contrast loss - Google Patents


Info

Publication number
CN114998602A
CN114998602A (application CN202210942337.9A)
Authority
CN
China
Prior art keywords
image
enhanced view
domain
features
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210942337.9A
Other languages
Chinese (zh)
Other versions
CN114998602B (en)
Inventor
王子磊
张燚鑫
贺伟男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202210942337.9A
Publication of CN114998602A
Application granted
Publication of CN114998602B
Legal status: Active
Anticipated expiration

Classifications

    • All classifications fall under G (Physics), G06 (Computing; Calculating or Counting), G06V (Image or Video Recognition or Understanding), G06V10/00 (Arrangements for image or video recognition or understanding):
    • G06V10/40 Extraction of image or video features
    • G06V10/242 Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G06V10/26 Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V10/764 Recognition or understanding using classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/806 Fusion of extracted features at the sensor, preprocessing, feature-extraction or classification level
    • G06V10/82 Recognition or understanding using neural networks

Abstract

The invention discloses a domain adaptive learning method and system based on low-confidence sample contrast loss. Using contrast learning, the method fully exploits low-confidence target-domain samples on top of existing domain adaptation methods that use only high-confidence target-domain samples, preventing the image classification model from achieving a suboptimal domain transfer effect by being biased toward target-domain samples that resemble the source domain. In contrast learning, the original image features are re-represented, so that task-specific semantic information is encoded better. In addition, cross-domain mixing is applied to low-confidence samples such that the target-domain low-confidence samples dominate the mixture, which reduces the domain difference and lets the image classification model better learn domain-invariant features. Overall, by exploiting low-confidence samples, the method and system improve the accuracy of both unsupervised and semi-supervised domain adaptive image classification.

Description

Domain adaptive learning method and system based on low confidence sample contrast loss
Technical Field
The invention relates to the field of image classification, in particular to a domain adaptive learning method and system based on low confidence sample contrast loss.
Background
In recent years, deep neural networks have proved highly effective on various machine learning problems; however, their superior performance largely relies on large, high-quality labeled datasets, and the high time and labor costs make manually labeling such datasets impractical. Due to domain shift, traditional deep learning methods also do not generalize well to new datasets. In this regard, domain adaptation uses knowledge learned on a source domain with many labeled samples to assist model learning on a related but unlabeled target domain, saving labeling cost by reducing the domain shift. Domain adaptation can be divided into unsupervised and semi-supervised domain adaptation according to whether the target-domain samples carry labels.
A common approach to handling domain shift is to make the model learn domain-invariant features. Existing domain adaptation methods are typically based on inter-domain difference metrics or on adversarial training. Chinese patent application CN113011456A, "unsupervised domain adaptation method based on class adaptive model for image classification", builds a domain-transferable encoder from a self-attention module and a cross-attention module to achieve intra-domain and inter-domain alignment, and builds a class-adaptive decoder to reduce domain differences through class prototype learning and alignment. Chinese patent application CN113011523A, "unsupervised depth domain adaptation method based on distributed countermeasure", performs feature distribution matching on the fully connected layers of a classifier, uses MK-MMD (multi-kernel maximum mean discrepancy) to measure the feature distribution difference between the domains, and builds two fully connected networks after the convolutional layers as domain discriminators for domain adversarial training. Chinese patent application CN113673555A uses a neural network model to extract features of the pictures in a dataset, uses a clustering algorithm and a memory to store source-domain and target-domain features class by class, and trains the neural network with the similarity between the source-domain and target-domain memory distributions as a constraint.
Chinese patent application CN113283489A, "classification method for semi-supervised domain adaptive learning based on joint distribution matching", measures the difference between the source and target sample distributions with a preset kernel-based algorithm and draws the joint distributions of the target and source domains closer. Chinese patent application CN113378632A, "unsupervised domain adaptive pedestrian re-identification algorithm based on pseudo label optimization", uses an auxiliary classifier structure and computes the KL divergence (relative entropy) between the class prediction vectors output by the auxiliary and main classifier structures to obtain more reliable pseudo labels. Chinese patent application CN113610105A, "unsupervised domain adaptive image classification method based on dynamic weighted learning and meta learning", optimizes the network model parameters by weighting the samples, dynamically adjusting the weights of the domain alignment loss and the classification loss, and computing both losses through meta learning, thereby promoting optimization consistency between the domain alignment task and the classification task.
However, existing domain adaptation methods, on the one hand, do not explore the inherent structure of the unlabeled target domain; on the other hand, they use various criteria to screen out high-confidence samples while completely ignoring low-confidence samples. Without the ignored low-confidence samples, the retained data cannot reflect the structure of the real target-domain distribution, so the image classification model is biased toward high-confidence samples and its classification accuracy after domain adaptive learning is poor.
Disclosure of Invention
The invention aims to provide a domain adaptive learning method and system based on low confidence sample contrast loss, which are used for performing contrast learning by using low confidence samples and are beneficial to improving the classification accuracy of an image classification model after domain adaptive learning.
The purpose of the invention is realized by the following technical scheme:
a domain adaptive learning method based on low confidence sample contrast loss comprises the following steps:
screening out a low-confidence sample set from the target domain image set according to a set threshold;
for each low-confidence sample image, obtaining two different enhanced view images, namely a first enhanced view image and a second enhanced view image, by using a data enhancement method; randomly selecting a source domain sample image from a source domain image set and obtaining two different enhanced view images, namely a third enhanced view image and a fourth enhanced view image, by using the data enhancement method;
mixing the first enhanced view image and the third enhanced view image to form a query image, inputting the query image into a first image classification model, and performing image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; inputting the second enhanced view image and the fourth enhanced view image into a second image classification model, and performing image feature extraction and re-representation through the second image classification model to obtain the corresponding re-representation features; blending the re-representation feature of the second enhanced view image with that of the fourth enhanced view image to form a blended re-representation feature;
and taking the first re-representation feature as a query feature, taking all the rest re-representation features as comparison features, constructing a comparison loss by using the difference between the query feature and each comparison feature, and constructing a total loss function by combining the basic loss of the first image classification model to train the first image classification model.
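The exact form of the contrast loss is not fixed by the steps above. As a hedged sketch, one common instantiation is an InfoNCE-style loss in which the query feature is pulled toward one designated contrast feature (here assumed to be the blended re-representation feature of the same sample) and pushed away from the remaining ones; the function name, the positive/negative pairing, and the temperature value are illustrative assumptions, not the patent's definitive formulation:

```python
import numpy as np

def contrastive_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE-style loss: pull query toward the positive feature,
    push it away from negative features.

    query:     (d,)   re-representation feature of the mixed query image
    positive:  (d,)   assumed positive (e.g. the blended re-representation feature)
    negatives: (n, d) remaining re-representation features used for contrast
    """
    logits = np.concatenate([[query @ positive], negatives @ query]) / temperature
    logits -= logits.max()  # shift for numerical stability
    # negative log-probability of the positive among all contrast features
    return -(logits[0] - np.log(np.exp(logits).sum()))
```

The loss is near zero when the query matches the positive and is dissimilar to all negatives, and grows as a negative feature becomes more similar to the query than the positive is.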
A low confidence sample contrast loss based domain adaptive learning system, comprising:
the low confidence sample set generating unit is used for screening out a low confidence sample set from the target domain image set according to a set threshold;
an enhanced view image generation unit, configured to, for each low confidence sample image, obtain two different enhanced view images, referred to as a first enhanced view image and a second enhanced view image, using a data enhancement method, randomly select a source domain sample image from a source domain image set, and obtain two different enhanced view images, referred to as a third enhanced view image and a fourth enhanced view image, using the data enhancement method;
the re-representation feature acquisition unit is used for mixing the first enhanced view image and the third enhanced view image to form a query image, inputting the query image into a first image classification model, and performing image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; inputting the second enhanced view image and the fourth enhanced view image into a second image classification model, and performing image feature extraction and re-representation through the second image classification model to obtain the corresponding re-representation features; blending the re-representation feature of the second enhanced view image with that of the fourth enhanced view image to form a blended re-representation feature;
and the total loss function construction and model training unit is used for constructing a comparison loss by using the difference between the query feature and each comparison feature and training the first image classification model by combining the basic loss construction total loss function of the first image classification model by using the first re-representation feature as a query feature and all the rest re-representation features as comparison features.
A processing device, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the aforementioned methods.
A readable storage medium, storing a computer program which, when executed by a processor, implements the aforementioned method.
The technical scheme provided by the invention shows that: (1) by using contrast learning, low-confidence target-domain samples are fully exploited on top of existing domain adaptation methods that use only high-confidence target-domain samples, preventing the image classification model from achieving a suboptimal domain transfer effect by being biased toward target-domain samples that resemble the source domain; (2) in contrast learning, the original image features are re-represented, so that task-specific semantic information is encoded better; (3) cross-domain mixing is applied to low-confidence samples such that the target-domain low-confidence samples dominate the mixture, which reduces the domain difference and lets the image classification model better learn domain-invariant features. Overall, by exploiting low-confidence samples, the method and system improve the accuracy of both unsupervised and semi-supervised domain adaptive image classification.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram of a domain adaptive learning method based on low confidence sample contrast loss according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the average similarity between the same type of samples and different types of samples according to an embodiment of the present invention;
FIG. 3 is a flow chart of the calculation of contrast loss according to the embodiment of the present invention;
FIG. 4 is a process diagram of re-representation of features provided by an embodiment of the present invention;
fig. 5 is a flowchart of calculating cross entropy loss of KLD regularization terms and high confidence samples according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a domain adaptive learning system based on low confidence sample contrast loss according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
First, terms that may be used herein are explained as follows: the terms "comprising," "including," "containing," "having," or other similar terms of meaning should be construed as non-exclusive inclusions. For example: including a feature (e.g., material, component, ingredient, carrier, formulation, material, dimension, part, component, mechanism, device, process, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product, or article of manufacture), is to be construed as including not only the particular feature explicitly listed but also other features not explicitly listed as such which are known in the art.
The invention discloses a domain adaptive learning scheme utilizing low-confidence sample contrast loss, which aims to solve the problem of limited accuracy of the existing domain adaptive image classification method and is applicable to unsupervised domain adaptation (namely training data in a target domain are unlabeled) and semi-supervised domain adaptation (namely the training data in the target domain comprise a small part of labeled data and a large part of unlabeled data). The following describes a domain adaptive learning scheme based on low confidence sample contrast loss according to the present invention in detail. Details which are not described in detail in the embodiments of the invention belong to the prior art which is known to a person skilled in the art. Those not specifically mentioned in the examples of the present invention were carried out according to the conventional conditions in the art or conditions suggested by the manufacturer.
Example one
The embodiment of the invention provides a domain adaptive learning method based on low confidence sample contrast loss, which mainly comprises the following steps as shown in figure 1:
step 1, screening a low-confidence sample set from a target domain image set according to a set threshold value.
In the embodiment of the invention, sample images with low confidence are used for contrast learning. A sample image is judged to have low confidence if the maximum value of its output probability is smaller than a given threshold $\tau$; specifically, the output probability of the second image classification model is used.
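A minimal sketch of this screening step (the function name and the threshold value 0.95 are illustrative; the patent only specifies "a set threshold" on the maximum output probability of the second image classification model):

```python
import numpy as np

def split_by_confidence(probs, tau=0.95):
    """Split target-domain samples by the max of the model's output probability.

    probs: (n, k) softmax outputs of the second (teacher) image classification model.
    Returns boolean masks (low_conf, high_conf); tau is an assumed value.
    """
    conf = probs.max(axis=1)        # maximum class probability per sample
    low = conf < tau                # low confidence: below the set threshold
    return low, ~low
```

The low-confidence mask selects the sample set used for the contrast loss; the high-confidence mask corresponds to the samples used by the basic (pseudo-label) losses.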
Through preliminary experiments, the invention finds that the average similarity $s_{same}$ between low-confidence sample images belonging to the same class is low, while the average similarity $s_{diff}$ between low-confidence sample images belonging to different classes is higher, as shown in fig. 2. The two average similarities are defined as:

$$s_{same} = \mathbb{E}_{x_i, x_j \sim \mathcal{D},\, y_i = y_j}\left[ f_i^{T} f_j \right]$$

$$s_{diff} = \mathbb{E}_{x_i, x_j \sim \mathcal{D},\, y_i \neq y_j}\left[ f_i^{T} f_j \right]$$

wherein $x_i$ and $x_j$ represent two low-confidence sample images screened from the target domain image set, $y_i$ and $y_j$ are their class labels ($y_i = y_j$ meaning the two images belong to the same class and $y_i \neq y_j$ to different classes), and $f_i$ and $f_j$ are their image features; $\mathbb{E}$ is the mathematical expectation and $T$ denotes the transpose. The sampling set $\mathcal{D}$ can be $\mathcal{D}_t$ (the unsupervised target domain), $\mathcal{D}_t^{l}$ or $\mathcal{D}_t^{h}$ (the superscripts indicating low and high confidence, respectively). From this result, it is reasonable to apply the contrast loss only to low-confidence samples, since this reduces the adverse effect of samples of the same class being treated as negatives in the contrast loss.
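The two expectations above can be estimated empirically over a batch of features. A small sketch (function name illustrative; features are assumed L2-normalised so the inner product is the cosine similarity):

```python
import numpy as np

def average_similarities(feats, labels):
    """Mean pairwise inner-product similarity within and across classes.

    feats:  (n, d) L2-normalised image features of low-confidence samples
    labels: (n,)   class labels (pseudo or ground-truth)
    Returns (s_same, s_diff), the empirical versions of the two expectations.
    """
    sims = feats @ feats.T                         # all pairwise similarities
    same = labels[:, None] == labels[None, :]      # same-class indicator
    off_diag = ~np.eye(len(labels), dtype=bool)    # exclude self-pairs
    s_same = sims[same & off_diag].mean()
    s_diff = sims[~same].mean()
    return s_same, s_diff
```

On real data the observation reported above corresponds to s_same being low while s_diff is comparatively high for the low-confidence subset.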
Step 2, for each low-confidence sample image of the target domain, obtaining two different enhanced view images, called the first enhanced view image and the second enhanced view image, using a data enhancement method; randomly selecting a source domain sample image from the source domain image set and obtaining two different enhanced view images, called the third enhanced view image and the fourth enhanced view image, using the data enhancement method.
This step mainly processes each low-confidence sample image (a target-domain image) and a source-domain sample image to obtain different enhanced views, which serve as the basic data for low-confidence sample contrast learning; for the data enhancement method itself, reference can be made to conventional techniques, which are not detailed here.
Step 3, mixing the first enhanced view image and the third enhanced view image to form a query image, inputting the query image into a first image classification model, and performing image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; inputting the second enhanced view image and the fourth enhanced view image into a second image classification model, and performing image feature extraction and re-representation through the second image classification model to obtain the corresponding re-representation features; blending the re-representation feature of the second enhanced view image with that of the fourth enhanced view image to form a blended re-representation feature.
This step is mainly to obtain re-representation characteristics of each image.
Because the existing contrast learning process only considers the feature-space structure of the target domain and ignores the domain difference, the embodiment of the invention proposes cross-domain hybrid contrast learning for learning domain-invariant features: the first enhanced view image and the third enhanced view image are mixed to serve as the query image. Moreover, the query image, the second enhanced view image and the fourth enhanced view image are processed by the two image classification models respectively to obtain the corresponding re-representation features, so that task-specific semantic information can be encoded better. Furthermore, the two re-representation features produced by the second image classification model are blended.
Step 4, taking the first re-representation feature as the query feature and all the remaining re-representation features as contrast features, constructing a contrast loss from the difference between the query feature and each contrast feature, and combining it with the basic loss of the first image classification model to construct a total loss function for training the first image classification model.
This step constructs the cross-domain hybrid contrast loss based on the structure built in the previous step, and trains the first image classification model in combination with the basic loss function.
In the embodiment of the invention, the first image classification model and the second image classification model have the same structure and respectively comprise a feature extractor, a re-representation module and a classifier. The model training in the embodiment of the invention mainly updates the parameters of the first image classification model, and then generates the parameters of the second image classification model by using Exponential Moving Average (EMA) according to the parameters of the first image classification model. The implementation of the feature extractor and the classifier can refer to the conventional technology, and the present invention is not described in detail.
In order to show the technical solutions and effects provided by the present invention more clearly, the domain adaptive learning method based on low-confidence sample contrast loss is described in detail below with specific embodiments. Since the overall domain adaptive learning involves two losses, namely the contrast loss and the basic loss, the calculation of the two losses is described first, followed by the total loss function. It should be noted that the specific model structures, frame formats, specific parameter values, etc. mentioned below are exemplary and not limiting.
First, the contrast loss.
1. Model structure.
As shown in fig. 3, the main flow of contrast learning with low-confidence samples is as follows. The left part shows the relevant images; the image content is only an example. A teacher-student architecture is adopted: the first image classification model acts as the student model and the second image classification model as the teacher model. As described earlier, the structures of the two image classification models are identical, but the parameters of the teacher model are generated by an exponential moving average (EMA) of the student model's parameters, with the decay coefficient set, for example, to 0.99. In addition, the input of the classifier in each image classification model is the original features (the calculation method is described later), and its output is mainly used for calculating the basic loss, so the classifier is not shown in fig. 3.
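A minimal sketch of the EMA update of the teacher (second) model from the student (first) model, with the decay coefficient 0.99 suggested in the text (the function name and the plain-float parameter dict are illustrative; a real implementation would iterate over the model's tensors):

```python
def ema_update(student_params, teacher_params, decay=0.99):
    """Exponential-moving-average update of the teacher model's parameters.

    student_params, teacher_params: dicts mapping parameter names to values
    (plain floats here for the sketch). The teacher moves a small step
    (1 - decay) toward the student on each training iteration.
    """
    for name, value in student_params.items():
        teacher_params[name] = decay * teacher_params[name] + (1 - decay) * value
    return teacher_params
```

Because the teacher is never updated by gradients directly, it provides a slowly changing, stable target for the contrast features.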
2. Contrast learning process.
The invention provides a contrast learning method fused with cross-domain mixing (Mixup). Its starting point is that low-confidence samples in the target domain have low similarity to source-domain samples, which is why they are difficult to classify correctly, while the existing contrast learning process only considers the feature-space structure of the target domain and ignores the domain difference. The invention therefore further provides cross-domain hybrid contrast learning for learning domain-invariant features. The contrast learning process is shown in fig. 3 and mainly includes:
Taking the $i$-th low-confidence sample image as an example, its first and second enhanced view images are denoted $x_i^{t,1}$ and $x_i^{t,2}$; the selected source domain sample image is denoted $x_j^{s}$, and its third and fourth enhanced view images are denoted $x_j^{s,1}$ and $x_j^{s,2}$.

The first enhanced view image $x_i^{t,1}$ and the third enhanced view image $x_j^{s,1}$ are mixed to serve as the query image. To ensure that the low-confidence target-domain sample is dominant in the mixture, the invention applies the max function to the mixing coefficient $\lambda$ and uses $\max(\lambda, 1-\lambda)$ as the new mixing coefficient. The cross-domain mixing is expressed as:

$$\lambda \sim Beta(\alpha, \alpha)$$
$$\lambda' = \max(\lambda,\, 1-\lambda)$$
$$x_i^{q} = \lambda'\, x_i^{t,1} + (1-\lambda')\, x_j^{s,1}$$

wherein $\lambda$ is the mixing coefficient, $\alpha$ is the parameter of the Beta distribution, and $x_i^{q}$ is the resulting query image.
The query image is input to the first image classification model, while the second enhanced view image $x_i^{t,2}$ and the fourth enhanced view image $x_j^{s,2}$ are input to the second image classification model. The right side of fig. 3 shows the processing flow: in each image classification model, feature extraction is first performed by the feature extractor, and the result is then processed by an L2-norm normalization function (L2 Norm) to obtain the corresponding image features, expressed as:

$$f_i^{q} = Norm\big(F(x_i^{q})\big)$$
$$f_i^{t,2} = Norm\big(F'(x_i^{t,2})\big)$$
$$f_j^{s,2} = Norm\big(F'(x_j^{s,2})\big)$$

wherein $F$ is the feature extractor in the first image classification model, $F'$ is the feature extractor in the second image classification model, $Norm(\cdot)$ is the L2-norm normalization function, and $f_i^{q}$, $f_i^{t,2}$ and $f_j^{s,2}$ are the normalized image features of the query image, the second enhanced view image and the fourth enhanced view image, respectively.
The image features obtained above are original features. To better encode task-specific semantic information, the present invention uses the classifier weights as class prototypes and treats them as a set of new coordinates with which the original features are re-represented. Fig. 4 shows the process of feature re-representation, expressed as:
$$\phi(z) = \mathrm{softmax}(C^{\mathrm{T}} z / \tau)$$
$$\phi'(z) = \mathrm{softmax}(C'^{\mathrm{T}} z / \tau)$$
where $C$ is the weight of the classifier in the first image classification model (not updated by the contrast loss) and $\mathrm{softmax}(\cdot)$ represents the softmax function; $z$ here stands for any of the image features above; $C'$ is the weight of the classifier in the second image classification model; T is the transpose symbol; $\tau$ is the temperature coefficient of the re-representation; $\phi(\cdot)$ and $\phi'(\cdot)$ are the functions used to re-represent the image features.
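The re-representation step can be sketched as follows. A minimal PyTorch illustration, assuming L2-normalized inputs; the function name and the default `tau` are illustrative.

```python
import torch

def re_represent(z: torch.Tensor, prototypes: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Re-represent features in the coordinate system of class prototypes.

    z:          (batch, dim) L2-normalized image features
    prototypes: (num_classes, dim) classifier weight rows used as class prototypes
    tau:        re-representation temperature

    Returns (batch, num_classes) re-representation features: a softmax over
    the similarities between each feature and every class prototype.
    """
    logits = z @ prototypes.t() / tau  # C^T z scaled by temperature
    return torch.softmax(logits, dim=-1)
```

Because the output lives on the probability simplex over classes, the subsequent contrast loss compares samples in a task-aware coordinate system rather than in the raw feature space.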
Substituting the query-image feature $z$ into the first expression, and the image features $z_i^{t,2}$ and $z^{s,2}$ into the second expression, yields the corresponding re-representation features $\hat{z}$, $\hat{z}_i^{t,2}$ and $\hat{z}^{s,2}$, namely:
$$\hat{z} = \phi(z)$$
$$\hat{z}_i^{t,2} = \phi'(z_i^{t,2})$$
$$\hat{z}^{s,2} = \phi'(z^{s,2})$$
where $\hat{z}$ is the first re-representation feature; $\hat{z}_i^{t,2}$ is the re-representation feature corresponding to the image feature $z_i^{t,2}$; $z^{s,2}$ is the image feature obtained by normalizing the feature of the fourth enhanced view image with the L2-norm normalization function, and $\hat{z}^{s,2}$ is its corresponding re-representation feature.
After the re-representation features $\hat{z}_i^{t,2}$ and $\hat{z}^{s,2}$ are obtained, they are mixed, expressed as:
$$\hat{z}_{mix} = \lambda'\, \hat{z}_i^{t,2} + (1-\lambda')\, \hat{z}^{s,2}$$
where $\hat{z}_{mix}$ is the mixed re-representation feature and $\lambda'$ is the mixing coefficient used when the first enhanced view image was mixed with the third enhanced view image.
In the embodiment of the invention, $\hat{z}$ serves as the query feature, all re-representation features other than $\hat{z}$ serve as contrast features, and the contrast loss of the cross-domain mixture is constructed, expressed as:
$$\mathcal{L}_{ctr} = -\log \frac{S^{+}}{S^{+} + \sum_{\hat{z}^{-} \in M} \exp(\mathrm{sim}(\hat{z}, \hat{z}^{-}))}, \qquad S^{+} = \exp(\mathrm{sim}(\hat{z}, \hat{z}_{mix})) + \exp(\mathrm{sim}(\hat{z}, \hat{z}_i^{t,2})) + \exp(\mathrm{sim}(\hat{z}, \hat{z}^{s,2}))$$
where $\hat{z}$ is the query feature of the query image; $\hat{z}_{mix}$ is the mixed re-representation feature; $\hat{z}_i^{t,2}$ is the re-representation feature corresponding to the second enhanced view image; $\hat{z}^{s,2}$ is the re-representation feature corresponding to the fourth enhanced view image; $\hat{z}^{-}$ ranges over the memory bank $M$, which stores the re-representation features, obtained by the second image classification model, of the second enhanced view images of the other low-confidence sample images; $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity function, expressed as:
$$\mathrm{sim}(u, v) = \frac{u^{\mathrm{T}} v}{\lVert u \rVert \, \lVert v \rVert}$$
where $u$ and $v$ represent the two features of the cosine similarity function $\mathrm{sim}(\cdot,\cdot)$.
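One plausible realization of this contrast loss for a single query is sketched below. The positives (mixed, second-view and fourth-view re-representations) are pooled in the numerator and the memory-bank features serve as negatives; the patent's exact weighting is not recoverable from the text, so this is an InfoNCE-style reading, not the definitive formula.

```python
import torch
import torch.nn.functional as F

def contrast_loss(query, positives, memory_bank):
    """InfoNCE-style contrast loss for a single query.

    query:       (dim,) re-representation feature of the mixed query image
    positives:   list of (dim,) re-representation features treated as positives
    memory_bank: (M, dim) negative re-representation features

    Uses cosine similarity, matching the sim(.,.) of the text.
    """
    pos = torch.stack([F.cosine_similarity(query, p, dim=0) for p in positives]).exp().sum()
    neg = F.cosine_similarity(query.unsqueeze(0), memory_bank, dim=1).exp().sum()
    return -torch.log(pos / (pos + neg))
```

The loss pulls the query toward its own mixed and unmixed views while pushing it away from other low-confidence samples stored in the bank.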
The core technical points of the present invention can be summarized from three levels: (1) contrast learning is performed using low confidence samples. (2) The input features of contrast loss need to be re-represented. (3) The cross-domain Mixup technology is merged on the basis of contrast learning.
Moreover, the method mainly obtains the following beneficial effects: (1) On top of existing domain adaptation methods that use only high-confidence target-domain samples, low-confidence target-domain samples are fully exploited, preventing the suboptimal domain-transfer effect that arises when the model is biased toward target-domain samples that resemble the source domain. (2) The classifier weights are used to re-represent the original features rather than being used directly, so that task-specific semantic information is better encoded. (3) Cross-domain mixing is applied to low-confidence samples, with the low-confidence samples dominant in the mixture, which reduces the domain gap and lets the model better learn domain-invariant features. In general, the method exploits low-confidence samples and improves the accuracy of both unsupervised and semi-supervised domain-adaptive image classification.
Second, the base loss.
To complete the optimization function, the associated base losses are introduced below. First, there is a cross-entropy loss $L_{ce}$ on the labeled samples and a loss $L_{align}$ for cross-domain feature alignment. On this basis, a semi-supervised learning algorithm based on the pseudo-label technique (FixMatch) is added to strengthen the learning process on high-confidence samples, which improves prediction consistency and provides reliable pseudo labels; at the same time a KLD (Kullback-Leibler divergence) regularization term $L_{kld}$ on high-confidence samples is introduced, together with the cross-entropy loss $L_{fm}$ of high-confidence samples after using FixMatch. Thus, the base loss is expressed as:
$$\mathcal{L}_{base} = \mathbb{E}_{x_l \in D_l}\, L_{ce}(x_l) + w_1\, \mathbb{E}_{x \in D}\, L_{align}(x) + \mathbb{E}_{x_h \in D_h}\, L_{fm}(x_h) + w_2\, \mathbb{E}_{x_h \in D_h}\, L_{kld}(x_h)$$
where $D_l$ is the labeled image set and $x_l$ a single labeled image; the labeled image set $D_l$ corresponds to the source domain in unsupervised domain adaptation, and to the source domain plus the labeled part of the target domain in semi-supervised domain adaptation (i.e., it contains all labeled images in the source domain image set and the target domain image set); $D$ is the union of $D_l$ and the target domain image set $D_t$, and $x$ is a single image of $D$; $D_h$ denotes the high-confidence sample set and $x_h$ a single high-confidence sample image; the high-confidence sample set is the set formed by the images remaining in the target domain image set after the low-confidence sample set is removed — specifically, the target-domain sample images whose maximum output probability from the second image classification model is greater than the threshold $\tau_0$; $w_1$ is the weight coefficient of the cross-domain feature alignment loss $L_{align}$, and $w_2$ is the weight coefficient of the KLD regularization term $L_{kld}$ on high-confidence samples.
Here, $L_{align}$ may be a loss calculated by any common domain adaptation method (e.g., the domain-discrepancy metric loss MMD, a domain adversarial loss, etc.); the present invention does not specifically limit it.
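As one concrete instance of such an alignment loss, a linear-kernel Maximum Mean Discrepancy can be sketched as below. This is only an illustrative choice among the options the text lists, not the patent's prescribed loss.

```python
import torch

def linear_mmd(source_feats: torch.Tensor, target_feats: torch.Tensor) -> torch.Tensor:
    """Linear-kernel Maximum Mean Discrepancy between two feature batches:
    the squared Euclidean distance between the per-domain mean features.

    source_feats, target_feats: (batch, dim) feature batches.
    """
    delta = source_feats.mean(dim=0) - target_feats.mean(dim=0)
    return delta.pow(2).sum()
```

Minimizing this pulls the mean feature of the target batch toward that of the source batch, which is the simplest form of cross-domain feature alignment.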
The cross-entropy loss $L_{ce}$ is also a conventional loss, of the form:
$$L_{ce}(x_l) = -\sum_{k=1}^{K} \mathbb{1}[k = y_l] \log p_k(x_l)$$
where $p_k(x_l)$ denotes the probability, output by the classifier of the first image classification model, that the labeled image $x_l$ belongs to class $k$; $y_l$ is the category label of the labeled image $x_l$; and $K$ is the number of categories. With the cosine-similarity-based classifier, $p_k(x_l)$ is expressed as:
$$p_k(x_l) = \frac{\exp(w_k^{\mathrm{T}} z_l / T_c)}{\sum_{j=1}^{K} \exp(w_j^{\mathrm{T}} z_l / T_c)}$$
where $w_k$ is the normalized classifier weight of class $k$, $z_l$ is the normalized feature of $x_l$, and $T_c$ is the temperature parameter of the classifier (set to a fixed value in the embodiment).
The KLD regularization term $L_{kld}$ on high-confidence samples and the cross-entropy loss $L_{fm}$ of high-confidence samples after using FixMatch are computed by a FixMatch model with a regularization term; the calculation process, shown in fig. 5, includes:
Let $\omega(x_h)$ and $\Omega(x_h)$ respectively denote two different enhanced view images of a single high-confidence sample image $x_h$ from the high-confidence sample set $D_h$ (the former a weakly enhanced view image, the latter a strongly enhanced view image).
$\omega(x_h)$ is input to the second image classification model (upper half of fig. 5); a second classification result is obtained through feature extraction and classification, and a pseudo label $\hat{y}$ is constructed. $\Omega(x_h)$ is input to the first image classification model (lower half of fig. 5); a first classification result is obtained through feature extraction and classification, the KLD regularization term $L_{kld}$ on high-confidence samples is calculated using the first classification result, and the cross-entropy loss $L_{fm}$ of high-confidence samples after using FixMatch is calculated using the first classification result and the corresponding pseudo label.
The KLD regularization term $L_{kld}$ on high-confidence samples and the cross-entropy loss $L_{fm}$ of high-confidence samples after using FixMatch are expressed as:
$$L_{kld}(x_h) = \mathbb{1}\big[\max_k p'_k(\omega(x_h)) > \tau_0\big] \sum_{j=1}^{K} \frac{1}{K} \log \frac{1/K}{p_j(\Omega(x_h))}$$
$$L_{fm}(x_h) = -\,\mathbb{1}\big[\max_k p'_k(\omega(x_h)) > \tau_0\big] \log p_{\hat{y}}(\Omega(x_h))$$
where $\mathbb{1}[\cdot]$ is the indicator function; $K$ denotes the number of categories; $p_j(\Omega(x_h))$ denotes the probability that the class output by the classifier of the first image classification model for the strongly enhanced view image $\Omega(x_h)$ is $j$, and $p_{\hat{y}}(\Omega(x_h))$ the probability that this class is $\hat{y}$; the pseudo label $\hat{y}$ is the class label corresponding to the maximum probability in the second classification result; and $\mathbb{1}[\max_k p'_k(\omega(x_h)) > \tau_0]$ indicates that the maximum probability predicted by the second image classification model is greater than the threshold $\tau_0$.
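The confidence-gated FixMatch pair of losses can be sketched as below. This is a minimal illustration: the threshold default is illustrative, and the KLD term is written as KL(uniform || student prediction), one common form of the regularizer, since the patent's exact expression is reconstructed from context.

```python
import torch
import torch.nn.functional as F

def fixmatch_losses(teacher_probs_weak, student_logits_strong, threshold=0.95):
    """Confidence-gated pseudo-label cross entropy plus a KLD regularizer.

    teacher_probs_weak:    (batch, K) teacher probabilities on the weak view
    student_logits_strong: (batch, K) student logits on the strong view
    threshold:             confidence gate on the teacher's max probability
    """
    conf, pseudo = teacher_probs_weak.max(dim=-1)
    mask = (conf > threshold).float()  # keep only high-confidence samples
    log_p = F.log_softmax(student_logits_strong, dim=-1)
    # Cross entropy of the strong view against the teacher's pseudo label.
    ce = -(mask * log_p.gather(-1, pseudo.unsqueeze(-1)).squeeze(-1)).mean()
    # KL(uniform || student prediction), gated by the same mask.
    k = log_p.size(-1)
    uniform = torch.full_like(log_p, 1.0 / k)
    kld = (mask * (uniform * (uniform.log() - log_p)).sum(dim=-1)).mean()
    return ce, kld
```

Both terms vanish for samples below the confidence gate, so only reliable pseudo labels drive the student.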
Third, the total loss function.
In the embodiment of the present invention, the total loss function is constructed by integrating the above contrast loss and base loss, expressed as:
$$\mathcal{L} = \mathcal{L}_{base} + \beta\, \mathbb{E}_{x \in D_s \cup D_{lc}}\, \mathcal{L}_{ctr}(x)$$
where $\mathcal{L}_{base}$ is the base loss, $\mathcal{L}_{ctr}$ is the contrast loss, $\beta$ is the weight coefficient of the contrast loss, and $\mathbb{E}$ is the mathematical expectation symbol; $D_s \cup D_{lc}$ is the union of the source domain image set $D_s$ and the low-confidence sample set $D_{lc}$, and $x$ is a single image of this union.
Based on the above scheme, an integrated training and testing procedure is introduced below; the main steps include:
Step 1, prepare the labeled training data set of the source domain, the training set of the target domain, and the test set. For the training set images of the source domain and the target domain, two enhancements are constructed online: a strong enhancement and a weak enhancement, where the strong enhancement adopts a random data enhancement method (RandAugment) and the weak enhancement adopts common random cropping and random horizontal flipping. After image processing, the image is scaled to 224 × 224 and then numerically normalized. The images obtained by strong and weak enhancement are the two different enhanced view images mentioned above; specifically, the first and third enhanced view images are constructed with the strong enhancement, and the second and fourth enhanced view images with the weak enhancement.
Step 2, establish the contrast learning method based on low-confidence samples using the PyTorch deep learning framework. The model consists of a teacher model and a student model with identical structure and initialization parameters; the student model is updated by gradient back-propagation, while the teacher model is an exponential moving average of the student model's parameters. The model structure adopts common image classification models, such as ResNet34 or ResNet50, with the classifier changed to a cosine-similarity-based computation. During contrast learning, an additional memory bank stores the features generated from the processed low-confidence target-domain samples; its capacity is 512, it is updated first-in first-out, and it is updated after the iteration of each batch of samples finishes.
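The two bookkeeping pieces of step 2 — the FIFO memory bank and the EMA teacher update — can be sketched as follows. Class and function names are illustrative; the capacity of 512 is taken from the described embodiment, while the momentum value is an assumption.

```python
import torch
from collections import deque

class FeatureMemoryBank:
    """First-in-first-out store of re-represented target-domain features
    (capacity 512 in the described embodiment)."""
    def __init__(self, capacity: int = 512):
        self.queue = deque(maxlen=capacity)  # oldest entries drop out first

    def update(self, feats: torch.Tensor) -> None:
        # Detach so stored features never carry gradient history.
        for f in feats.detach():
            self.queue.append(f)

    def tensor(self) -> torch.Tensor:
        return torch.stack(list(self.queue))

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, momentum: float = 0.999) -> None:
    """Teacher parameters track an exponential moving average of the student's."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)
```

In use, `bank.update(...)` is called once per batch after the student step, and `ema_update(teacher, student)` is called after each optimizer update, matching the first-in-first-out and moving-average behaviour described above.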
Step 3, input the source domain images to the student model and output prediction probabilities; perform supervised learning using the source-domain annotations, and compute the alignment loss $L_{align}$ using the training data of the source and target domains.
Step 4, for target-domain images, input the weakly enhanced image to the teacher model and the strongly enhanced image to the student model. Output the prediction probability and, according to the given threshold $\tau_0$, determine whether the input sample is a high-confidence sample (the maximum probability predicted by the teacher model is greater than the threshold); if so, following the FixMatch learning scheme, generate a pseudo label with the weakly enhanced image and use it to supervise the strongly enhanced image.
Step 5, mix the different enhanced images of a low-confidence sample with the enhanced versions of a randomly sampled source domain image, and input the mixed and enhanced samples to the student model and the teacher model, respectively. The output intermediate features pass through the re-representation module to generate new features. Take the first re-representation feature $\hat{z}$ as the query feature, and construct positive sample pairs for contrast learning with the mixed re-representation feature, the re-representation feature corresponding to the second enhanced view image, and the re-representation feature corresponding to the fourth enhanced view image, specifically the pairs $(\hat{z}, \hat{z}_{mix})$, $(\hat{z}, \hat{z}_i^{t,2})$ and $(\hat{z}, \hat{z}^{s,2})$; then construct the contrast learning loss with the features stored in the memory bank $M$ as negative samples, and update the student model.
Step 6, update the memory bank with the re-represented features of the target-domain low-confidence samples.
Step 7, accumulate the loss functions of steps 3 and 5, minimize them by the back-propagation algorithm and a gradient descent strategy to update the student model's weights, and update the teacher model's parameters from the student model's parameters.
Step 8, input the test data set and compute the classification accuracy of the student model.
Example two
The invention further provides a domain adaptive learning system based on low-confidence sample contrast loss, which is implemented mainly based on the method provided by the first embodiment, as shown in fig. 6, the system mainly includes:
the low confidence sample set generating unit is used for screening out a low confidence sample set from the target domain image set according to a set threshold;
the enhanced view image generation unit is used for obtaining, for each low-confidence sample image, two different enhanced view images, namely a first enhanced view image and a second enhanced view image, by using a data enhancement method; and for randomly selecting a source domain sample image from the source domain image set and obtaining two different enhanced view images, namely a third enhanced view image and a fourth enhanced view image, by using the data enhancement method;
the re-representation feature acquisition unit is used for mixing the first enhanced view image and the third enhanced view image to form a query image, inputting the query image into a first image classification model, and performing image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; inputting the second enhanced view image and the fourth enhanced view image into a second image classification model, and respectively performing image feature extraction and re-representation through the second image classification model to obtain corresponding re-representation features; and mixing the re-representation feature corresponding to the second enhanced view image with the re-representation feature corresponding to the fourth enhanced view image to form a mixed re-representation feature;
and the total loss function construction and model training unit is used for taking the first re-representation feature as the query feature and all remaining re-representation features as contrast features, constructing a contrast loss from the differences between the query feature and each contrast feature, and combining it with the base loss of the first image classification model to construct a total loss function for training the first image classification model.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the system is divided into different functional modules to perform all or part of the above described functions.
EXAMPLE III
The present invention also provides a processing apparatus, as shown in fig. 7, which mainly includes: one or more processors; a memory for storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods provided by the foregoing embodiments.
Further, the processing device further comprises at least one input device and at least one output device; in the processing device, a processor, a memory, an input device and an output device are connected through a bus.
In the embodiment of the present invention, the specific types of the memory, the input device, and the output device are not limited; for example:
the input device can be a touch screen, an image acquisition device, a physical button or a mouse and the like;
the output device may be a display terminal;
the Memory may be a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as a disk Memory.
Example four
The present invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the method provided by the foregoing embodiments.
The readable storage medium in the embodiment of the present invention may be provided in the foregoing processing device as a computer readable storage medium, for example, as a memory in the processing device. The readable storage medium may be various media that can store program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A domain adaptive learning method based on low confidence sample contrast loss is characterized by comprising the following steps:
screening out a low-confidence sample set from the target domain image set according to a set threshold;
for each low confidence sample image, obtaining two different enhanced view images, namely a first enhanced view image and a second enhanced view image, by using a data enhancement method, randomly selecting a source domain sample image from a source domain image set, and obtaining two different enhanced view images, namely a third enhanced view image and a fourth enhanced view image, by using the data enhancement method;
mixing the first enhanced view image and the third enhanced view image to form a query image, inputting the query image into a first image classification model, and performing image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; respectively inputting the second enhanced view image and the fourth enhanced view image into a second image classification model, and respectively performing image feature extraction and re-representation through the second image classification model to obtain corresponding re-representation features; and mixing the re-representation feature corresponding to the second enhanced view image with the re-representation feature corresponding to the fourth enhanced view image to form a mixed re-representation feature;
and taking the first re-representation feature as a query feature, taking all the rest re-representation features as comparison features, constructing a comparison loss by using the difference between the query feature and each comparison feature, and constructing a total loss function by combining the basic loss of the first image classification model to train the first image classification model.
2. The method of claim 1, wherein the first enhanced view image and the third enhanced view image are mixed in a manner represented by:
$$\lambda \sim \mathrm{Beta}(\alpha, \alpha)$$
$$\lambda' = \max(\lambda, 1-\lambda)$$
$$\tilde{x}_i = \lambda'\, x_i^{t,1} + (1-\lambda')\, x^{s,1}$$
wherein $\lambda$ is the mixing coefficient, $\alpha$ is the parameter of the Beta distribution, and $\lambda'$ is the new mixing coefficient obtained by the max function; $x_i^{t,1}$ is the first enhanced view image corresponding to the i-th low-confidence sample image, $x^{s,1}$ is the third enhanced view image corresponding to the source domain sample image $x^s$, and $\tilde{x}_i$ is the query image obtained by the mixing.
3. The method of claim 1, wherein the process of inputting the query image into the first image classification model and performing image feature extraction and re-representation through the first image classification model to obtain the first re-representation feature; respectively inputting the second enhanced view image and the fourth enhanced view image into the second image classification model and respectively performing image feature extraction and re-representation through the second image classification model to obtain the corresponding re-representation features; and mixing the re-representation feature corresponding to the second enhanced view image with the re-representation feature corresponding to the fourth enhanced view image to form the mixed re-representation feature is represented as:
$$\hat{z} = \phi(\mathrm{Norm}(F(\tilde{x}_i)))$$
$$\hat{z}_i^{t,2} = \phi'(\mathrm{Norm}(F'(x_i^{t,2})))$$
$$\hat{z}^{s,2} = \phi'(\mathrm{Norm}(F'(x^{s,2})))$$
$$\hat{z}_{mix} = \lambda'\, \hat{z}_i^{t,2} + (1-\lambda')\, \hat{z}^{s,2}$$
wherein $F$ is the feature extractor in the first image classification model and $F'$ is the feature extractor in the second image classification model; $\mathrm{Norm}(\cdot)$ is the L2-norm normalization function, and $\phi(\cdot)$, $\phi'(\cdot)$ are the functions for re-representing image features; $\tilde{x}_i$ is the query image, $\mathrm{Norm}(F(\tilde{x}_i))$ is the image feature obtained after normalizing the feature of the query image with the L2-norm normalization function, and $\hat{z}$ is the first re-representation feature; $x_i^{t,2}$ is the second enhanced view image corresponding to the i-th low-confidence sample image, $\mathrm{Norm}(F'(x_i^{t,2}))$ is the image feature obtained by normalizing its feature, and $\hat{z}_i^{t,2}$ is the corresponding re-representation feature; $x^{s,2}$ is the fourth enhanced view image corresponding to the source domain sample image $x^s$, $\mathrm{Norm}(F'(x^{s,2}))$ is the image feature obtained by normalizing its feature, and $\hat{z}^{s,2}$ is the corresponding re-representation feature; $\hat{z}_{mix}$ is the mixed re-representation feature and $\lambda'$ is the mixing coefficient used when the first enhanced view image was mixed with the third enhanced view image.
4. The method of claim 3, wherein the functions $\phi(\cdot)$ and $\phi'(\cdot)$ for re-representing image features are represented as:
$$\phi(z) = \mathrm{softmax}(C^{\mathrm{T}} z / \tau)$$
$$\phi'(z) = \mathrm{softmax}(C'^{\mathrm{T}} z / \tau)$$
wherein $C$ is the weight of the classifier in the first image classification model, $C'$ is the weight of the classifier in the second image classification model, $\mathrm{softmax}(\cdot)$ represents the softmax function, T is the transpose symbol, and $\tau$ is the temperature coefficient of the re-representation.
5. The method of claim 1 or 3, wherein the contrast loss constructed from the differences between the query feature and each contrast feature is:
$$\mathcal{L}_{ctr} = -\log \frac{S^{+}}{S^{+} + \sum_{\hat{z}^{-} \in M} \exp(\mathrm{sim}(\hat{z}, \hat{z}^{-}))}, \qquad S^{+} = \exp(\mathrm{sim}(\hat{z}, \hat{z}_{mix})) + \exp(\mathrm{sim}(\hat{z}, \hat{z}_i^{t,2})) + \exp(\mathrm{sim}(\hat{z}, \hat{z}^{s,2}))$$
wherein $\hat{z}$ is the query feature of the query image, $\hat{z}_{mix}$ is the mixed re-representation feature, $\hat{z}_i^{t,2}$ is the re-representation feature corresponding to the second enhanced view image, $\hat{z}^{s,2}$ is the re-representation feature corresponding to the fourth enhanced view image, $\hat{z}^{-}$ is a re-representation feature stored in the memory bank $M$, which stores the re-representation features, obtained by the second image classification model, of the second enhanced view images of the other low-confidence sample images, and $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity function.
6. The method of claim 1, wherein the total loss function is expressed as:

L_total = L_base + λ_con · E_{u∈U}[ L_con(u) ]

where L_base is the base loss, L_con is the contrast loss, λ_con is the weight coefficient of the contrast loss, and E is the mathematical expectation symbol; U is the union of the source domain image set S and the low-confidence sample set T_lc, and u is a single image in U;

the base loss includes: the cross-entropy loss L_ce on annotated images, the cross-domain alignment feature loss L_align, the KLD regularization term L_kld on high-confidence samples, and the cross-entropy loss L_fix of high-confidence samples after using FixMatch, where FixMatch denotes a semi-supervised learning algorithm based on pseudo-label techniques; the base loss is expressed as:

L_base = E_{(x,y)∈D_l}[ L_ce(x, y) ] + λ_align · E_{x′∈D_u}[ L_align(x′) ] + λ_kld · E_{x_h∈T_hc}[ L_kld(x_h) ] + E_{x_h∈T_hc}[ L_fix(x_h) ]

where D_l is the annotated image set and (x, y) is a single annotated image; the annotated image set D_l includes all labeled images in the source domain image set and the target domain image set; D_u is the union of D_l and the target domain image set T, and x′ is a single image in D_u; T_hc denotes the high-confidence sample set and x_h is a single high-confidence sample image; the high-confidence sample set is the set formed by the images remaining in the target domain image set after removing the low-confidence sample set; λ_align is the weight coefficient of the cross-domain alignment feature loss L_align, and λ_kld is the weight coefficient of the KLD regularization term L_kld.
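The composition of the total objective in claim 6 can be sketched as a plain weighted sum of the component losses. The helper below shows only the arithmetic combination; the default weight values are illustrative assumptions, not values taken from the patent.

```python
def total_loss(l_ce, l_align, l_kld, l_fix, l_con,
               lam_align=0.1, lam_kld=0.1, lam_con=1.0):
    """Combine the base-loss terms (cross-entropy, cross-domain alignment,
    KLD regularization, FixMatch cross-entropy) with the weighted contrast
    loss. Weight defaults are placeholders for illustration."""
    l_base = l_ce + lam_align * l_align + lam_kld * l_kld + l_fix
    return l_base + lam_con * l_con
```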
7. The method of claim 6, wherein the KLD regularization term L_kld on high-confidence samples and the cross-entropy loss L_fix of high-confidence samples after using FixMatch are calculated as follows:

define x_h′ and x_h″ as two different enhanced view images of a single high-confidence sample image x_h from the high-confidence sample set T_hc; x_h′ is input into the second image classification model, a second classification result is obtained through feature extraction and classification, and a pseudo label ŷ is constructed from it; x_h″ is input into the first image classification model, a first classification result is obtained through feature extraction and classification, and the KLD regularization term L_kld on high-confidence samples is calculated using the first classification result; further, the cross-entropy loss L_fix of high-confidence samples after using FixMatch is calculated using the first classification result and the corresponding pseudo label; the KLD regularization term L_kld and the cross-entropy loss L_fix are expressed as:

L_kld = Σ_{j=1}^{C} (1/C) · log( (1/C) / p_j(x_h″) )

L_fix = −1( max_j p_j^{(2)}(x_h′) > τ ) · log p_ŷ(x_h″)

where 1(·) is the indicator function; C denotes the number of categories; p_j(x_h″) denotes the probability, output by the classifier in the first image classification model, that the class of the enhanced view image x_h″ is j; p_ŷ(x_h″) denotes the probability, output by that classifier, that the class of x_h″ is the pseudo label ŷ; the pseudo label ŷ is the class label corresponding to the maximum probability in the second classification result; and 1( max_j p_j^{(2)}(x_h′) > τ ) indicates that the maximum predicted probability of the second image classification model is greater than the threshold τ.
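The two terms in claim 7 can be sketched directly: a FixMatch-style thresholded cross-entropy against the pseudo label, and a KL-divergence regularizer between the uniform distribution and the model's prediction. The exact KLD direction and the threshold value below are assumptions for illustration, not taken from the patent.

```python
import numpy as np

def kld_uniform(p):
    # KL(U || p): regularizes the predicted distribution p toward uniform
    C = len(p)
    u = 1.0 / C
    return float(np.sum(u * np.log(u / p)))

def fixmatch_ce(p_weak, p_strong, tau=0.95):
    # Pseudo label = argmax of the second model's (weak-view) prediction;
    # the loss is active only when its confidence exceeds the threshold tau.
    y_hat = int(np.argmax(p_weak))
    if p_weak[y_hat] <= tau:
        return 0.0
    return float(-np.log(p_strong[y_hat]))
```

`p_weak` plays the role of the second classification result on x_h′ and `p_strong` the first classification result on x_h″; when the weak view is confident enough, the strong view is pushed toward the pseudo label.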
8. A domain adaptive learning system based on low confidence sample contrast loss, implemented based on the method of any one of claims 1 to 7, the system comprising:
a low-confidence sample set generating unit, configured to screen a low-confidence sample set out of the target domain image set according to a set threshold;
an enhanced view image generation unit, configured to obtain, by a data enhancement method, two different enhanced view images, namely a first enhanced view image and a second enhanced view image, from each low-confidence sample image, and to randomly select a source domain sample image from the source domain image set and obtain, by the data enhancement method, two different enhanced view images, namely a third enhanced view image and a fourth enhanced view image;
a re-representation feature obtaining unit, configured to mix the first enhanced view image and the third enhanced view image to obtain a query image, input the query image into a first image classification model, and perform image feature extraction and re-representation through the first image classification model to obtain a first re-representation feature; to input the second enhanced view image and the fourth enhanced view image into a second image classification model, and perform image feature extraction and re-representation through the second image classification model to obtain the corresponding re-representation features; and to blend the first re-representation feature with the re-representation feature corresponding to the fourth enhanced view image to form a blended re-representation feature;
and a total loss function construction and model training unit, configured to take the first re-representation feature as a query feature and all remaining re-representation features as comparison features, construct a contrast loss from the difference between the query feature and each comparison feature, and combine it with the base loss of the first image classification model to construct a total loss function for training the first image classification model.
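The screening performed by the low-confidence sample set generating unit can be sketched as a split of the target-domain predictions by maximum predicted probability. The helper name and the default threshold are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def split_by_confidence(probs, threshold=0.8):
    """Screen the target-domain set: images whose maximum predicted
    probability falls below the set threshold form the low-confidence
    sample set; the remainder form the high-confidence sample set."""
    low, high = [], []
    for i, p in enumerate(probs):
        (low if float(np.max(p)) < threshold else high).append(i)
    return low, high
```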
9. A processing device, comprising: one or more processors; a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A readable storage medium, storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any of claims 1 to 7.
CN202210942337.9A 2022-08-08 2022-08-08 Domain adaptive learning method and system based on low confidence sample contrast loss Active CN114998602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210942337.9A CN114998602B (en) 2022-08-08 2022-08-08 Domain adaptive learning method and system based on low confidence sample contrast loss

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210942337.9A CN114998602B (en) 2022-08-08 2022-08-08 Domain adaptive learning method and system based on low confidence sample contrast loss

Publications (2)

Publication Number Publication Date
CN114998602A true CN114998602A (en) 2022-09-02
CN114998602B CN114998602B (en) 2022-12-30

Family

ID=83023178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210942337.9A Active CN114998602B (en) 2022-08-08 2022-08-08 Domain adaptive learning method and system based on low confidence sample contrast loss

Country Status (1)

Country Link
CN (1) CN114998602B (en)


Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256561A (en) * 2017-12-29 2018-07-06 中山大学 A kind of multi-source domain adaptive migration method and system based on confrontation study
WO2019228358A1 (en) * 2018-05-31 2019-12-05 华为技术有限公司 Deep neural network training method and apparatus
WO2021097055A1 (en) * 2019-11-14 2021-05-20 Nec Laboratories America, Inc. Domain adaptation for semantic segmentation via exploiting weak labels
CN113436197A (en) * 2021-06-07 2021-09-24 华东师范大学 Domain-adaptive unsupervised image segmentation method based on generation of confrontation and class feature distribution
CN113435546A (en) * 2021-08-26 2021-09-24 广东众聚人工智能科技有限公司 Migratable image recognition method and system based on differentiation confidence level
CN113553906A (en) * 2021-06-16 2021-10-26 之江实验室 Method for discriminating unsupervised cross-domain pedestrian re-identification based on class center domain alignment
US20210334664A1 (en) * 2020-04-24 2021-10-28 Adobe Inc. Domain Adaptation for Machine Learning Models
EP3940632A1 (en) * 2019-03-14 2022-01-19 Navier Inc. Image processing learning program, image processing program, image processing device, and image processing system
CN114283287A (en) * 2022-03-09 2022-04-05 南京航空航天大学 Robust field adaptive image learning method based on self-training noise label correction
CN114332568A (en) * 2022-03-16 2022-04-12 中国科学技术大学 Training method, system, equipment and storage medium of domain adaptive image classification network
CN114492574A (en) * 2021-12-22 2022-05-13 中国矿业大学 Pseudo label loss unsupervised countermeasure domain adaptive picture classification method based on Gaussian uniform mixing model
CN114842267A (en) * 2022-05-23 2022-08-02 南京邮电大学 Image classification method and system based on label noise domain self-adaption


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ALEJO M ET AL: "Unconstrained Ear Recognition through Domain Adaptive Deep Learning Models of Convolutional Neural Network", 《INTERNATIONAL JOURNAL OF RECENT TECHNOLOGY AND ENGINEERING》 *
ZILEI WANG ET AL: "Image Classification via Object-aware Holistic Superpixel Selection", 《IEEE TRANSACTION ON IMAGE PROCESSING (TIP)》 *
WU ZIRUI ET AL: "Unsupervised Domain Adaptation Algorithm Oriented to Feature Generation", 《JOURNAL OF UNIVERSITY OF ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA》 *
WANG XIAOMING ET AL: "Content-Structure-Preserving Image Style Transfer Method", 《COMPUTER ENGINEERING AND APPLICATIONS》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452897A (en) * 2023-06-16 2023-07-18 中国科学技术大学 Cross-domain small sample classification method, system, equipment and storage medium
CN116452897B (en) * 2023-06-16 2023-10-20 中国科学技术大学 Cross-domain small sample classification method, system, equipment and storage medium
CN116543237A (en) * 2023-06-27 2023-08-04 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Image classification method, system, equipment and medium for non-supervision domain adaptation of passive domain
CN116543237B (en) * 2023-06-27 2023-11-28 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Image classification method, system, equipment and medium for non-supervision domain adaptation of passive domain
CN117253097A (en) * 2023-11-20 2023-12-19 中国科学技术大学 Semi-supervision domain adaptive image classification method, system, equipment and storage medium
CN117253097B (en) * 2023-11-20 2024-02-23 中国科学技术大学 Semi-supervision domain adaptive image classification method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN114998602B (en) 2022-12-30

Similar Documents

Publication Publication Date Title
Dong et al. Peco: Perceptual codebook for bert pre-training of vision transformers
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN111126488B (en) Dual-attention-based image recognition method
CN114998602B (en) Domain adaptive learning method and system based on low confidence sample contrast loss
JP5373536B2 (en) Modeling an image as a mixture of multiple image models
Lin et al. A post-processing method for detecting unknown intent of dialogue system via pre-trained deep neural network classifier
CN114332568B (en) Training method, system, equipment and storage medium of domain adaptive image classification network
CN109344884A (en) The method and device of media information classification method, training picture classification model
CN110188195B (en) Text intention recognition method, device and equipment based on deep learning
CN111741330A (en) Video content evaluation method and device, storage medium and computer equipment
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN111241992B (en) Face recognition model construction method, recognition method, device, equipment and storage medium
CN114038055A (en) Image generation method based on contrast learning and generation countermeasure network
CN114329034A (en) Image text matching discrimination method and system based on fine-grained semantic feature difference
CN115563327A (en) Zero sample cross-modal retrieval method based on Transformer network selective distillation
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN115270752A (en) Template sentence evaluation method based on multilevel comparison learning
CN113255832B (en) Method for identifying long tail distribution of double-branch multi-center
CN114722892A (en) Continuous learning method and device based on machine learning
CN110795410A (en) Multi-field text classification method
CN113792659A (en) Document identification method and device and electronic equipment
CN113762005A (en) Method, device, equipment and medium for training feature selection model and classifying objects
CN114357221B (en) Self-supervision active learning method based on image classification
Sarang Thinking Data Science: A Data Science Practitioner’s Guide
CN113344069B (en) Image classification method for unsupervised visual representation learning based on multi-dimensional relation alignment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant