CN112819098A - Domain adaptation method based on triplet and difference metric - Google Patents

Domain adaptation method based on triplet and difference metric

Info

Publication number
CN112819098A
CN112819098A
Authority
CN
China
Prior art keywords
domain
batch
samples
classifier
sample
Prior art date
Legal status
Withdrawn
Application number
CN202110220887.5A
Other languages
Chinese (zh)
Inventor
胡海峰 (Hu Haifeng)
杨岩 (Yang Yan)
吴建盛 (Wu Jiansheng)
朱燕翔 (Zhu Yanxiang)
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202110220887.5A
Publication of CN112819098A
Legal status: Withdrawn


Classifications

    • G06F 18/22: Pattern recognition; Analysing; Matching criteria, e.g. proximity measures
    • G06F 18/214: Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a domain adaptation method based on triplets and a difference metric. Samples are randomly drawn from the target domain to form a target domain batch and fed into a feature extractor to obtain sample features; the sample features are fed into a multi-classifier and entropy minimization is performed; the sample features are simultaneously fed into a multi-binary classifier, and k critical samples and k pairs of similar classes are determined from its output; effective samples are then screened with the triplet loss to construct a source domain batch, and the multi-binary classifier and the multi-classifier are trained on the extracted source domain batch; finally, the target domain batch and the source domain batch are sent into a domain adversarial network for domain alignment. By using the triplet loss function and reasonably designing the margin between the positive and negative sample pairs in the loss, and by performing domain alignment with the domain adversarial network, the sample distributions of the source domain and the target domain tend to become consistent, and target domain samples close to the classification boundary are indirectly pushed away from the boundary, so that they can be correctly classified.

Description

Domain adaptation method based on triplet and difference metric
Technical Field
The invention relates to the technical field of machine learning, and in particular to a domain adaptation method based on triplets and a difference metric.
Background
With the continuous development of machine learning, and especially of its sub-field deep learning, many machine learning tasks and computer vision applications have improved greatly in performance. This, however, presupposes that enough labeled data is available, from which an effective model can be trained to solve practical problems. In real scenarios, a large amount of labeled data is difficult to obtain and costs considerable manpower and material resources, so finding an effective method to cope with the lack of labeled data is critical.
When labeled data is lacking, domain adaptation is a reasonable solution. Domain adaptation is a branch of transfer learning; it aims to train an effective model with the abundant labeled data of a source domain and apply that model to a target domain in which the data carries no labels, or only a small amount of data is labeled. At the same time, the source domain and the target domain must share the same class space and learning task.
Domain adaptation mainly addresses how to reduce the difference between the sample distributions of the source domain and the target domain, and many domain adaptation methods have been proposed on this basis. Nowadays the most representative are adversarial domain adaptation methods, which borrow the basic idea of generative adversarial networks (GANs) and aim to make the sample distributions of the source domain and the target domain consistent, so that a model trained on the source domain can be applied well to the target domain. However, none of these methods based on the domain adversarial idea can solve the problem of similar target domain samples lying close together: such samples sit very close to the classification boundary between two classes in the feature space, which makes it easy to misjudge some target domain samples as the class similar to theirs. The prior art does not disclose how a domain adaptation method based on the domain adversarial idea can effectively distinguish similar samples of the target domain.
Disclosure of Invention
Purpose of the invention: the invention provides a domain adaptation method based on triplets and a difference metric. A triplet loss function (triplet loss) is used and the difference margin between the positive and negative sample pairs in the loss is reasonably designed, so that in the feature space the distances between samples of the same class in the source domain are shortened and the distances between samples of different classes are lengthened; domain alignment is then carried out with a Domain Adversarial Neural Network (DANN) so that the sample distributions of the source domain and the target domain tend to become consistent, and target domain samples close to a classification boundary are indirectly pushed away from the boundary, so that they can be correctly classified.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:
A domain adaptation method based on triplet and difference metric, comprising the following steps:
Step S1: randomly draw samples from the target domain to form a target domain batch, and feed it into the feature extractor to obtain sample features; feed the sample features into the multi-classifier and perform entropy minimization; simultaneously feed the sample features into the multi-binary classifier, determine k critical samples and the corresponding k pairs of similar classes from the output of the multi-binary classifier, and compute the difference margin between the corresponding k positive and negative sample pairs;
s2, according to the k pairs of similar classes found in the target domain batch in the step S1, screening effective samples from the k pairs of similar class samples in the source domain through triple loss, and constructing the source domain batch;
Step S3: feed the constructed source domain batch into the feature extractor for feature extraction, feed the extracted features into the multi-classifier and the multi-binary classifier respectively, and train the feature extractor, the multi-classifier and the multi-binary classifier based on the classification information;
Step S4: feed the features extracted from the source domain batch and the target domain batch into the domain adversarial network respectively, and perform the domain alignment operation.
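As an overview, a single training iteration combining steps S1 through S4 might look as follows. This is a minimal PyTorch sketch under assumed module shapes and data: the source batch is drawn randomly here for brevity, whereas the method builds it by the triplet screening of step S2; the gradient reversal of step S4 and the critical-sample margin of step S1 are shown separately in the detailed description; and the plain sum of the losses is an assumption.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    n_classes, in_dim, feat_dim = 10, 128, 64
    feat = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())  # feature extractor F
    cm = nn.Linear(feat_dim, n_classes)   # multi-classifier C_m
    cb = nn.Linear(feat_dim, n_classes)   # multi-binary classifier C_b (one score per class)
    dom = nn.Linear(feat_dim, 1)          # domain classifier D
    params = [p for m in (feat, cm, cb, dom) for p in m.parameters()]
    opt = torch.optim.SGD(params, lr=1e-3)

    xt = torch.randn(32, in_dim)                    # target domain batch (step S1)
    xs = torch.randn(32, in_dim)                    # source domain batch (step S2, screened in the full method)
    ys = torch.randint(0, n_classes, (32,))         # source labels

    ft, fs = feat(xt), feat(xs)
    pt = F.softmax(cm(ft), dim=1)
    loss_ent = -(pt * torch.log(pt + 1e-8)).sum(dim=1).mean()      # S1: entropy minimization on the target
    loss_cls = F.cross_entropy(cm(fs), ys)                         # S3: multi-classifier loss
    loss_bin = F.binary_cross_entropy_with_logits(                 # S3: multi-binary (one-vs-rest) loss
        cb(fs), F.one_hot(ys, n_classes).float())
    d_logit = torch.cat([dom(fs), dom(ft)])                        # S4: domain alignment
    d_label = torch.cat([torch.ones(32, 1), torch.zeros(32, 1)])   # d_i = 1 (source), d_j = 0 (target)
    loss_dom = F.binary_cross_entropy_with_logits(d_logit, d_label)

    opt.zero_grad()
    (loss_ent + loss_cls + loss_bin + loss_dom).backward()
    opt.step()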
Further, in step S1 the target domain batch is fed into the feature extractor F to extract features, which are then fed into the multi-classifier C_m, and entropy minimization is performed with the following loss function:

L_ent = (1/|X_t|) · Σ_{x_i ∈ X_t} H(C_m(F(x_i)))

where |X_t| denotes the number of samples in the target domain batch, F is the feature extractor, C_m is the multi-classifier, and H(·) denotes the entropy computation;
the features extracted by the target field batch are sent to a multi-classifier and a multi-two-classifier CbIn, through CbDetermining k critical samples; the specific determination method is as follows:
for target domain batch at CbAt the output of (C), each sample in the target domain batch is at CbDefining the difference between the maximum value and the second maximum value as a classification distance d, searching the first k samples with the minimum classification distance d, judging as critical samples, and recording the classes corresponding to the maximum value and the second maximum value of each critical sample as an A class and a B class respectively; computing a margin value β for similar classes A and B(A,B)(ii) a Determining beta according to the minimum classification distance d corresponding to the critical point(A,B)The following were used:
Figure BDA0002954863580000031
Figure BDA0002954863580000032
Figure BDA0002954863580000033
wherein
Figure BDA0002954863580000034
Representing the target domain batch, xiRepresenting the critical samples found in the target field batch,
Figure BDA0002954863580000035
representing a critical sample xiIn a multi-two classifier CbOutput value of alpha0Is an initial value, mu is a constant coefficient, d represents a critical point xiThe corresponding minimum classification distance;
the method of averaging the margin values by different batchs is as follows:
Figure BDA0002954863580000036
Figure BDA0002954863580000037
two similar classes representing A and B are the average margin value at the tth batch.
Further, in step S2, the k pairs of similar classes having been determined from the target domain batch, the triplet loss is used to screen effective samples within the k pairs of similar classes of the source domain to construct the source domain batch; the loss function is as follows:

L_G = ||G(F(x_i)) - G(F(x_j))||² - ||G(F(x_i)) - G(F(x_m))||² + β_(A,B),
(x_i, y_i), (x_j, y_j), (x_m, y_m) ∈ X_S, y_i = A, y_j = A, y_m = B,

where X_S denotes the source domain data set, x_i and x_j are class-A samples in the source domain data set, x_m is a class-B sample in the source domain data set, F is the feature extractor, and G is the metric learner; at the same time, the features obtained after dimension reduction are normalized, i.e. ||G(F(x))||_2 = 1;
Constructing the source domain batch comprises the following steps:

Step L1: traverse the k pairs of similar classes; for each pair, randomly draw two samples from the class-A samples of that pair in the source domain as x_i and x_j, and keep both samples fixed;

Step L2: traverse the class-B samples of that pair in the source domain and draw a sample x_m; feed the three samples (x_i, x_j, x_m) in turn into the feature extractor and the metric learner, and compute the loss function L_G; when L_G < 0, continue traversing the class-B samples until a sample with L_G ≥ 0 appears; when L_G ≥ 0, put the triplet (x_i, x_j, x_m) into the source domain batch; if no triplet with L_G ≥ 0 is found after traversing all class-B samples, randomly draw 3 samples from the source domain data set and put them into the source domain batch;

Step L3: the source domain batch adopts the samples screened above for the k pairs of similar classes of the source domain; the remaining samples in the source domain batch are drawn at random from the other classes of the source domain.
Further, training the feature extractor, the multi-classifier and the multi-binary classifier based on the classification information in step S3 is specifically as follows:

The features extracted from the constructed source domain batch are fed into the multi-classifier C_m and the multi-binary classifier C_b, and the difference between the output values and the true values is used to optimize the loss functions L_s^cls and L_s; when the two loss functions converge, the multi-classifier C_m and the multi-binary classifier C_b are considered trained; the loss functions are specifically as follows:

L_s^cls = (1/n_s) · Σ_{i=1}^{n_s} L_y(y_i', y_i),
L_s = (1/n_s) · Σ_{i=1}^{n_s} Σ_{c=1}^{C_s} L_bce(C_b(F(x_i))_c, 1[y_i = c]),

where n_s denotes the number of samples in the source domain batch, y_i denotes the true label of the i-th sample of the source domain batch, y_i' denotes the predicted label of the i-th sample of the source domain batch, C_s denotes the number of classes of the source domain, L_y is the cross entropy, and L_bce is the binary cross entropy.
Further, in step S4, the features extracted from the source domain batch and the target domain batch are respectively fed into the domain adversarial network, and the domain alignment operation is performed, specifically as follows:

Owing to the gradient reversal layer, the feature extraction part is driven to increase the loss value of the domain classifier D, so that the extracted features confuse D and the difference between the feature distributions of the source domain and the target domain is reduced; the loss function is as follows:

L_d(θ_f, θ_d) = (1/n_s) · Σ_{i=1}^{n_s} L_bce(D(F(x_i)), d_i) + (1/n_t) · Σ_{j=1}^{n_t} L_bce(D(F(x_j)), d_j),
θ̂_d = argmin_{θ_d} L_d(θ_f, θ_d),
θ̂_f = argmax_{θ_f} L_d(θ_f, θ_d),

where D denotes the domain classifier, θ_d and θ_f denote the parameters of the domain classifier and the feature extractor respectively, n_t and n_s denote the numbers of batch samples of the target domain and the source domain respectively, L_bce denotes the binary cross entropy, d_i denotes the true domain of the i-th sample of the source domain batch, and d_j denotes the true domain of the j-th sample of the target domain batch.
Beneficial effects: the domain adaptation method based on triplet and difference metric provided by the invention has the following advantages:
1. The method effectively alleviates the coarseness of domain adaptation methods based purely on the adversarial idea when used for image classification; for similar samples of the target domain, the triplet loss is used to separate samples more finely in the embedding space, which improves the classification accuracy.
2. In the selection of the source domain batch, triplets are used for screening, which guarantees that the source domain batch contains effective sample triplets; by means of metric learning, similar source domain samples can then be effectively separated in the feature space, which indirectly separates the similar samples of the target domain. In addition, the design of the triplet-loss margin is innovative: the margin is computed by taking the log of the reciprocal of the minimum classification distance of the critical point, which ensures that for two classes with higher similarity the distance between their samples is pulled apart more effectively, while the log function prevents the margin, and hence the triplet loss, from growing too large and disturbing the training of the whole network.
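For illustration, under the reconstructed margin form β_(A,B) = α_0 + μ · log(1/d) and purely hypothetical values α_0 = 0.2 and μ = 0.1: a critical sample with minimum classification distance d = 0.01 gives β = 0.2 + 0.1 · ln(100) ≈ 0.66, while a less ambiguous pair with d = 0.2 gives β ≈ 0.36. More confusable class pairs thus receive larger margins, and the logarithm keeps the growth bounded: halving d again only adds about 0.07 to β.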
3. In unsupervised domain-adaptive image classification, the method effectively improves feature transferability and greatly improves the generalization capability of the model.
4. The invention is simple: the model has a simple structure, intuitive physical meaning and low computational complexity.
Drawings
FIG. 1 is a flow chart of the training of the domain adaptation method based on triplet and difference metric according to the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings.
A domain adaptation method based on triplet and difference metric, as shown in FIG. 1, comprises the following steps:
Step S1: randomly draw samples from the target domain to form a target domain batch, and feed it into the feature extractor to obtain sample features; feed the sample features into the multi-classifier and perform entropy minimization; simultaneously feed the sample features into the multi-binary classifier, determine k critical samples and the corresponding k pairs of similar classes from the output of the multi-binary classifier, and compute the difference margin between the corresponding k positive and negative sample pairs.
The target domain batch is fed into the feature extractor F to extract features, which are then fed into the multi-classifier C_m, and entropy minimization is performed with the following loss function:

L_ent = (1/|X_t|) · Σ_{x_i ∈ X_t} H(C_m(F(x_i)))

where |X_t| denotes the number of samples in the target domain batch, F is the feature extractor, C_m is the multi-classifier, and H(·) denotes the entropy computation; the goal is to make the entropy of the target domain samples as small as possible during training.
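As a minimal sketch of this step in PyTorch (the name multi_classifier and the numerical constant 1e-8 are illustrative, not from the patent):

    import torch
    import torch.nn.functional as F

    def entropy_minimization_loss(features: torch.Tensor,
                                  multi_classifier: torch.nn.Module) -> torch.Tensor:
        """Mean Shannon entropy H(C_m(F(x))) over a target-domain batch.

        `features` holds F(x) for each sample; minimizing the returned value
        pushes the multi-classifier's predictions toward confident
        (low-entropy) outputs on the target domain.
        """
        probs = F.softmax(multi_classifier(features), dim=1)      # C_m(F(x))
        entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)   # H(.) per sample
        return entropy.mean()                                     # average over |X_t|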
Since the source domain and the target domain share the same class space, when both domains have N classes the output of a source or target domain sample through the multi-binary classifier also has N values, the size of each value representing the likelihood that the sample belongs to the corresponding class.
The multi-binary classifier produces a corresponding output for the target domain batch; for each sample in the batch, the difference between the maximum value and the second maximum value of its multi-binary classifier output is computed, and the k samples with the smallest differences are taken as critical samples, i.e. these k samples of the target domain batch are judged to be close to a classification boundary. At the same time the k pairs of similar classes corresponding to these k samples are recorded, i.e. the classes of the maximum and second maximum values of each critical point's multi-binary classifier output. Since metric learning on these similar classes in the source domain must pull the samples of each similar pair apart in the embedding space, and since the invention adopts a triplet loss as the loss function of the metric learner G, the key step is how to determine the difference between the positive and negative sample pairs in that loss, i.e. the margin. The minimum classification distance of a critical sample, that is, the difference between the maximum and second maximum values of its output on the multi-binary classifier, is inverted and passed through a log, and the result is used as the margin of the triplet loss. Specifically:
In the output of C_b for the target domain batch, the difference between the maximum value and the second maximum value of each sample's output on C_b is defined as the classification distance d; the first k samples with the smallest classification distance d are judged to be critical samples, and the classes corresponding to the maximum and second maximum value of each critical sample are recorded as class A and class B respectively. The margin value β_(A,B) of the similar classes A and B is computed; β_(A,B) is determined from the minimum classification distance d corresponding to the critical point as follows:

p_i = C_b(F(x_i)), x_i ∈ X_t = {x_1, x_2, ..., x_|X_t|},
d = max(p_i) - secondmax(p_i),
β_(A,B) = α_0 + μ · log(1/d),

where X_t denotes the target domain batch, x_i denotes a critical sample found in the target domain batch, p_i denotes the output value of the critical sample x_i on the multi-binary classifier C_b, α_0 is an initial value, μ is a constant coefficient, and d denotes the minimum classification distance corresponding to the critical point x_i.

The margin values are averaged over different batches as follows:

β̄^t_(A,B) = ((t - 1) · β̄^{t-1}_(A,B) + β^t_(A,B)) / t,

where β̄^t_(A,B) denotes the average margin value of the similar classes A and B at the t-th batch.
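The following sketch shows how the critical samples, similar-class pairs and margins could be computed from the C_b outputs. It follows the reconstruction above; since the original formula images are not reproduced in the text, the exact way α_0 and μ enter the margin, their default values, and the running-mean update are assumptions.

    import torch

    def critical_samples_and_margins(cb_outputs: torch.Tensor, k: int,
                                     alpha0: float = 0.2, mu: float = 0.1):
        """cb_outputs: (batch, N) scores from the multi-binary classifier C_b.

        Returns the indices of the k critical samples (smallest classification
        distance d = top1 - top2), their similar-class pairs (A, B), and the
        margins beta_(A,B) = alpha0 + mu * log(1 / d).
        """
        top2, classes = cb_outputs.topk(2, dim=1)       # max and second max per sample
        d = top2[:, 0] - top2[:, 1]                     # classification distance d
        d_min, idx = d.topk(k, largest=False)           # k smallest -> critical samples
        pairs = classes[idx]                            # (A, B) class pairs
        margins = alpha0 + mu * torch.log(1.0 / d_min.clamp(min=1e-6))
        return idx, pairs, margins

    def update_average_margin(prev_avg: float, new_margin: float, t: int) -> float:
        """Running mean over batches: beta_bar^t = ((t-1)*beta_bar^{t-1} + beta^t) / t."""
        return ((t - 1) * prev_avg + new_margin) / t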
Step S2: according to the k pairs of similar classes found in the target domain batch in step S1, effective samples are screened within the k pairs of similar classes of the source domain through the triplet loss, and the source domain batch is constructed. At the same time, by means of the metric learning, the distance between each pair of similar samples in the k pairs of similar classes of the source domain is pulled apart in the embedding space.
The loss function is as follows:

L_G = ||G(F(x_i)) - G(F(x_j))||² - ||G(F(x_i)) - G(F(x_m))||² + β_(A,B),
(x_i, y_i), (x_j, y_j), (x_m, y_m) ∈ X_S, y_i = A, y_j = A, y_m = B,

where X_S denotes the source domain data set, x_i and x_j are class-A samples in the source domain data set, x_m is a class-B sample in the source domain data set, F is the feature extractor, and G is the metric learner; at the same time, the features obtained after dimension reduction are normalized, i.e. ||G(F(x))||_2 = 1;
Constructing the source domain batch comprises the following steps:

Step L1: traverse the k pairs of similar classes; for each pair, randomly draw two samples from the class-A samples of that pair in the source domain as x_i and x_j, and keep both samples fixed;

Step L2: traverse the class-B samples of that pair in the source domain and draw a sample x_m; feed the three samples (x_i, x_j, x_m) in turn into the feature extractor and the metric learner, and compute the loss function L_G; when L_G < 0, continue traversing the class-B samples until a sample with L_G ≥ 0 appears; when L_G ≥ 0, put the triplet (x_i, x_j, x_m) into the source domain batch; if no triplet with L_G ≥ 0 is found after traversing all class-B samples, randomly draw 3 samples from the source domain data set and put them into the source domain batch (see the screening sketch after this list);

Step L3: the source domain batch adopts the samples screened above for the k pairs of similar classes of the source domain; the remaining samples in the source domain batch are drawn at random from the other classes of the source domain.
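A minimal sketch of the per-pair screening in step L2, assuming PyTorch modules feat (the extractor F) and metric (the learner G) and that samples are tensors; the helper names are illustrative:

    import torch
    import torch.nn.functional as TF

    def screen_triplet(xi, xj, class_b_samples, beta, feat, metric):
        """Return the first class-B sample x_m whose triplet (x_i, x_j, x_m)
        yields L_G >= 0 under the triplet loss with margin beta, or None if
        every class-B sample gives an 'easy' triplet (L_G < 0)."""
        def embed(x):
            # G(F(x)) with L2 normalization, i.e. ||G(F(x))||_2 = 1
            return TF.normalize(metric(feat(x.unsqueeze(0))), dim=1)

        with torch.no_grad():
            zi, zj = embed(xi), embed(xj)
            for xm in class_b_samples:
                zm = embed(xm)
                lg = (zi - zj).pow(2).sum() - (zi - zm).pow(2).sum() + beta
                if lg >= 0:      # effective triplet: the loss still carries gradient
                    return xm
        return None

If screen_triplet returns None, the caller falls back to drawing 3 random source samples, exactly as step L2 prescribes.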
Step S3: the constructed source domain batch is fed into the feature extractor for feature extraction, the extracted features are fed into the multi-classifier and the multi-binary classifier respectively, and the difference between the output values and the true values is used to optimize the loss functions L_s^cls and L_s in order to train the two classifiers; the loss functions are specifically as follows:

L_s^cls = (1/n_s) · Σ_{i=1}^{n_s} L_y(y_i', y_i),
L_s = (1/n_s) · Σ_{i=1}^{n_s} Σ_{c=1}^{C_s} L_bce(C_b(F(x_i))_c, 1[y_i = c]),

where n_s denotes the number of samples in the source domain batch, y_i denotes the true label of the i-th sample of the source domain batch, y_i' denotes the predicted label of the i-th sample of the source domain batch, C_s denotes the number of classes of the source domain, L_y is the cross entropy, and L_bce is the binary cross entropy.
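These two losses could be computed as follows (PyTorch sketch; the one-vs-rest BCE form for L_s is the reconstruction used above, since the original formula image is not reproduced):

    import torch
    import torch.nn.functional as F

    def source_classification_losses(feats, labels, cm, cb, num_classes):
        """Cross entropy for the multi-classifier C_m (L_s^cls) and one-vs-rest
        binary cross entropy for the multi-binary classifier C_b (L_s),
        both averaged over the n_s samples of the source batch."""
        loss_cls = F.cross_entropy(cm(feats), labels)
        one_hot = F.one_hot(labels, num_classes).float()
        loss_bin = F.binary_cross_entropy_with_logits(cb(feats), one_hot)
        return loss_cls, loss_bin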
Step S4: the features extracted from the source domain batch and the target domain batch are respectively fed into the domain adversarial network, and the domain alignment operation is performed.
The features extracted from the source domain batch and the target domain batch are sent into the domain adversarial network for domain alignment. Owing to the gradient reversal layer, the feature extraction part is driven to increase the loss value of the domain classifier D, so that the extracted features confuse D and the difference between the feature distributions of the source domain and the target domain is reduced. The loss function is as follows:

L_d(θ_f, θ_d) = (1/n_s) · Σ_{i=1}^{n_s} L_bce(D(F(x_i)), d_i) + (1/n_t) · Σ_{j=1}^{n_t} L_bce(D(F(x_j)), d_j),
θ̂_d = argmin_{θ_d} L_d(θ_f, θ_d),
θ̂_f = argmax_{θ_f} L_d(θ_f, θ_d),

where D denotes the domain classifier, θ_d and θ_f denote the parameters of the domain classifier and the feature extractor respectively, n_t and n_s denote the numbers of batch samples of the target domain and the source domain respectively, and L_bce denotes the binary cross entropy; d_i, the true domain of the i-th sample of the source domain batch, is usually taken to be 1, and d_j, the true domain of the j-th sample of the target domain batch, is usually taken to be 0.
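A standard gradient reversal layer and the resulting domain loss could look as follows (a PyTorch sketch of the usual DANN construction; the patent names the components but not this exact code):

    import torch
    import torch.nn.functional as F
    from torch.autograd import Function

    class GradReverse(Function):
        """Identity in the forward pass; negates (and scales) the gradient in
        the backward pass, so the feature extractor maximizes the domain loss
        that the domain classifier minimizes."""
        @staticmethod
        def forward(ctx, x, lambd: float = 1.0):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lambd * grad_output, None

    def domain_alignment_loss(src_feats, tgt_feats, domain_classifier):
        """Binary cross entropy of the domain classifier D on reversed
        features, with d_i = 1 for source samples and d_j = 0 for target
        samples."""
        logit_s = domain_classifier(GradReverse.apply(src_feats))
        logit_t = domain_classifier(GradReverse.apply(tgt_feats))
        return (F.binary_cross_entropy_with_logits(logit_s, torch.ones_like(logit_s))
                + F.binary_cross_entropy_with_logits(logit_t, torch.zeros_like(logit_t)))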
The above description covers only preferred embodiments of the present invention. It should be noted that various modifications and adaptations can be made by those skilled in the art without departing from the principles of the invention, and these are intended to fall within the scope of the invention.

Claims (5)

1. A domain adaptation method based on triplet and difference metric, comprising the following steps:
Step S1: randomly draw samples from the target domain to form a target domain batch, and feed it into the feature extractor to obtain sample features; feed the sample features into the multi-classifier and perform entropy minimization; simultaneously feed the sample features into the multi-binary classifier, determine k critical samples and the corresponding k pairs of similar classes from the output of the multi-binary classifier, and compute the difference margin between the corresponding k positive and negative sample pairs;
s2, according to the k pairs of similar classes found in the target domain batch in the step S1, screening effective samples in the k pairs of similar classes of the source domain through triple loss, and constructing the source domain batch;
Step S3: feed the constructed source domain batch into the feature extractor for feature extraction, feed the extracted features into the multi-classifier and the multi-binary classifier respectively, and train the feature extractor, the multi-classifier and the multi-binary classifier based on the classification information;
Step S4: feed the features extracted from the source domain batch and the target domain batch into the domain adversarial network respectively, and perform the domain alignment operation.
2. The method of claim 1, wherein the target domain batch in step S1 is fed into the feature extractor F for feature extraction and the extracted features are fed into the multi-classifier C_m, and entropy minimization is performed with the following loss function:

L_ent = (1/|X_t|) · Σ_{x_i ∈ X_t} H(C_m(F(x_i)))

where |X_t| denotes the number of samples in the target domain batch, F is the feature extractor, C_m is the multi-classifier, and H(·) denotes the entropy computation;
the features extracted by the target field batch are sent to a multi-classifier and a multi-two-classifier CbIn, through CbDetermining k critical samples; the specific determination method is as follows:
for target domain batch at CbAt the output of (C), each sample in the target domain batch is at CbDefining the difference between the maximum value and the second maximum value as a classification distance d, searching the first k samples with the minimum classification distance d, judging as critical samples, and recording the classes corresponding to the maximum value and the second maximum value of each critical sample as an A class and a B class respectively; computing a margin value β for similar classes A and B(A,B)(ii) a Determining beta according to the minimum classification distance d corresponding to the critical point(A,B)The following were used:
Figure FDA0002954863570000021
Figure FDA0002954863570000022
Figure FDA0002954863570000023
wherein
Figure FDA0002954863570000024
Representing the target domain batch, xiRepresenting the critical samples found in the target field batch,
Figure FDA0002954863570000025
representing a critical sample xiIn a multi-two classifier CbOutput value of alpha0Is an initial value, mu is a constant coefficient, d represents a critical point xiThe corresponding minimum classification distance;
the method of averaging the margin values by different batchs is as follows:
Figure FDA0002954863570000026
Figure FDA0002954863570000027
two similar classes representing A and B are the average margin value at the tth batch.
3. The method of claim 1, wherein in step S2 the k pairs of similar classes are determined from the target domain batch, and the triplet loss is used to screen effective samples within the k pairs of similar classes of the source domain to construct the source domain batch; the loss function is as follows:

L_G = ||G(F(x_i)) - G(F(x_j))||² - ||G(F(x_i)) - G(F(x_m))||² + β_(A,B),
(x_i, y_i), (x_j, y_j), (x_m, y_m) ∈ X_S, y_i = A, y_j = A, y_m = B,

where X_S denotes the source domain data set, x_i and x_j are class-A samples in the source domain data set, x_m is a class-B sample in the source domain data set, F is the feature extractor, and G is the metric learner; at the same time, the features obtained after dimension reduction are normalized, i.e. ||G(F(x))||_2 = 1;
Constructing the source domain batch comprises the following steps:

Step L1: traverse the k pairs of similar classes; for each pair, randomly draw two samples from the class-A samples of that pair in the source domain as x_i and x_j, and keep both samples fixed;

Step L2: traverse the class-B samples of that pair in the source domain and draw a sample x_m; feed the three samples (x_i, x_j, x_m) in turn into the feature extractor and the metric learner, and compute the loss function L_G; when L_G < 0, continue traversing the class-B samples until a sample with L_G ≥ 0 appears; when L_G ≥ 0, put the triplet (x_i, x_j, x_m) into the source domain batch; if no triplet with L_G ≥ 0 is found after traversing all class-B samples, randomly draw 3 samples from the source domain data set and put them into the source domain batch;

Step L3: the source domain batch adopts the samples screened above for the k pairs of similar classes of the source domain; the remaining samples in the source domain batch are drawn at random from the other classes of the source domain.
4. The method of claim 1, wherein the training of the feature extractor, the multi-classifier and the multi-binary classifier based on the classification information in step S3 is as follows:

The features extracted from the constructed source domain batch are fed into the multi-classifier C_m and the multi-binary classifier C_b, and the difference between the output values and the true values is used to optimize the loss functions L_s^cls and L_s until the loss functions converge, at which time the multi-classifier C_m and the multi-binary classifier C_b are considered trained; the loss functions are specifically as follows:

L_s^cls = (1/n_s) · Σ_{i=1}^{n_s} L_y(y_i', y_i),
L_s = (1/n_s) · Σ_{i=1}^{n_s} Σ_{c=1}^{C_s} L_bce(C_b(F(x_i))_c, 1[y_i = c]),

where n_s denotes the number of samples in the source domain batch, y_i denotes the true label of the i-th sample of the source domain batch, y_i' denotes the predicted label of the i-th sample of the source domain batch, C_s denotes the number of classes of the source domain, L_y is the cross entropy, and L_bce is the binary cross entropy.
5. The method of claim 1, wherein in step S4 the features extracted from the source domain batch and the target domain batch are respectively fed into the domain adversarial network for domain alignment, specifically as follows:

Owing to the gradient reversal layer, the feature extraction part is driven to increase the loss value of the domain classifier D, so that the extracted features confuse D and the difference between the feature distributions of the source domain and the target domain is reduced; the loss function is as follows:

L_d(θ_f, θ_d) = (1/n_s) · Σ_{i=1}^{n_s} L_bce(D(F(x_i)), d_i) + (1/n_t) · Σ_{j=1}^{n_t} L_bce(D(F(x_j)), d_j),
θ̂_d = argmin_{θ_d} L_d(θ_f, θ_d),
θ̂_f = argmax_{θ_f} L_d(θ_f, θ_d),

where D denotes the domain classifier, θ_d and θ_f denote the parameters of the domain classifier and the feature extractor respectively, n_t and n_s denote the numbers of batch samples of the target domain and the source domain respectively, L_bce denotes the binary cross entropy, d_i denotes the true domain of the i-th sample of the source domain batch, and d_j denotes the true domain of the j-th sample of the target domain batch.
CN202110220887.5A 2021-02-26 2021-02-26 Domain adaptation method based on triplet and difference metric, Withdrawn, CN112819098A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110220887.5A CN112819098A (en) Domain adaptation method based on triplet and difference metric

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110220887.5A CN112819098A (en) Domain adaptation method based on triplet and difference metric

Publications (1)

Publication Number Publication Date
CN112819098A 2021-05-18

Family

ID=75862293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110220887.5A Withdrawn CN112819098A (en) 2021-02-26 2021-02-26 Domain self-adaption method based on triple and difference measurement

Country Status (1)

Country Link
CN (1) CN112819098A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850243A (en) * 2021-11-29 2021-12-28 北京的卢深视科技有限公司 Model training method, face recognition method, electronic device and storage medium


Similar Documents

Publication Publication Date Title
CN109949317B (en) Semi-supervised image example segmentation method based on gradual confrontation learning
CN108228915B (en) Video retrieval method based on deep learning
EP3171332A1 (en) Methods and systems for inspecting goods
CN108416370A (en) Image classification method, device based on semi-supervised deep learning and storage medium
CN109522908A (en) Image significance detection method based on area label fusion
US11295240B2 (en) Systems and methods for machine classification and learning that is robust to unknown inputs
CN111598004B (en) Progressive reinforcement self-learning unsupervised cross-domain pedestrian re-identification method
CN111651636A (en) Video similar segment searching method and device
CN110647907B (en) Multi-label image classification algorithm using multi-layer classification and dictionary learning
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN111291705B (en) Pedestrian re-identification method crossing multiple target domains
CN111222546B (en) Multi-scale fusion food image classification model training and image classification method
CN114187595A (en) Document layout recognition method and system based on fusion of visual features and semantic features
Lyu et al. Adaptive fine-grained predicates learning for scene graph generation
CN114357221A (en) Self-supervision active learning method based on image classification
CN112819098A (en) Domain self-adaption method based on triple and difference measurement
CN107688822B (en) Newly added category identification method based on deep learning
CN116258861B (en) Semi-supervised semantic segmentation method and segmentation device based on multi-label learning
CN117152459A (en) Image detection method, device, computer readable medium and electronic equipment
CN111242114B (en) Character recognition method and device
US11328179B2 (en) Information processing apparatus and information processing method
CN115269925A (en) Non-biased scene graph generation method based on hierarchical structure
CN112766354A (en) Knowledge graph-based small sample picture identification method and system
Ma et al. Video-based person re-identification by semi-supervised adaptive stepwise learning
Yu et al. Efficient but lightweight network for vehicle re-identification with center-constraint loss

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WW01: Invention patent application withdrawn after publication (application publication date: 20210518)