CN108256561B - Multi-source domain adaptive migration method and system based on adversarial learning - Google Patents

Multi-source domain adaptive migration method and system based on adversarial learning

Info

Publication number
CN108256561B
CN108256561B (application number CN201711468680.XA)
Authority
CN
China
Prior art keywords
domain
target
source domain
path
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711468680.XA
Other languages
Chinese (zh)
Other versions
CN108256561A (en)
Inventor
林倞
陈子良
王可泽
许瑞嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201711468680.XA priority Critical patent/CN108256561B/en
Publication of CN108256561A publication Critical patent/CN108256561A/en
Application granted granted Critical
Publication of CN108256561B publication Critical patent/CN108256561B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade

Abstract

The invention discloses a multi-source domain adaptive migration method and system based on adversarial learning, wherein the method comprises the following steps: step one, pre-training with the data of each source domain and initializing the representation network and classifiers of the target model; step two, performing multi-path adversarial training with the multi-source-domain data and the target-domain data, and updating the representation network and the multi-path discriminator of the target model; step three, calculating the adversarial score between each source domain and the target domain; step four, classifying the target domain based on the classifiers and the adversarial score of each source domain; step five, selecting high-confidence target-domain pseudo samples to fine-tune the representation network and classifiers of the target model; and step six, returning to step two and repeating steps two to five until the model converges or the maximum number of iterations is reached, then stopping training.

Description

Multi-source domain adaptive migration method and system based on adversarial learning
Technical Field
The invention relates to the technical field of machine learning, and in particular to a multi-source domain adaptive migration method and system based on adversarial learning.
Background
With the continuous generation of large-scale data and the difficulty of labeling information manually, domain adaptive migration has gradually become a very important research topic in the field of machine learning. Domain adaptive learning aims to adapt the feature distributions of data from different domains, improve the performance of classifiers after migration between domains, and solve the problem that target-domain data lack labeling information. Domain adaptive migration is also a key technical means in industry, with important applications in fields such as face recognition, autonomous driving and medical imaging.
Currently, most domain adaptive learning methods focus mainly on the migration process from a single source domain and rely on the assumption that the label set of the single source domain is consistent with that of the target domain. Yaroslav Ganin et al. disclose a single-source domain adaptation method for image classification in "Domain-Adversarial Training of Neural Networks" (Journal of Machine Learning Research, 2016, 17(59):1-35), which performs adversarial learning on the feature distributions of source-domain and target-domain images by introducing a domain classifier, obtains a domain-invariant feature representation, and improves the classification performance on target-domain images after migration. However, this kind of method lacks generality in real-world scenarios and cannot handle the situation where the label space of the source-domain data is inconsistent with that of the target domain.
In addition, Hongfu Liu et al. propose a method that preserves the overall structure of multi-source-domain data in "Structure-Preserved Multi-source Domain Adaptation" (In IEEE 16th International Conference on Data Mining (ICDM), pages 1059-1064, IEEE, 2016) to perform migration of the target task, but this method often ignores the differences between data from different domains and cannot avoid the negative transfer phenomenon in multi-source domain adaptation.
Disclosure of Invention
In order to overcome the above defects in the prior art, the present invention provides a multi-source domain adaptive migration method and system based on adversarial learning, so as to generalize the existing single-source domain adaptation process based on adversarial learning to multi-source domain adaptation, no longer rely on the assumption that the label set of a single source domain is consistent with that of the target domain, and effectively avoid the negative transfer phenomenon in the multi-source domain adaptation process.
To achieve the above and other objects, the present invention provides a multi-source domain adaptive migration method based on adversarial learning, comprising the following steps:
step one, pre-training with the data of each source domain and initializing the representation network and classifiers of the target model;
step two, performing multi-path adversarial training with the multi-source-domain data and the target-domain data, and updating the representation network and the multi-path discriminator of the target model;
step three, calculating the adversarial score between each source domain and the target domain;
step four, classifying the target domain based on the classifiers and the adversarial score of each source domain;
step five, selecting high-confidence target-domain pseudo samples to fine-tune the representation network and classifiers of the target model;
and step six, returning to step two and repeating steps two to five until the model converges or the maximum number of iterations is reached, then stopping training.
Further, step one further comprises:
inputting N labeled source domain data sets and inputting an unlabeled target domain data set;
pre-training the domain-independent representation network F and the domain-dependent multi-way classifier C of the target model using all of the source domain data sets.
Further, the pre-training of the domain-independent representation network F and the domain-dependent multi-way classifier C of the target model using all the source domain data sets is specifically based on the following optimization objective:

$$\min_{F,C}\ \sum_{j=1}^{N}\ \mathbb{E}_{(x,y)\sim p_{s_j}}\Big[\mathcal{L}_{cls}\big(C^{s_j}(F(x)),\,y\big)\Big] \tag{1}$$

which updates the parameters of the representation network and the multi-way classifier in the target model, where $\mathcal{L}_{cls}$ denotes the multi-way classification loss, whose specific form is chosen according to the task, $C^{s_j}$ denotes the $s_j$-th classifier, $\mathbb{E}$ denotes the expectation over the loss values of all samples, and $F(x)$ denotes the feature encoding of image x after passing through the representation network F.
Further, step two further comprises:
performing feature extraction on the images of the multi-source domains and the target domain using the representation network;
pairing each source domain with the target domain, feeding each pair into the multi-path discriminator network D for discrimination training, and updating the representation network and the multi-path discriminator of the target model.
Further, the update strategy of the multi-path discriminator network D is to distinguish, as far as possible, whether the input features come from the source domain or the target domain, while the update strategy of the representation network is to confuse the features as far as possible, so that the discriminator network cannot distinguish whether the input features come from the source domain or the target domain.
Further, in step two, the loss functions for updating the multi-path discriminator and the representation network are optimized using their least-squares forms.
Further, in step three, the loss value of each path of the discriminator is accumulated as the adversarial score of the corresponding source domain and the target domain.
Further, in step four, the samples of the target domain are classified and assigned pseudo labels according to the adversarial scores obtained in step three and the representation network F and the multi-path classifier C of the target model.
Further, in step five, samples whose confidence is greater than a set threshold are selected on the basis of step four to form a target-domain pseudo-sample set, and the multi-path classifier of the target model is fine-tuned to obtain more effective and separable feature encodings on the target domain.
In order to achieve the above objects, the present invention further provides a multi-source domain adaptive migration system based on adversarial learning, comprising:
a pre-training unit, configured to pre-train using the data of each source domain and initialize the representation network and classifiers of the target model;
a multi-path adversarial unit, configured to perform multi-path adversarial training using the multi-source-domain data and the target-domain data and to update the representation network and the multi-path discriminator of the target model;
an adversarial score calculation unit, configured to calculate the adversarial score between each source domain and the target domain;
a classification unit, configured to classify the target domain based on the classifiers and the adversarial score of each source domain;
and a fine-tuning unit, configured to select high-confidence target-domain pseudo samples to fine-tune the representation network and classifiers of the target model, and to return to the multi-path adversarial unit for training until the model converges or the maximum number of iterations is reached.
Compared with the prior art, the present invention generalizes the existing single-source domain adaptation process to multi-source domain adaptation, no longer relies on the assumption that the label set of a single source domain is consistent with that of the target domain, and is therefore more generally applicable in real scenarios. In addition, the invention adapts the features of different domains based on adversarial learning, thereby effectively avoiding the negative transfer phenomenon and significantly improving the classification performance after domain adaptation.
Drawings
FIG. 1 is a flowchart illustrating the steps of the multi-source domain adaptive migration method based on adversarial learning according to the present invention.
Fig. 2 is a flowchart of the multi-source domain adaptive migration method based on adversarial learning according to an embodiment of the present invention, taking two source domains as an example.
Fig. 3 is a schematic diagram of the network framework, taking two source domains as an example, according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the visualization effect of two source domains (A, D) migrating to a target domain (W) before and after domain adaptation in an embodiment of the invention.
FIG. 5 is a system architecture diagram of the multi-source domain adaptive migration system based on adversarial learning according to the present invention.
Detailed Description
Other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein, which describes embodiments of the present invention with reference to the accompanying drawings. The invention is capable of other and different embodiments, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
Fig. 1 is a flowchart illustrating the steps of the multi-source domain adaptive migration method based on adversarial learning according to an embodiment of the present invention, and Fig. 2 is a flowchart of the method according to an embodiment of the present invention. As shown in Fig. 1 and Fig. 2, the multi-source domain adaptive migration method based on adversarial learning of the present invention comprises the following steps:
Step 101: pre-train using the data of each source domain and initialize the representation network and classifiers of the target model.
Specifically, step 101 further includes:
step S100, inputting N labeled source domain data sets, the distribution of which is expressed as
Figure BDA0001531599170000062
Wherein s isjDenotes the jth source field, and x and y denote the sample image and the corresponding label, respectively. Assume data collections for source domains
Figure BDA0001531599170000063
The samples are taken from different distributions, among which
Figure BDA0001531599170000064
And
Figure BDA0001531599170000065
respectively representing data from a source domain sjAnd corresponding label, while inputting a label-free target domain data set, whose distribution is denoted as pt(x, y) corresponding image sets are denoted as
Figure BDA0001531599170000066
In the embodiment of the present invention, taking two source domains as an example, i.e., inputting the images of the source domains S1 and S2 and the corresponding tags, the image of the target domain T is input;
step S101, pre-training a target model for the domain-independent representation network F and the domain-dependent multi-way classifier C by using all the source domain data sets, namely updating parameters representing the network F and the multi-way classifier C in the target model according to the following optimization targets:
Figure BDA0001531599170000061
wherein
Figure BDA0001531599170000067
Represents a loss function of the multi-path classification, and
Figure BDA0001531599170000068
indicating the type of loss function that is specifically chosen,
Figure BDA0001531599170000069
denotes the s thjA way classifier, E represents the expectation of all sample loss values, and F (x) represents the feature coding of the image x after the image x passes through the representation network F.
In a specific embodiment of the present invention, the sum of the intersections of the tag sets of the multi-source domain data is equal to the tag set of the target domain, i.e. the tag set of the target domain
Figure BDA0001531599170000073
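The following is a minimal PyTorch sketch of this pre-training stage (step S101). The small convolutional encoder, the 256-dimensional feature size, the two 10-class classifier heads, the cross-entropy loss and the SGD settings are illustrative assumptions rather than the patent's concrete architecture; only the structure of the objective (a shared representation network F plus one classifier head per source domain, trained on all labeled source samples) follows the text above.

```python
# Sketch of step S101: shared representation network F and one classifier head
# C^{s_j} per source domain, trained jointly on labeled source samples (Eq. (1)).
import torch
import torch.nn as nn
import torch.nn.functional as TF

class RepresentationNet(nn.Module):
    """F: domain-independent feature encoder (toy CNN, assumed)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class MultiWayClassifier(nn.Module):
    """C: one classification head per source domain s_j."""
    def __init__(self, feat_dim, classes_per_source):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(feat_dim, c) for c in classes_per_source)

    def forward(self, feats, j):
        return self.heads[j](feats)          # logits of the s_j-th classifier

def pretrain_step(F_net, C, optimizer, source_batches):
    """One step of Eq. (1): sum of classification losses over all source domains.
    `source_batches` is a list of (images, labels), one batch per source domain."""
    optimizer.zero_grad()
    loss = sum(TF.cross_entropy(C(F_net(x), j), y)
               for j, (x, y) in enumerate(source_batches))
    loss.backward()
    optimizer.step()
    return loss.item()

# toy usage with random tensors: two source domains, as in the embodiment
F_net = RepresentationNet()
C = MultiWayClassifier(256, classes_per_source=[10, 10])
opt = torch.optim.SGD(list(F_net.parameters()) + list(C.parameters()), lr=1e-3)
batches = [(torch.randn(8, 3, 64, 64), torch.randint(0, 10, (8,))) for _ in range(2)]
print(pretrain_step(F_net, C, opt, batches))
```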
Step 102: perform multi-path adversarial training using the multi-source-domain data and the target-domain data, and update the representation network and the multi-path discriminator of the target model. Specifically, the parameters of the current multi-path classifier C are fixed and the target-domain image data are introduced for multi-path adversarial training. Step 102 further comprises:
Step S200: perform feature extraction on the images of the multi-source domains and the target domain using the representation network F; in the specific embodiment of the invention, the feature representations of the source domains S1 and S2 and the target domain T are obtained;
step S201, every source domain SjAnd the target domain T respectively form a pair, such as S1 and T, and S2 and T, and the pair is input into a multi-path arbiter network D for decision training, and a representation network of the target model and the multi-path arbiter are updated. In a specific embodiment of the present invention, the update strategy of the multi-way discriminator network D is to distinguish as much as possible whether the input features come from the source domain or the target domain; the update strategy representing the network is to confuse the features as much as possible so that the discriminator network cannot distinguish whether the input features are from the source domain or the target domain. This antagonistic process is formulated as follows:
Figure BDA0001531599170000071
wherein the classification loss function
Figure BDA0001531599170000074
As shown in equation (1) (but the parameters of classifier C are not updated), but the penalty function is combated
Figure BDA0001531599170000075
Expressed as:
Figure BDA0001531599170000072
wherein
Figure BDA0001531599170000076
Denotes the s thjAnd a path discriminator, E represents the expectation of the corresponding loss value, and F (x) represents the feature coding of the image x after passing through the representation network F.
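A minimal PyTorch sketch of the multi-path discriminator and of the adversarial loss of equation (3) is given below. The two-layer MLP heads, the 256-dimensional feature size and the realization of equation (3) as a binary cross-entropy over source/target labels are assumptions made for illustration; only the pairing of each source domain with the target domain through a dedicated discriminator path follows the text above.

```python
# Sketch of step S201: each source domain s_j is paired with the target domain T
# and judged by the s_j-th path D^{s_j} of a multi-path discriminator D.
import torch
import torch.nn as nn

class MultiWayDiscriminator(nn.Module):
    """One binary discriminator head D^{s_j} per source domain (assumed MLP heads)."""
    def __init__(self, feat_dim, num_sources):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))
            for _ in range(num_sources))

    def forward(self, feats, j):
        return self.paths[j](feats)           # logit that `feats` come from source s_j

def adversarial_loss(D, src_feats_list, tgt_feats):
    """Binary cross-entropy form of Eq. (3): the s_j-th path is trained to output 1
    on features of source s_j and 0 on features of the target domain."""
    bce = nn.BCEWithLogitsLoss()
    loss = 0.0
    for j, fs in enumerate(src_feats_list):
        loss = loss + bce(D(fs, j), torch.ones(fs.size(0), 1)) \
                    + bce(D(tgt_feats, j), torch.zeros(tgt_feats.size(0), 1))
    return loss

# toy usage: two source domains, 256-dimensional features
D = MultiWayDiscriminator(256, num_sources=2)
src = [torch.randn(8, 256), torch.randn(8, 256)]
tgt = torch.randn(8, 256)
print(adversarial_loss(D, src, tgt).item())
```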
Preferably, in step S201, the multi-path adversarial process back-propagates the gradient of the hard samples to update the representation network F of the target model. Specifically, among all source domains $\{s_j\}_{j=1}^{N}$, the source domain $\hat{s}$ whose pair with the target domain yields the hardest adversarial loss over the M samples of the current iteration is selected, and the adversarial loss between the source domain $\hat{s}$ and the target domain is back-propagated to update the representation network, where M is the number of samples in the current iteration.
Preferably, in order to stabilize the adversarial training process, the loss functions used in step 102 to update the multi-path discriminator and the representation network are optimized using their least-squares forms, i.e., a least-squares loss is used to optimize the multi-path discriminator, and the corresponding least-squares confusion loss is used to optimize the representation network.
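Below is a hypothetical sketch of the least-squares variant of the multi-path adversarial update together with the hard-source selection described above, building on the RepresentationNet and MultiWayDiscriminator sketches. The 1/0 least-squares targets for the discriminator, the 0.5 confusion target for the representation network, and the use of the largest per-pair loss as the criterion for the "hard" source are assumptions chosen for illustration; the text above only states that least-squares losses are used and that the hardest source-target pair drives the update of F.

```python
# Sketch of the least-squares multi-path adversarial step with hard-source selection.
import torch

def discriminator_step(D, F_net, d_optimizer, src_batches, tgt_batch):
    """Update every path D^{s_j}: push source features toward 1 and target features
    toward 0 with a least-squares loss; F is not updated here (features detached)."""
    d_optimizer.zero_grad()
    tgt_feats = F_net(tgt_batch).detach()
    loss = 0.0
    for j, xs in enumerate(src_batches):
        src_feats = F_net(xs).detach()
        loss = loss + ((D(src_feats, j) - 1) ** 2).mean() + (D(tgt_feats, j) ** 2).mean()
    loss.backward()
    d_optimizer.step()
    return loss.item()

def representation_step(D, F_net, f_optimizer, src_batches, tgt_batch):
    """Update F with the gradient of the hard source only: compute a least-squares
    confusion loss for every (source, target) pair, then back-propagate the loss of
    the pair that is currently hardest to confuse (largest loss, an assumption)."""
    f_optimizer.zero_grad()
    tgt_feats = F_net(tgt_batch)
    per_pair = []
    for j, xs in enumerate(src_batches):
        src_feats = F_net(xs)
        per_pair.append(((D(src_feats, j) - 0.5) ** 2).mean()
                        + ((D(tgt_feats, j) - 0.5) ** 2).mean())
    hard = max(range(len(per_pair)), key=lambda j: per_pair[j].item())
    per_pair[hard].backward()
    f_optimizer.step()
    return hard, per_pair[hard].item()
```

In a training loop these two steps would alternate in every iteration, with `d_optimizer` holding the parameters of D and `f_optimizer` holding the parameters of F.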
Step 103: calculate the adversarial score between each source domain and the target domain. In an embodiment of the present invention, the loss value of each path of the discriminator is accumulated as the adversarial score of the corresponding source domain and the target domain (characterizing the inter-domain similarity).
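A minimal sketch of this accumulation is shown below; reusing the per-pair least-squares loss from the previous sketch, and accumulating over whole iterations rather than, say, keeping a normalized running average, are assumptions.

```python
# Sketch of step 103: accumulate each path's discriminator loss as the adversarial
# score of the corresponding (source domain, target domain) pair; a larger score
# means the pair is harder to separate, i.e. the two domains are more similar.
import torch

def update_adversarial_scores(scores, D, F_net, src_batches, tgt_batch):
    with torch.no_grad():
        tgt_feats = F_net(tgt_batch)
        for j, xs in enumerate(src_batches):
            src_feats = F_net(xs)
            loss_j = ((D(src_feats, j) - 1) ** 2).mean() + (D(tgt_feats, j) ** 2).mean()
            scores[j] += loss_j.item()
    return scores

# usage inside the training loop, with one running score per source domain:
# scores = [0.0, 0.0]
# scores = update_adversarial_scores(scores, D, F_net, [x_s1, x_s2], x_t)
```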
Step 104: classify the target domain based on the classifiers and the adversarial score of each source domain.
Specifically, the samples of the target domain are classified and assigned pseudo labels according to the adversarial scores obtained in step 103 and the representation network F and the multi-path classifier C of the target model. In particular, for the i-th sample $x_i^t$ in the target domain, the confidence with which the target model labels it as class c is

$$\mathrm{Conf}\big(c\,\big|\,x_i^t\big)=\frac{\sum_{j:\,c\in\mathcal{Y}_{s_j}} score(t,s_j)\cdot C_c^{s_j}\big(F(x_i^t)\big)}{\sum_{k:\,c\in\mathcal{Y}_{s_k}} score(t,s_k)}$$

where $C_c^{s_j}(F(x_i^t))$ denotes the probability with which the $s_j$-th classifier classifies sample $x_i^t$ as class c, $score(t,s_k)$ denotes the adversarial score between the target domain and source domain $s_k$ calculated in step 103, and the $s_j$-th classifier participates in computing the confidence of class c only when class c belongs to the label set of source domain $s_j$.
Intuitively, the target model extracts the features of the image through the representation network F, classifies the features with the multi-path classifier, and takes the adversarial scores as weights to compute a weighted average of the classification results; the larger the adversarial score, the closer the corresponding source domain is to the target domain, and the more reliable the classification result of that path's classifier.
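A minimal sketch of this score-weighted pseudo-labeling is given below, reusing the earlier RepresentationNet and MultiWayClassifier sketches. The `label_sets` argument (the global class labels owned by each source domain), the use of softmax probabilities and the 0.9 threshold are illustrative assumptions.

```python
# Sketch of step 104: adversarial scores weight the per-source classifier outputs;
# only classifiers whose source label set contains class c vote for class c.
import torch
import torch.nn.functional as TF

def target_confidences(F_net, C, scores, label_sets, x_t, num_classes):
    """Return a (batch, num_classes) tensor of confidences for target images x_t."""
    with torch.no_grad():
        feats = F_net(x_t)
        conf = torch.zeros(x_t.size(0), num_classes)
        weight = torch.zeros(num_classes)
        for j, classes in enumerate(label_sets):
            probs = TF.softmax(C(feats, j), dim=1)     # probabilities of the s_j-th classifier
            for local_idx, c in enumerate(classes):     # local head index -> global class label c
                conf[:, c] += scores[j] * probs[:, local_idx]
                weight[c] += scores[j]
    return conf / weight.clamp(min=1e-8)                # weighted average over the voting sources

def select_pseudo_samples(conf, threshold=0.9):
    """High-confidence selection used in step 105: keep samples whose best class
    confidence exceeds the threshold, together with their pseudo labels."""
    confidence, label = conf.max(dim=1)
    keep = confidence > threshold
    return keep, label
```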
Step 105: select high-confidence target-domain pseudo samples to fine-tune the representation network and classifiers of the target model.
In the embodiment of the invention, on the basis of step 104, the samples whose confidence is greater than a set threshold are selected to form the target-domain pseudo-sample set $\hat{X}_t$, and the multi-path classifier of the target model is fine-tuned to obtain more effectively separable feature encodings on the target domain. Specifically, the representation network F and the multi-way classifier C of the target model are updated based on an optimization objective of the same form as equation (1), evaluated on the pseudo-sample set $\hat{X}_t$: when the label set $\mathcal{Y}_{s_j}$ of source domain $s_j$ contains the pseudo label $\hat{y}$ of a sample, the corresponding $s_j$-th classifier is updated on that sample.
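The following sketch shows one fine-tuning step on the pseudo-sample set. Whether labeled source batches are mixed back in during fine-tuning is not specified in the text, so this sketch trains on pseudo samples only; that choice, the cross-entropy loss and the mapping from global labels to per-head class indices are assumptions.

```python
# Sketch of step 105: fine-tune F and C on high-confidence target pseudo samples;
# every classifier whose source label set contains a sample's pseudo label is
# updated on that sample.
import torch
import torch.nn.functional as TF

def finetune_step(F_net, C, optimizer, pseudo_x, pseudo_y, label_sets):
    optimizer.zero_grad()
    feats = F_net(pseudo_x)
    loss = 0.0
    for j, classes in enumerate(label_sets):
        global_to_local = {c: i for i, c in enumerate(classes)}
        mask = torch.tensor([int(y) in global_to_local for y in pseudo_y])
        if mask.any():
            local_y = torch.tensor([global_to_local[int(y)] for y in pseudo_y[mask]])
            loss = loss + TF.cross_entropy(C(feats[mask], j), local_y)
    if torch.is_tensor(loss):       # at least one classifier saw a matching pseudo label
        loss.backward()
        optimizer.step()
        return loss.item()
    return 0.0
```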
Step 106: return to step 102 and repeat steps 102 to 105 until the model converges or the maximum number of iterations is reached, then stop training.
The invention will be illustrated by the following specific example in conjunction with Fig. 2. In the embodiment of the invention, two source domains are taken as an example; the open-source deep learning framework PyTorch and the visualization tool t-SNE in the open-source machine learning library Scikit-learn are used, and the specific process is as follows:
(1) Feature extraction of source-domain and target-domain images (left dashed box in Fig. 3)
N (here, N = 2) labeled source domain data sets are input (corresponding to A and D in Fig. 4, respectively), whose distributions are denoted as $\{p_{s_j}(x,y)\}_{j=1}^{N}$, where $s_j$ denotes the j-th source domain, and x and y denote a sample image and the corresponding label, respectively. The data sets of the source domains, $\{(X_{s_j},Y_{s_j})\}_{j=1}^{N}$, are assumed to be sampled from different distributions, where $X_{s_j}$ and $Y_{s_j}$ denote the data from source domain $s_j$ and the corresponding labels, respectively. At the same time, an unlabeled target-domain data set (corresponding to W in Fig. 4) is input, whose distribution is denoted as $p_t(x,y)$ and whose image set is denoted as $X_t$.
In each iteration, the same number of training samples is randomly sampled from each source domain and from the target domain, and their features are extracted by the parameter-shared representation network F.
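A minimal sketch of this balanced per-iteration sampling is shown below; the use of standard PyTorch DataLoaders with a common batch size, `shuffle=True` and `drop_last=True`, and the random stand-in tensors, are assumptions for illustration.

```python
# Sketch of the sampling scheme: each iteration draws the same number of images
# from S1, S2 and T, all of which pass through the parameter-shared network F.
import torch
from torch.utils.data import DataLoader, TensorDataset

batch_size = 32
# stand-in datasets; in practice these would be the image datasets of S1, S2 and T
datasets = {name: TensorDataset(torch.randn(256, 3, 64, 64)) for name in ("S1", "S2", "T")}
loaders = {name: DataLoader(ds, batch_size=batch_size, shuffle=True, drop_last=True)
           for name, ds in datasets.items()}

for (x_s1,), (x_s2,), (x_t,) in zip(loaders["S1"], loaders["S2"], loaders["T"]):
    # equally sized batches from both source domains and the target domain
    assert x_s1.size(0) == x_s2.size(0) == x_t.size(0) == batch_size
    break
```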
(2) Multi-path adversarial training on the image features of the source domains and the target domain (dashed box in Fig. 3)
Based on the extracted image features, each source domain $s_j$ is paired with the target domain t, and each pair is fed into the multi-path discriminator network D for discrimination. The update strategy of the discriminator network is to distinguish, as far as possible, whether the input features come from the source domain or the target domain; the update strategy of the representation network is to confuse the features as far as possible, so that the discriminator network cannot distinguish whether the input features come from the source domain or the target domain.
Because adversarial training is prone to the vanishing-gradient problem, the loss functions for updating the multi-path discriminator and the representation network are optimized using their least-squares forms, i.e., a least-squares loss is used to optimize the multi-path discriminator and the corresponding least-squares confusion loss is used to optimize the representation network, where $D^{s_j}$ denotes the $s_j$-th path of the discriminator.
Because multi-source domain adaptive learning suffers from the adverse phenomenon of negative transfer, the gradient of the hard samples is back-propagated in the multi-path adversarial process to update the representation network of the target model. Specifically, among all source domains $\{s_j\}_{j=1}^{N}$, the source domain $\hat{s}$ whose pair with the target domain yields the hardest adversarial loss over the M samples of the current iteration is selected, and the adversarial loss between the source domain $\hat{s}$ and the target domain is back-propagated to update the representation network, where M is the number of samples in the current iteration.
Meanwhile, the loss value of each path of the discriminator is accumulated as the adversarial score of the corresponding source domain and the target domain, which characterizes the inter-domain similarity. The larger the loss value of the discriminator, the more confusable and the closer the features of the corresponding source domain and the target domain are.
(3) Multi-path classification of target-domain samples (right dashed box in Fig. 3)
The samples of the target domain are classified and assigned pseudo labels according to the adversarial scores obtained in step (2) and the representation network F and the multi-path classifier C of the target model. In particular, for the i-th sample $x_i^t$ in the target domain, the confidence with which the target model labels it as class c is

$$\mathrm{Conf}\big(c\,\big|\,x_i^t\big)=\frac{\sum_{j:\,c\in\mathcal{Y}_{s_j}} score(t,s_j)\cdot C_c^{s_j}\big(F(x_i^t)\big)}{\sum_{k:\,c\in\mathcal{Y}_{s_k}} score(t,s_k)}$$

where $C_c^{s_j}(F(x_i^t))$ denotes the probability with which the $s_j$-th classifier classifies sample $x_i^t$ as class c, $score(t,s_k)$ denotes the adversarial score between the target domain and source domain $s_k$ calculated in the multi-path adversarial process, and the $s_j$-th classifier participates in computing the confidence of class c only when class c belongs to the label set of source domain $s_j$. Intuitively, the target model extracts the features of the image through the representation network F, classifies the features with the multi-path classifier, and takes the adversarial scores as weights to compute a weighted average of the classification results; the larger the adversarial score, the closer the corresponding source domain is to the target domain, and the more reliable the classification result of that path's classifier. On this basis, the samples whose confidence is greater than a set threshold are selected to form the target-domain pseudo-sample set $\hat{X}_t$, and the multi-path classifier of the target model is fine-tuned to obtain more effectively separable feature encodings on the target domain.
FIG. 4 illustrates the visualization effect of the two source domains (A, D) migrating to the target domain (W) before and after domain adaptation, with different marker shapes representing different categories. For clarity of display, the features of the two source domains and the target domain are shown pair by pair. Comparing Fig. 4(3) with Fig. 4(1), and Fig. 4(4) with Fig. 4(2), it is easy to see that after applying the multi-source domain adaptive migration method of the present invention, the margins between different classes are enlarged and the classes are more separable, which helps improve the classification accuracy on target-domain images. Meanwhile, comparing Fig. 4(4) with Fig. 4(3) shows that the domain adaptation effect of D → W is better than that of A → W, which is consistent with the magnitudes of the adversarial scores, indicating that the method of the present invention can distinguish the differences between domains and avoid the adverse phenomenon of negative transfer during inter-domain adaptation.
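A sketch of how a feature visualization like Fig. 4 can be produced with the t-SNE tool from Scikit-learn mentioned above is shown below; the perplexity value, the PCA initialization and the plotting details are assumptions.

```python
# Sketch of the Fig. 4-style visualization: features of one source domain and the
# target domain, extracted by F before or after adaptation, embedded into 2-D.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(src_feats, tgt_feats, title):
    """src_feats, tgt_feats: (n, d) numpy arrays of features produced by network F."""
    feats = np.concatenate([src_feats, tgt_feats], axis=0)
    emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(feats)
    n = len(src_feats)
    plt.scatter(emb[:n, 0], emb[:n, 1], s=8, label="source (e.g. D)")
    plt.scatter(emb[n:, 0], emb[n:, 1], s=8, marker="x", label="target (W)")
    plt.title(title)
    plt.legend()
    plt.show()

# usage with random stand-in features
plot_tsne(np.random.randn(200, 256), np.random.randn(200, 256), "before adaptation")
```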
FIG. 5 is a system architecture diagram of the multi-source domain adaptive migration system based on adversarial learning according to the present invention. As shown in Fig. 5, the multi-source domain adaptive migration system based on adversarial learning of the present invention comprises:
a pre-training unit 501, configured to pre-train using the data of each source domain and initialize the representation network and classifiers of the target model.
Specifically, the pre-training unit 501 further includes:
an input module for inputting the labeled N source domain data sets, the distribution of which is expressed as
Figure BDA0001531599170000121
Wherein s isjDenotes the jth source field, and x and y denote the sample image and the corresponding label, respectively. Assume data collections for source domains
Figure BDA0001531599170000122
The samples are taken from different distributions, among which
Figure BDA0001531599170000123
And
Figure BDA0001531599170000124
respectively representing data from a source domain sjAnd the corresponding label, while the input unit also inputs a label-free target field dataset, whose distribution is denoted pt(x, y) corresponding image sets are denoted as
Figure BDA0001531599170000125
In the embodiment of the present invention, two source fields are taken as an example, i.e., the images and corresponding labels of the input source fields S1 and S2Entering an image of a target domain T;
a pre-training module for pre-training the object model for the domain-independent representation network F and the domain-dependent multi-way classifier C using all source domain datasets, i.e. according to the optimization object
Figure BDA0001531599170000131
Updating parameters representing the network and the multi-way classifier in the object model, wherein
Figure BDA0001531599170000132
Indicating the type of loss function selected for the loss,
Figure BDA0001531599170000133
denotes the s thjA way classifier.
In a specific embodiment of the present invention, the sum of the intersections of the tag sets of the multi-source domain data is equal to the tag set of the target domain, i.e. the tag set of the target domain
Figure BDA0001531599170000134
a multi-path adversarial unit 502, configured to perform multi-path adversarial training using the multi-source-domain data and the target-domain data and to update the representation network and the multi-path discriminator of the target model. Specifically, the multi-path adversarial unit 502 fixes the parameters of the current multi-path classifier C and introduces the target-domain image data for multi-path adversarial training; the multi-path adversarial unit 502 further comprises:
a feature extraction module, configured to extract features from the images of the multi-source domains and the target domain using the representation network F; in the specific embodiment of the invention, the feature representations of the source domains S1, S2 and the target domain T are obtained;
a training update module, configured to pair each source domain $s_j$ with the target domain T (e.g., S1 with T, and S2 with T), feed each pair into the multi-path discriminator network D for discrimination training, and update the representation network and the multi-path discriminator of the target model. In a specific embodiment of the present invention, the update strategy of the multi-path discriminator network D is to distinguish, as far as possible, whether the input features come from the source domain or the target domain, while the update strategy of the representation network is to confuse the features as far as possible, so that the discriminator network cannot distinguish whether the input features come from the source domain or the target domain.
Preferably, in the multi-path adversarial unit 502, the gradient of the hard samples is back-propagated in the multi-path adversarial process to update the representation network of the target model.
Preferably, in order to stabilize the adversarial training process, the loss functions used in the multi-path adversarial unit 502 to update the multi-path discriminator and the representation network are optimized using their least-squares forms.
an adversarial score calculation unit 503, configured to calculate the adversarial score between each source domain and the target domain. In an embodiment of the present invention, the adversarial score calculation unit 503 accumulates the loss value of each path of the discriminator as the adversarial score of the corresponding source domain and the target domain (characterizing the inter-domain similarity).
a classification unit 504, configured to classify the target domain based on the classifiers and the adversarial score of each source domain.
Specifically, the samples of the target domain are classified and assigned pseudo labels according to the adversarial scores obtained by the adversarial score calculation unit 503 and the representation network F and the multi-path classifier C of the target model.
Intuitively, the target model extracts the features of the image through the representation network F, classifies the features with the multi-path classifier, and takes the adversarial scores as weights to compute a weighted average of the classification results; the larger the adversarial score, the closer the corresponding source domain is to the target domain, and the more reliable the classification result of that path's classifier.
a fine-tuning unit 505, configured to select high-confidence target-domain pseudo samples to fine-tune the representation network and classifiers of the target model, and to return to the multi-path adversarial unit 502 for training until the model converges or the maximum number of iterations is reached.
In an embodiment of the present invention, the fine-tuning unit 505 selects, on the basis of the classification unit 504, the samples whose confidence is greater than a set threshold to form the target-domain pseudo-sample set $\hat{X}_t$, and fine-tunes the multi-path classifier of the target model to obtain more effectively separable feature encodings on the target domain.
In summary, the present invention generalizes the existing single-source domain adaptation process to multi-source domain adaptation, so that it no longer relies on the assumption that the label set of a single source domain is consistent with that of the target domain, and is therefore more generally applicable in real scenarios. In addition, the invention adapts the features of different domains based on adversarial learning, thereby effectively avoiding the negative transfer phenomenon and significantly improving the classification performance after domain adaptation.
The foregoing embodiments are merely illustrative of the principles and effects of the present invention and are not intended to limit the invention. Those skilled in the art can modify and vary the above-described embodiments without departing from the spirit and scope of the present invention. Therefore, the scope of the invention should be determined by the appended claims.

Claims (8)

1. A multi-source domain adaptive migration method based on adversarial learning, comprising the following steps:
step one, obtaining labeled source-domain data of a plurality of source domains and unlabeled target-domain data, and pre-training and initializing a representation network and a multi-path classifier of a target model using the source-domain data of each source domain, wherein each source-domain data comprises image data and corresponding labels, and the target-domain data comprises image data; step one further comprising:
inputting N labeled source domain data sets and inputting an unlabeled target domain data set,
pre-training a domain-independent representation network F and a domain-dependent multi-way classifier C for the target model using all source domain data sets, the pre-training step being specifically based on the following optimization objective:

$$\min_{F,C}\ \sum_{j=1}^{N}\ \mathbb{E}_{(x,y)\sim p_{s_j}}\Big[\mathcal{L}_{cls}\big(C^{s_j}(F(x)),\,y\big)\Big]$$

updating the parameters of the representation network F and the multi-way classifier C in the target model, where $\mathcal{L}_{cls}$ denotes the multi-way classification loss, whose specific form is chosen according to the task, $C^{s_j}$ denotes the $s_j$-th classifier, $\mathbb{E}$ denotes the expectation over the loss values of all samples, and $F(x)$ denotes the feature encoding of image x after passing through the representation network F;
step two, fixing the parameters of the current multi-path classifier, introducing the target-domain data, performing multi-path adversarial training using the source-domain data of the plurality of source domains and the target-domain data, and updating the representation network and the multi-path discriminator of the target model;
step three, calculating the adversarial score between each corresponding source domain and the target domain based on the loss value of each path of the discriminator;
step four, classifying the samples of the target domain based on the multi-path classifier and the adversarial score of each source domain, and assigning pseudo labels;
step five, selecting high-confidence target-domain pseudo samples to fine-tune the representation network and the multi-path classifier of the target model, and obtaining more effective and separable feature encodings on the target domain;
and step six, returning to step two and repeating steps two to five until the model converges or the maximum number of iterations is reached, then stopping training.
2. The multi-source domain adaptive migration method based on adversarial learning according to claim 1, wherein step two further comprises:
performing feature extraction on the images of the multi-source domain and the target domain by using a representation network;
pairing each source domain with the target domain, feeding each pair into the multi-path discriminator network D for discrimination training, and updating the representation network and the multi-path discriminator of the target model.
3. The multi-source domain adaptive migration method based on adversarial learning according to claim 2, wherein the update strategy of the multi-path discriminator network D is to distinguish, as far as possible, whether the input features come from the source domain or the target domain, while the update strategy of the representation network is to confuse the features as far as possible, so that the discriminator network cannot distinguish whether the input features come from the source domain or the target domain.
4. The multi-source domain adaptive migration method based on adversarial learning according to claim 3, wherein in step two, the loss functions for updating the multi-path discriminator and the representation network are optimized using their least-squares forms.
5. The multi-source domain adaptive migration method based on adversarial learning according to claim 4, wherein in step three, the loss value of each path of the discriminator is accumulated as the adversarial score of the corresponding source domain and the target domain.
6. The multi-source domain adaptive migration method based on adversarial learning according to claim 1, wherein in step four, the samples of the target domain are classified and assigned pseudo labels according to the adversarial scores obtained in step three and the representation network F and the multi-path classifier C of the target model.
7. The multi-source domain adaptive migration method based on adversarial learning according to claim 1, wherein in step five, samples whose confidence is greater than a set threshold are selected on the basis of step four to form a target-domain pseudo-sample set, and the multi-path classifier of the target model is fine-tuned to obtain more effective and separable feature encodings on the target domain.
8. A multi-source domain adaptive migration system based on adversarial learning, comprising:
a pre-training unit, configured to obtain labeled source-domain data of a plurality of source domains and unlabeled target-domain data, and to pre-train and initialize a representation network and a multi-path classifier of a target model using the source-domain data of each source domain, wherein each source-domain data comprises image data and corresponding labels, the target-domain data comprises image data, and the pre-training step is specifically based on the following optimization objective:

$$\min_{F,C}\ \sum_{j=1}^{N}\ \mathbb{E}_{(x,y)\sim p_{s_j}}\Big[\mathcal{L}_{cls}\big(C^{s_j}(F(x)),\,y\big)\Big]$$

updating the parameters of the representation network F and the multi-way classifier C in the target model, where $\mathcal{L}_{cls}$ denotes the multi-way classification loss, whose specific form is chosen according to the task, $C^{s_j}$ denotes the $s_j$-th classifier, $\mathbb{E}$ denotes the expectation over the loss values of all samples, and $F(x)$ denotes the feature encoding of image x after passing through the representation network F;
a multi-path adversarial unit, configured to fix the parameters of the current multi-path classifier, introduce the target-domain data, perform multi-path adversarial training using the source-domain data of the plurality of source domains and the target-domain data, and update the representation network and the multi-path discriminator of the target model;
an adversarial score calculation unit, configured to calculate the adversarial score between each corresponding source domain and the target domain based on the loss value of each path of the discriminator;
a classification unit, configured to classify the samples of the target domain based on the multi-path classifier and the adversarial score of each source domain, and to assign pseudo labels;
and a fine-tuning unit, configured to select high-confidence target-domain pseudo samples to fine-tune the representation network and the multi-path classifier of the target model, obtain more effective and separable feature encodings on the target domain, and return to the multi-path adversarial unit for training until the model converges or the maximum number of iterations is reached.
CN201711468680.XA 2017-12-29 2017-12-29 Multi-source domain adaptive migration method and system based on adversarial learning Active CN108256561B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711468680.XA CN108256561B (en) 2017-12-29 2017-12-29 Multi-source domain adaptive migration method and system based on adversarial learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711468680.XA CN108256561B (en) 2017-12-29 2017-12-29 Multi-source domain adaptive migration method and system based on adversarial learning

Publications (2)

Publication Number Publication Date
CN108256561A CN108256561A (en) 2018-07-06
CN108256561B true CN108256561B (en) 2020-06-16

Family

ID=62724910

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711468680.XA Active CN108256561B (en) 2017-12-29 2017-12-29 Multi-source domain adaptive migration method and system based on adversarial learning

Country Status (1)

Country Link
CN (1) CN108256561B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11875270B2 (en) 2020-12-08 2024-01-16 International Business Machines Corporation Adversarial semi-supervised one-shot learning

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710636B (en) * 2018-11-13 2022-10-21 广东工业大学 Unsupervised industrial system anomaly detection method based on deep transfer learning
CN109523018B (en) * 2019-01-08 2022-10-18 重庆邮电大学 Image classification method based on deep migration learning
CN109948648B (en) * 2019-01-31 2023-04-07 中山大学 Multi-target domain adaptive migration method and system based on meta-countermeasure learning
CN110569985A (en) * 2019-03-09 2019-12-13 华南理工大学 Online heterogeneous transfer learning method based on online and offline decision-making integrated learning
CN110348579B (en) * 2019-05-28 2023-08-29 北京理工大学 Domain self-adaptive migration feature method and system
CN110188829B (en) * 2019-05-31 2022-01-28 北京市商汤科技开发有限公司 Neural network training method, target recognition method and related products
US11113829B2 (en) * 2019-08-20 2021-09-07 GM Global Technology Operations LLC Domain adaptation for analysis of images
CN110674849B (en) * 2019-09-02 2021-06-18 昆明理工大学 Cross-domain emotion classification method based on multi-source domain integrated migration
CN110807194A (en) * 2019-10-17 2020-02-18 新华三信息安全技术有限公司 Webshell detection method and device
CN111523680B (en) * 2019-12-23 2023-05-12 中山大学 Domain adaptation method based on Fredholm learning and countermeasure learning
CN111209935B (en) * 2019-12-26 2022-03-25 武汉安视感知科技有限公司 Unsupervised target detection method and system based on self-adaptive domain transfer
CN111161239B (en) * 2019-12-27 2024-02-27 上海联影智能医疗科技有限公司 Medical image analysis method, device, storage medium and computer equipment
CN111275092B (en) * 2020-01-17 2022-05-13 电子科技大学 Image classification method based on unsupervised domain adaptation
CN111340819B (en) * 2020-02-10 2023-09-12 腾讯科技(深圳)有限公司 Image segmentation method, device and storage medium
CN111310852B (en) * 2020-03-08 2022-08-12 桂林电子科技大学 Image classification method and system
CN111444952B (en) * 2020-03-24 2024-02-20 腾讯科技(深圳)有限公司 Sample recognition model generation method, device, computer equipment and storage medium
CN111444951B (en) * 2020-03-24 2024-02-20 腾讯科技(深圳)有限公司 Sample recognition model generation method, device, computer equipment and storage medium
CN111382568B (en) * 2020-05-29 2020-09-11 腾讯科技(深圳)有限公司 Training method and device of word segmentation model, storage medium and electronic equipment
CN111723691B (en) * 2020-06-03 2023-10-17 合肥的卢深视科技有限公司 Three-dimensional face recognition method and device, electronic equipment and storage medium
CN111610768B (en) * 2020-06-10 2021-03-19 中国矿业大学 Intermittent process quality prediction method based on similarity multi-source domain transfer learning strategy
CN111950608B (en) * 2020-06-12 2021-05-04 中国科学院大学 Domain self-adaptive object detection method based on contrast loss
CN111882055B (en) * 2020-06-15 2022-08-05 电子科技大学 Method for constructing target detection self-adaptive model based on cycleGAN and pseudo label
CN111860677B (en) * 2020-07-29 2023-11-21 湖南科技大学 Rolling bearing migration learning fault diagnosis method based on partial domain countermeasure
CN112215405B (en) * 2020-09-23 2024-04-16 国网甘肃省电力公司电力科学研究院 Non-invasive resident electricity load decomposition method based on DANN domain adaptive learning
CN112766334B (en) * 2021-01-08 2022-06-21 厦门大学 Cross-domain image classification method based on pseudo label domain adaptation
CN112906857B (en) * 2021-01-21 2024-03-19 商汤国际私人有限公司 Network training method and device, electronic equipment and storage medium
CN112836795B (en) * 2021-01-27 2023-08-18 西安理工大学 Multi-source unbalanced domain self-adaption method
CN112990387B (en) * 2021-05-17 2021-07-20 腾讯科技(深圳)有限公司 Model optimization method, related device and storage medium
CN113468323B (en) * 2021-06-01 2023-07-18 成都数之联科技股份有限公司 Dispute focus category and similarity judging method, system and device and recommending method
CN113486827B (en) * 2021-07-13 2023-12-08 上海中科辰新卫星技术有限公司 Multi-source remote sensing image migration learning method based on domain countermeasure and self supervision
CN113762466B (en) * 2021-08-02 2023-06-20 国网河南省电力公司信息通信公司 Electric power internet of things flow classification method and device
CN114841137A (en) * 2022-04-18 2022-08-02 北京百度网讯科技有限公司 Model acquisition method and device, electronic equipment and storage medium
CN114998602B (en) * 2022-08-08 2022-12-30 中国科学技术大学 Domain adaptive learning method and system based on low confidence sample contrast loss
CN116580255B (en) * 2023-07-13 2023-09-26 华南师范大学 Multi-source domain and multi-target domain self-adaption method and device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103649294A (en) * 2011-04-29 2014-03-19 贝克顿·迪金森公司 Multi-way sorter system and method
CN106056043A (en) * 2016-05-19 2016-10-26 中国科学院自动化研究所 Animal behavior identification method and apparatus based on transfer learning
CN107103364A (en) * 2017-03-28 2017-08-29 上海大学 A kind of task based on many source domain splits transfer learning Forecasting Methodology

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103649294A (en) * 2011-04-29 2014-03-19 贝克顿·迪金森公司 Multi-way sorter system and method
CN106056043A (en) * 2016-05-19 2016-10-26 中国科学院自动化研究所 Animal behavior identification method and apparatus based on transfer learning
CN107103364A (en) * 2017-03-28 2017-08-29 上海大学 A kind of task based on many source domain splits transfer learning Forecasting Methodology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Pattern classification and clustering: A review of partially supervised learning approaches; Friedhelm Schwenker; Elsevier; 2014-12-31; pp. 4-14 *
Parallel learning: a new framework of machine learning; Li Li; Acta Automatica Sinica; 2017-01-31; pp. 1-7 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11875270B2 (en) 2020-12-08 2024-01-16 International Business Machines Corporation Adversarial semi-supervised one-shot learning

Also Published As

Publication number Publication date
CN108256561A (en) 2018-07-06

Similar Documents

Publication Publication Date Title
CN108256561B (en) Multi-source domain adaptive migration method and system based on adversarial learning
Shu et al. Transferable curriculum for weakly-supervised domain adaptation
Hao et al. An end-to-end architecture for class-incremental object detection with knowledge distillation
Sukhbaatar et al. Learning from noisy labels with deep neural networks
US10719780B2 (en) Efficient machine learning method
Grubb et al. Speedboost: Anytime prediction with uniform near-optimality
EP3767536A1 (en) Latent code for unsupervised domain adaptation
CN114492574A (en) Pseudo label loss unsupervised countermeasure domain adaptive picture classification method based on Gaussian uniform mixing model
Bochinski et al. Deep active learning for in situ plankton classification
Wang et al. Towards realistic predictors
CN113469186B (en) Cross-domain migration image segmentation method based on small number of point labels
CN111832511A (en) Unsupervised pedestrian re-identification method for enhancing sample data
CN113128478B (en) Model training method, pedestrian analysis method, device, equipment and storage medium
De Rosa et al. Online open world recognition
CN114090780B (en) Prompt learning-based rapid picture classification method
CN110705591A (en) Heterogeneous transfer learning method based on optimal subspace learning
CN113035311A (en) Medical image report automatic generation method based on multi-mode attention mechanism
CN104376308B (en) A kind of human motion recognition method based on multi-task learning
CN104680193A (en) Online target classification method and system based on fast similarity network fusion algorithm
WO2015146113A1 (en) Identification dictionary learning system, identification dictionary learning method, and recording medium
CN110991500A (en) Small sample multi-classification method based on nested integrated depth support vector machine
CN114863176A (en) Multi-source domain self-adaptive method based on target domain moving mechanism
Nikpour et al. Deep reinforcement learning in human activity recognition: A survey
Zhang et al. Long-tailed classification with gradual balanced loss and adaptive feature generation
Mund et al. Active online confidence boosting for efficient object classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Lin Jing

Inventor after: Chen Ziliang

Inventor after: Wang Keze

Inventor after: Xu Ruijia

Inventor before: Lin Jing

Inventor before: Chen Ziliang

Inventor before: Wang Keze

Inventor before: Xu Ruijia

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant