CN110135579A - Unsupervised domain adaptation method, system and medium based on adversarial learning
- Publication number: CN110135579A
- Application number: CN201910276847.5A
- Authority: CN (China)
- Prior art keywords: domain, image, network, class, probability
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/044 Recurrent networks, e.g. Hopfield networks
- G06N3/045 Combinations of networks
- G06N3/088 Non-supervised learning, e.g. competitive learning
(All within G PHYSICS; G06 COMPUTING; G06N Computing arrangements based on specific computational models; G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/04 Architecture; G06N3/08 Learning methods.)
Abstract
The present invention provides an unsupervised domain adaptation method, system and medium based on adversarial learning, comprising: a feature extraction step: for the images of the source domain and the target domain, extracting the features of the images with a feature extraction network, obtaining the image features of the source domain and the image features of the target domain; a class prediction step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image belongs to each class, obtaining the class prediction probabilities; a domain discrimination step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image feature comes from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probabilities. The present invention can extract features that are domain-invariant and strongly discriminative from the images of the source domain and the target domain, thereby achieving unsupervised domain adaptation.
Description
Technical Field
The present invention relates to methods in the field of computer vision and image processing, and in particular to an unsupervised domain adaptation method, system, and medium based on adversarial learning.
Background
Deep neural network models have received increasing attention because of their superior performance in many areas. Training a deep neural network model often requires a large amount of labeled data, yet collecting a large-scale annotated data set for each new task costs considerable time and labor. Fortunately, data for related tasks can often be found in other domains, and using these auxiliary data may reduce the dependency of the current task on annotating a new data set. However, owing to differences in data collection methods and the like, the data distributions of two different domains tend to differ. Because of this "domain shift", the performance of a model trained on one domain drops sharply when it is tested directly on the other domain. As a branch of transfer learning, domain adaptation aims precisely at learning in the presence of the "domain shift" that exists between different domains.
According to the number of labeled samples in the target domain, domain adaptation can be divided into supervised, semi-supervised and unsupervised domain adaptation. In unsupervised domain adaptation, the target domain has no labels at all, and a model that performs well on the target domain must be trained from labeled source-domain data and unlabeled target-domain data. Currently, many unsupervised domain adaptation methods attempt to align the statistical distributions of the source domain and the target domain through various mechanisms, such as maximum mean discrepancy, correlation alignment, KL divergence, and the like. Recently, some work has aligned the data distributions in an adversarial-learning manner, by extracting features that a domain discriminator cannot tell apart. Usually, two different feature extraction networks are used for the source domain and the target domain, where the feature extraction network of the source domain is trained in advance and remains fixed during learning. In addition, some domain adaptation methods based on image transformation have been developed: they convert images of the source domain into images of the target domain while retaining the labeling information of the source-domain images, and then train a classification network on the converted images for classifying images of the target domain.
Patent document CN107958286A (application number: 201711183073.9) discloses a deep transfer learning method for a domain-adaptive network, which determines the value of the loss function of the domain-adaptive network according to the distribution difference, the classification error rate and the degree of mismatch corresponding to each task-related layer, and updates the parameters of the domain-adaptive network based on that value, so that the network adapts to the target domain. However, that patent document does not use an adversarial learning method and does not consider the importance of the discriminative power of the features for domain adaptation.
Disclosure of Invention
In view of the above defects in the prior art, the present invention aims to provide an unsupervised domain adaptation method, system and medium based on adversarial learning.
The invention provides an unsupervised domain adaptation method based on adversarial learning, which comprises the following steps (a minimal code sketch of the three networks involved follows this list):
a feature extraction step: extracting the features of the images of the source domain and the target domain with a feature extraction network, obtaining the image features of the source domain and the image features of the target domain;
a class prediction step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image belongs to each class, obtaining the class prediction probabilities;
a domain discrimination step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image feature comes from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probabilities;
an adversarial learning step: designing a loss function over the obtained domain prediction probabilities, and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discriminability enhancement step: for the obtained image features of the source domain, improving the discriminability of the features with a center loss function;
a conditional probability alignment step: performing conditional probability alignment on the obtained image features of the source domain and the target domain according to the obtained class prediction probabilities.
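The following is a minimal PyTorch sketch of the three networks named in the steps above: the shared feature extraction network E, the classification network C (a single fully connected layer, i.e. the N x K matrix W), and the domain discrimination network D. The concrete layer sizes, the feature dimension and the class count are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """E: shared deep convolutional encoder mapping images from either
    domain to N-dimensional features."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Classifier(nn.Module):
    """C: one fully connected layer whose weight is the N x K matrix W;
    the softmax layer is applied inside the loss or at prediction time."""
    def __init__(self, feat_dim=256, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes, bias=False)

    def forward(self, feat):
        return self.fc(feat)  # logits W^T E(x_i)

class DomainDiscriminator(nn.Module):
    """D: fully connected layers ending in one logit; the sigmoid of the
    logit is the probability that a feature comes from the source domain."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, feat):
        return self.net(feat).squeeze(1)  # raw logit D(E(x_i))
```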
Preferably, the feature extraction step:
inputting the images of the source domain and the images of the target domain into the feature extraction network, and extracting the image features of the source domain and the image features of the target domain;
the feature extraction network is a deep convolutional neural network;
the source-domain images and the target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images carry no label information.
Preferably, the class prediction step:
predicting the probability of each class with a classification network consisting of a fully connected layer and a softmax layer, according to the obtained image features of the source domain and the target domain, to obtain the class prediction probabilities;
the class refers to the category of the object contained in the image;
the class prediction step includes:
a probability calculation step: denote the feature extraction network by E, so that E(x_i) is the feature of image x_i extracted by the feature extraction network, of dimension N; denote by C the classification network formed by the fully connected layer; with K classes in total, the parameter of the fully connected layer is an N x K matrix, denoted W, and the output of the fully connected layer is:

C(E(x_i)) = W^T E(x_i)

wherein,
E(x_i) is the feature of image x_i extracted by the feature extraction network;
C(E(x_i)) is the output of the fully connected layer of the classification network C for the input E(x_i);
the superscript T denotes transpose;
W is the N x K parameter matrix of the fully connected layer;
the softmax layer then converts the output of the fully connected layer into the probability of image x_i for each class, where the probability that image x_i belongs to class k is:

P_k(x_i) = \frac{e^{[W^T E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^T E(x_i)]_j}}

wherein,
P_k(x_i) is the probability that image x_i belongs to class k;
e is the natural constant;
[W^T E(x_i)]_k is the value of the k-th dimension of W^T E(x_i);
after calculating the probability of each class, the predicted class of image x_i, i.e. the class with the highest predicted probability, is obtained as:

\hat{y}_i = \arg\max_k P_k(x_i)

wherein,
\hat{y}_i is the class prediction label of image x_i;
a classification network learning step: for the labeled source-domain images, comparing the per-class probabilities of image x_i obtained in the probability calculation step with the corresponding label, the following classification network loss function is calculated:

\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} H(P(x_i), y_i)

wherein,
L_s is the class prediction loss function;
\min_{\theta_E, \theta_C} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network and \theta_C of the class prediction network to minimize the class prediction loss;
\theta_E are the parameters of the feature extraction network;
\theta_C are the parameters of the class prediction network;
(X_s, Y_s) is the distribution of source-domain images and labels;
P(x_i) is the vector of probabilities of image x_i for each class;
x_i is a source-domain image;
y_i is the class label, in one-hot form, i.e. the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and whose other dimensions are 0;
H is the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and the flow returns to the probability calculation step to continue.
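As a concrete illustration of the probability calculation and the classification loss above, the following hedged PyTorch sketch uses the Classifier from the earlier sketch; note that F.cross_entropy takes integer class indices rather than one-hot vectors and fuses the softmax layer with the cross-entropy H.

```python
import torch
import torch.nn.functional as F

def classification_loss(C, features, labels):
    """L_s: cross-entropy between softmax(W^T E(x_i)) and the label y_i.
    `labels` holds integer class indices, equivalent to the one-hot form."""
    logits = C(features)                    # W^T E(x_i), shape (batch, K)
    return F.cross_entropy(logits, labels)  # softmax + H in one call

def predict(C, features):
    """\hat{y}_i: the class with the highest softmax probability P_k(x_i)."""
    probs = F.softmax(C(features), dim=1)
    return probs.argmax(dim=1)
```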
Preferably, the domain discrimination step:
the feature of image x_i extracted by the feature extraction network E is E(x_i), of dimension N; D denotes the domain discrimination network composed of fully connected layers, whose output D(E(x_i)) has dimension 1 and is converted into the interval [0, 1] by the sigmoid function h; h(D(E(x_i))) is the probability that image x_i comes from the source domain, and 1 - h(D(E(x_i))) is then the probability that the image comes from the target domain, where the sigmoid function is:

h(z) = \frac{1}{1 + e^{-z}}

wherein,
h(D(E(x_i))) is the probability that image x_i comes from the source domain;
D(E(x_i)) is the output of the domain discrimination network composed of fully connected layers.
The adversarial learning step:
according to the obtained domain prediction probabilities, the feature extraction network and the domain discrimination network learn adversarially under an adversarial objective function: the domain discrimination network tries to distinguish images of the source domain from images of the target domain as well as possible, while the feature extraction network tries to extract domain-invariant features, thereby confusing the domain discrimination network into misjudgment, that is, making the domain discrimination network unable to tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to the image semantic information shared by the source domain and the target domain;
the adversarial objective function means that the feature extraction network minimizes, and the domain discrimination network maximizes, the following adversarial loss function:

\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h(D(E(x_i))) + \mathbb{E}_{x_i \sim X_t} \log(1 - h(D(E(x_i))))

wherein,
L_{adv} is the adversarial loss function;
\min_{\theta_E} \max_{\theta_D} expresses that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
X_s is the set of source-domain samples;
X_t is the set of target-domain samples;
\theta_D are the parameters of the domain discrimination network.
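Below is a sketch of the min-max objective above, realized as the usual alternating updates (a gradient reversal layer would be an equivalent alternative; the patent only states the objective). E and D are the networks from the earlier sketch; for the feature-extractor step the sketch uses the common non-saturating surrogate of training E against inverted domain labels.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(E, D, x_src, x_tgt):
    """D maximizes L_adv: h(D(E(x))) is pushed toward 1 on source features
    and toward 0 on target features. Features are detached so that only
    theta_D receives gradients in this step."""
    d_src = D(E(x_src).detach())
    d_tgt = D(E(x_tgt).detach())
    return (F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src))
          + F.binary_cross_entropy_with_logits(d_tgt, torch.zeros_like(d_tgt)))

def feature_adversarial_loss(E, D, x_src, x_tgt):
    """E minimizes L_adv; here via inverted labels, so that E is rewarded
    for features the domain discrimination network misjudges."""
    d_src = D(E(x_src))
    d_tgt = D(E(x_tgt))
    return (F.binary_cross_entropy_with_logits(d_src, torch.zeros_like(d_src))
          + F.binary_cross_entropy_with_logits(d_tgt, torch.ones_like(d_tgt)))
```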
Preferably, the feature discriminability enhancement step:
setting a class center point for each class, and adding a center loss function so that the image features of the source domain are drawn close to the center point of their corresponding class, reducing the samples scattered in the regions between classes and improving the discriminability of the extracted features;
the center loss function: by calculating the Euclidean distance between each feature and the center point of its corresponding class, the following center loss function is obtained:

\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \| E(x_i) - c_{y_i} \|_2^2

wherein,
\min_{\theta_E} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network to minimize the center loss of the source domain;
L_{cs} is the center loss function of the source domain, which involves the feature extraction network E;
c_{y_i} is the class center point of class y_i;
at the first iteration, the center point of each class is initialized with the data of the current batch; afterwards, the center points are updated as:

c_k^{t+1} = c_k^t - \gamma \Delta c_k^t

wherein,
c_k^{t+1} is the center point of the k-th class at the (t+1)-th iteration;
c_k^t is the center point of the k-th class at the t-th iteration;
\gamma is the update rate of the class center points;
k = 1, ..., K, with K the number of classes;
the update quantity \Delta c_k^t is calculated as:

\Delta c_k^t = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k) (c_k^t - E(x_i))

wherein,
B_t is the batch of data at the t-th iteration;
I(\cdot) is the indicator function: I(y_i = k) = 1 when y_i = k holds, and 0 otherwise;
N_k is the number of class-k samples in the batch.
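A sketch of the source-domain center loss and the batch-wise center update above; `centers` is assumed to be a (K, N) tensor initialized from the class means of the first batch, and gamma = 0.5 is an illustrative update rate, not a value from the patent.

```python
import torch

def source_center_loss(features, labels, centers):
    """L_cs: squared Euclidean distance between each source feature E(x_i)
    and the center c_{y_i} of its labeled class, averaged over the batch."""
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()

@torch.no_grad()
def update_centers(features, labels, centers, gamma=0.5):
    """c_k^{t+1} = c_k^t - gamma * Delta c_k^t, where Delta c_k^t averages
    (c_k^t - E(x_i)) over the N_k class-k samples of the current batch."""
    for k in labels.unique():
        class_feats = features[labels == k]             # the N_k samples
        delta = (centers[k] - class_feats).mean(dim=0)  # Delta c_k^t
        centers[k] -= gamma * delta
```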
Preferably, the conditional probability alignment step:
according to the obtained class prediction labels \hat{y}_i of the target-domain images, a center loss function of the target domain is designed and minimized so as to align the class-conditional probabilities P(X|Y) of the source-domain and target-domain images; the target-domain images are thereby drawn close to their corresponding class centers, aligning the distributions of the two domains and giving the unlabeled target-domain image features discriminative power;
the minimization of the center loss function of the target domain is expressed as:

\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \| E(x_i) - c_{\hat{y}_i} \|_2^2

wherein,
\min_{\theta_E} expresses the optimization target of the network: optimizing the parameters \theta_E of the feature extraction network to minimize the center loss of the target domain;
L_{ct} is the center loss function of the target domain, obtained by calculating the Euclidean distance between each target-domain sample feature and its corresponding class center point;
c_{\hat{y}_i} is the class center point of class \hat{y}_i;
\Phi(X_t) is the subset of the target domain whose samples satisfy the following condition:

\Phi(X_t) = \{ x_i \in X_t : \max(p(x_i)) > T \}

wherein,
p(x_i) is a K-dimensional vector whose k-th dimension is the probability that sample x_i belongs to class k;
T is a threshold: a predicted label is trusted only when its probability is greater than this threshold.
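A sketch of the conditional probability alignment step above: target images are pseudo-labeled with the class prediction, filtered by the confidence threshold T (the value 0.9 is an illustrative assumption; the patent leaves T unspecified), and pulled toward the centers of their predicted classes.

```python
import torch
import torch.nn.functional as F

def target_center_loss(E, C, centers, x_tgt, T=0.9):
    """L_ct over Phi(X_t): keep only target samples whose highest predicted
    probability exceeds T, then penalize their squared distance to the
    center of the predicted class \hat{y}_i."""
    feats = E(x_tgt)
    probs = F.softmax(C(feats), dim=1)   # p(x_i), K-dimensional
    conf, pseudo = probs.max(dim=1)      # max(p(x_i)) and \hat{y}_i
    keep = conf > T                      # membership in Phi(X_t)
    if not keep.any():
        return feats.new_zeros(())       # no trusted label in this batch
    return ((feats[keep] - centers[pseudo[keep]]) ** 2).sum(dim=1).mean()
```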
The invention provides an unsupervised domain adaptation system based on adversarial learning, which comprises the following components:
a feature extraction module: extracts the features of the images of the source domain and the target domain with a feature extraction network, obtaining the image features of the source domain and the image features of the target domain;
a class prediction module: predicts, from the obtained image features of the source domain and the target domain, the probability that an image belongs to each class, obtaining the class prediction probabilities;
a domain discrimination module: predicts, from the obtained image features of the source domain and the target domain, the probability that an image feature comes from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probabilities;
an adversarial learning module: designs a loss function over the obtained domain prediction probabilities, and makes the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discriminability enhancement module: for the obtained image features of the source domain, improves the discriminability of the features with a center loss function;
a conditional probability alignment module: performs conditional probability alignment on the obtained image features of the source domain and the target domain according to the obtained class prediction probabilities.
Preferably, the feature extraction module:
inputs the images of the source domain and the images of the target domain into the feature extraction network, and extracts the image features of the source domain and the image features of the target domain;
the feature extraction network is a deep convolutional neural network;
the source-domain images and the target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images carry no label information;
the class prediction module:
predicts the probability of each class with a classification network consisting of a fully connected layer and a softmax layer, according to the obtained image features of the source domain and the target domain, to obtain the class prediction probabilities;
the class refers to the category of the object contained in the image;
the class prediction module includes:
a probability calculation module: denote the feature extraction network by E, so that E(x_i) is the feature of image x_i extracted by the feature extraction network, of dimension N; denote by C the classification network formed by the fully connected layer; with K classes in total, the parameter of the fully connected layer is an N x K matrix, denoted W, and the output of the fully connected layer is:

C(E(x_i)) = W^T E(x_i)

wherein,
E(x_i) is the feature of image x_i extracted by the feature extraction network;
C(E(x_i)) is the output of the fully connected layer of the classification network C for the input E(x_i);
the superscript T denotes transpose;
W is the N x K parameter matrix of the fully connected layer;
the softmax layer then converts the output of the fully connected layer into the probability of image x_i for each class, where the probability that image x_i belongs to class k is:

P_k(x_i) = \frac{e^{[W^T E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^T E(x_i)]_j}}

wherein,
P_k(x_i) is the probability that image x_i belongs to class k;
e is the natural constant;
[W^T E(x_i)]_k is the value of the k-th dimension of W^T E(x_i);
after calculating the probability of each class, the predicted class of image x_i, i.e. the class with the highest predicted probability, is obtained as:

\hat{y}_i = \arg\max_k P_k(x_i)

wherein,
\hat{y}_i is the class prediction label of image x_i;
the classification network learning module: for the labeled source-domain images, comparing the per-class probabilities of image x_i obtained by the probability calculation module with the corresponding label, the following classification network loss function is calculated:

\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} H(P(x_i), y_i)

wherein,
L_s is the class prediction loss function;
\min_{\theta_E, \theta_C} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network and \theta_C of the class prediction network to minimize the class prediction loss;
\theta_E are the parameters of the feature extraction network;
\theta_C are the parameters of the class prediction network;
(X_s, Y_s) is the distribution of source-domain images and labels;
P(x_i) is the vector of probabilities of image x_i for each class;
x_i is a source-domain image;
y_i is the class label, in one-hot form, i.e. the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and whose other dimensions are 0;
H is the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and control returns to the probability calculation module to continue.
Preferably, the domain discrimination module:
the feature of image x_i extracted by the feature extraction network E is E(x_i), of dimension N; D denotes the domain discrimination network composed of fully connected layers, whose output D(E(x_i)) has dimension 1 and is converted into the interval [0, 1] by the sigmoid function h; h(D(E(x_i))) is the probability that image x_i comes from the source domain, and 1 - h(D(E(x_i))) is then the probability that the image comes from the target domain, where the sigmoid function is:

h(z) = \frac{1}{1 + e^{-z}}

wherein,
h(D(E(x_i))) is the probability that image x_i comes from the source domain;
D(E(x_i)) is the output of the domain discrimination network composed of fully connected layers;
the adversarial learning module:
according to the obtained domain prediction probabilities, the feature extraction network and the domain discrimination network learn adversarially under an adversarial objective function: the domain discrimination network tries to distinguish images of the source domain from images of the target domain as well as possible, while the feature extraction network tries to extract domain-invariant features, thereby confusing the domain discrimination network into misjudgment, that is, making the domain discrimination network unable to tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to the image semantic information shared by the source domain and the target domain;
the adversarial objective function means that the feature extraction network minimizes, and the domain discrimination network maximizes, the following adversarial loss function:

\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h(D(E(x_i))) + \mathbb{E}_{x_i \sim X_t} \log(1 - h(D(E(x_i))))

wherein,
L_{adv} is the adversarial loss function;
\min_{\theta_E} \max_{\theta_D} expresses that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
X_s is the set of source-domain samples;
X_t is the set of target-domain samples;
\theta_D are the parameters of the domain discrimination network;
the feature discrimination force improving module:
setting a class center point for each class, and adding a center loss function to enable the image characteristics of the source field to be close to the center point of the corresponding class, so that samples scattered in inter-class areas are reduced, and the discrimination of the extracted characteristics is improved;
the center loss function: by calculating the euclidean distance of each feature from the center point of the corresponding class, the following center loss function can be obtained:
wherein,
a central loss function representing a minimized source domain;
Lcsa central loss function representing the source domain, which is associated with the feature extraction network E;
the training targets representing the network are: optimizing a parameter θ of a feature extraction networkEMinimizing the central loss of the source domain;
is class yiA class center point of (1);
during the first iteration, the center points of each category are initialized by using the data of the current batch, and then the center points are updated in the following way:
wherein,
is the center point of the kth class at the t +1 th iteration;
representing the center point of the kth category at the time of the t-th iteration;
γ is the update rate of the class center point;
k is the number of categories;
representing a center pointThe calculation formula of (2) is as follows:
wherein,
Btis the batch data at the t-th iteration;
i (.) represents the indicator function when yiWhen k is true, I (y)iK) 1, otherwise 0;
Nkis the number of class k samples in the batch;
the conditional probability alignment module:
predicting labels according to the obtained categories of the target domain imagesDesigning a central loss function of a minimized target field to align the class conditional probabilities P (X | Y) of the images of the source field and the target field, so that the images of the target field are close to the corresponding class centers, thereby aligning the distribution of the two fields and enabling the image characteristics of the target field without annotation to have discriminative power;
the minimization of the central loss function of the target domain is expressed as follows:
wherein,
a central loss function representing a minimization target area;
Lctrepresenting a center loss function of the target field obtained by calculating Euclidean distances between the sample characteristics of the target field and the corresponding class center points;
the optimization objectives for the representation network are: optimizing a parameter θ of a feature extraction networkEMinimizing the central loss of the target area;
representing categoriesA class center point of (1);
Φ(Xt) Is a subset of the target domain, where the samples satisfy the following condition:
Φ(Xt)={xi∈Xtand max(p(xi))>T}
wherein,
p(xi) Represents a K-dimensional vector whose K-th dimension represents a sample xiProbability of being class k;
t represents a threshold value, and a predictive tag is only trusted if its probability is greater than this threshold value.
According to the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any one of the above unsupervised domain adaptation methods based on adversarial learning.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention can extract features that are domain-invariant and strongly discriminative from the images of the source domain and the target domain, thereby achieving unsupervised domain adaptation.
2. By sharing the same feature extraction network between the two domains and making the domain discrimination network unable to judge the origin of the features, the invention forces the feature extraction network to extract the features shared between the two domains, namely the semantic information of the images, while ignoring the information specific to each domain. Since the extracted features are domain-invariant, the class prediction network trained on source-domain images can also be applied to the target domain, thereby realizing domain adaptation. By introducing the center loss function, the features of same-class images in the source domain become more clustered and compact, which improves the discriminability of the features. Beyond the marginal probability distribution, the invention also considers the conditional probability distributions of the two domains; by aligning them, the features of the target domain also retain good discriminative power, which improves the class prediction accuracy of the model on the target domain. In addition, because the same feature extraction network is shared, the image features of the two domains are learned jointly, and at test time it is unnecessary to know whether an image comes from the source domain or the target domain.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a schematic flow chart of the method provided by preferred embodiment 2 of the present invention.
Fig. 2 is a schematic diagram of the principle of the system provided by preferred embodiment 2 of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific embodiments. The following embodiments will assist those skilled in the art in further understanding the invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention; all of these fall within the scope of protection of the present invention.
The invention provides an unsupervised domain adaptation method based on adversarial learning, which comprises the following steps:
a feature extraction step: extracting the features of the images of the source domain and the target domain with a feature extraction network, obtaining the image features of the source domain and the image features of the target domain;
a class prediction step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image belongs to each class, obtaining the class prediction probabilities;
a domain discrimination step: predicting, from the obtained image features of the source domain and the target domain, the probability that an image feature comes from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probabilities;
an adversarial learning step: designing a loss function over the obtained domain prediction probabilities, and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discriminability enhancement step: for the obtained image features of the source domain, improving the discriminability of the features with a center loss function;
a conditional probability alignment step: performing conditional probability alignment on the obtained image features of the source domain and the target domain according to the obtained class prediction probabilities.
Specifically, the feature extraction step:
inputting the images of the source domain and the images of the target domain into the feature extraction network, and extracting the image features of the source domain and the image features of the target domain;
the feature extraction network is a deep convolutional neural network;
the source-domain images and the target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images carry no label information.
Specifically, the class prediction step:
predicting the probability of each class with a classification network consisting of a fully connected layer and a softmax layer, according to the obtained image features of the source domain and the target domain, to obtain the class prediction probabilities;
the class refers to the category of the object contained in the image;
the class prediction step includes:
a probability calculation step: denote the feature extraction network by E, so that E(x_i) is the feature of image x_i extracted by the feature extraction network, of dimension N; denote by C the classification network formed by the fully connected layer; with K classes in total, the parameter of the fully connected layer is an N x K matrix, denoted W, and the output of the fully connected layer is:

C(E(x_i)) = W^T E(x_i)

wherein,
E(x_i) is the feature of image x_i extracted by the feature extraction network;
C(E(x_i)) is the output of the fully connected layer of the classification network C for the input E(x_i);
the superscript T denotes transpose;
W is the N x K parameter matrix of the fully connected layer;
the softmax layer then converts the output of the fully connected layer into the probability of image x_i for each class, where the probability that image x_i belongs to class k is:

P_k(x_i) = \frac{e^{[W^T E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^T E(x_i)]_j}}

wherein,
P_k(x_i) is the probability that image x_i belongs to class k;
e is the natural constant;
[W^T E(x_i)]_k is the value of the k-th dimension of W^T E(x_i);
after calculating the probability of each class, the predicted class of image x_i, i.e. the class with the highest predicted probability, is obtained as:

\hat{y}_i = \arg\max_k P_k(x_i)

wherein,
\hat{y}_i is the class prediction label of image x_i;
a classification network learning step: for the labeled source-domain images, comparing the per-class probabilities of image x_i obtained in the probability calculation step with the corresponding label, the following classification network loss function is calculated:

\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} H(P(x_i), y_i)

wherein,
L_s is the class prediction loss function;
\min_{\theta_E, \theta_C} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network and \theta_C of the class prediction network to minimize the class prediction loss;
\theta_E are the parameters of the feature extraction network;
\theta_C are the parameters of the class prediction network;
(X_s, Y_s) is the distribution of source-domain images and labels;
P(x_i) is the vector of probabilities of image x_i for each class;
x_i is a source-domain image;
y_i is the class label, in one-hot form, i.e. the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and whose other dimensions are 0;
H is the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and the flow returns to the probability calculation step to continue.
Specifically, the domain discrimination step:
the feature of image x_i extracted by the feature extraction network E is E(x_i), of dimension N; D denotes the domain discrimination network composed of fully connected layers, whose output D(E(x_i)) has dimension 1 and is converted into the interval [0, 1] by the sigmoid function h; h(D(E(x_i))) is the probability that image x_i comes from the source domain, and 1 - h(D(E(x_i))) is then the probability that the image comes from the target domain, where the sigmoid function is:

h(z) = \frac{1}{1 + e^{-z}}

wherein,
h(D(E(x_i))) is the probability that image x_i comes from the source domain;
D(E(x_i)) is the output of the domain discrimination network composed of fully connected layers.
The adversarial learning step:
according to the obtained domain prediction probabilities, the feature extraction network and the domain discrimination network learn adversarially under an adversarial objective function: the domain discrimination network tries to distinguish images of the source domain from images of the target domain as well as possible, while the feature extraction network tries to extract domain-invariant features, thereby confusing the domain discrimination network into misjudgment, that is, making the domain discrimination network unable to tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to the image semantic information shared by the source domain and the target domain;
the adversarial objective function means that the feature extraction network minimizes, and the domain discrimination network maximizes, the following adversarial loss function:

\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h(D(E(x_i))) + \mathbb{E}_{x_i \sim X_t} \log(1 - h(D(E(x_i))))

wherein,
L_{adv} is the adversarial loss function;
\min_{\theta_E} \max_{\theta_D} expresses that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
X_s is the set of source-domain samples;
X_t is the set of target-domain samples;
\theta_D are the parameters of the domain discrimination network.
Specifically, the feature discriminability enhancement step:
setting a class center point for each class, and adding a center loss function so that the image features of the source domain are drawn close to the center point of their corresponding class, reducing the samples scattered in the regions between classes and improving the discriminability of the extracted features;
the center loss function: by calculating the Euclidean distance between each feature and the center point of its corresponding class, the following center loss function is obtained:

\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \| E(x_i) - c_{y_i} \|_2^2

wherein,
\min_{\theta_E} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network to minimize the center loss of the source domain;
L_{cs} is the center loss function of the source domain, which involves the feature extraction network E;
c_{y_i} is the class center point of class y_i;
at the first iteration, the center point of each class is initialized with the data of the current batch; afterwards, the center points are updated as:

c_k^{t+1} = c_k^t - \gamma \Delta c_k^t

wherein,
c_k^{t+1} is the center point of the k-th class at the (t+1)-th iteration;
c_k^t is the center point of the k-th class at the t-th iteration;
\gamma is the update rate of the class center points;
k = 1, ..., K, with K the number of classes;
the update quantity \Delta c_k^t is calculated as:

\Delta c_k^t = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k) (c_k^t - E(x_i))

wherein,
B_t is the batch of data at the t-th iteration;
I(\cdot) is the indicator function: I(y_i = k) = 1 when y_i = k holds, and 0 otherwise;
N_k is the number of class-k samples in the batch.
Specifically, the conditional probability alignment step:
according to the obtained class prediction labels \hat{y}_i of the target-domain images, a center loss function of the target domain is designed and minimized so as to align the class-conditional probabilities P(X|Y) of the source-domain and target-domain images; the target-domain images are thereby drawn close to their corresponding class centers, aligning the distributions of the two domains and giving the unlabeled target-domain image features discriminative power;
the minimization of the center loss function of the target domain is expressed as:

\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \| E(x_i) - c_{\hat{y}_i} \|_2^2

wherein,
\min_{\theta_E} expresses the optimization target of the network: optimizing the parameters \theta_E of the feature extraction network to minimize the center loss of the target domain;
L_{ct} is the center loss function of the target domain, obtained by calculating the Euclidean distance between each target-domain sample feature and its corresponding class center point;
c_{\hat{y}_i} is the class center point of class \hat{y}_i;
\Phi(X_t) is the subset of the target domain whose samples satisfy the following condition:

\Phi(X_t) = \{ x_i \in X_t : \max(p(x_i)) > T \}

wherein,
p(x_i) is a K-dimensional vector whose k-th dimension is the probability that sample x_i belongs to class k;
T is a threshold: a predicted label is trusted only when its probability is greater than this threshold.
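Tying the steps of this detailed description together, the following sketch shows one training iteration combining the classification, adversarial and center losses, reusing the helper functions from the earlier sketches. The loss weights lam_adv and lam_c, the optimizer and the learning rate are hypothetical choices; the patent does not specify how the individual losses are weighted or scheduled.

```python
import torch
import torch.nn.functional as F

# E, C, D and the (K, N) tensor `centers` are assumed to exist as in the
# earlier sketches; the SGD settings are illustrative.
opt_d = torch.optim.SGD(D.parameters(), lr=1e-3)
opt_ec = torch.optim.SGD(list(E.parameters()) + list(C.parameters()), lr=1e-3)

def train_step(x_src, y_src, x_tgt, centers, lam_adv=0.1, lam_c=0.01):
    # 1) Domain discrimination network update (the max part of the min-max).
    opt_d.zero_grad()
    discriminator_loss(E, D, x_src, x_tgt).backward()
    opt_d.step()

    # 2) Feature extraction and classification network update:
    #    L_s + lam_adv * L_adv + lam_c * (L_cs + L_ct).
    opt_ec.zero_grad()
    f_src = E(x_src)
    loss = (F.cross_entropy(C(f_src), y_src)
            + lam_adv * feature_adversarial_loss(E, D, x_src, x_tgt)
            + lam_c * (source_center_loss(f_src, y_src, centers)
                       + target_center_loss(E, C, centers, x_tgt)))
    loss.backward()
    opt_ec.step()

    # 3) Move the class center points using the current source batch.
    update_centers(f_src.detach(), y_src, centers)
```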
The unsupervised domain adaptation system based on adversarial learning provided by the invention can be realized through the step flow of the unsupervised domain adaptation method based on adversarial learning. Those skilled in the art can understand the unsupervised domain adaptation method based on adversarial learning as a preferred example of the unsupervised domain adaptation system based on adversarial learning.
The invention provides an unsupervised domain adaptation system based on adversarial learning, which comprises the following components:
a feature extraction module: extracts the features of the images of the source domain and the target domain with a feature extraction network, obtaining the image features of the source domain and the image features of the target domain;
a class prediction module: predicts, from the obtained image features of the source domain and the target domain, the probability that an image belongs to each class, obtaining the class prediction probabilities;
a domain discrimination module: predicts, from the obtained image features of the source domain and the target domain, the probability that an image feature comes from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probabilities;
an adversarial learning module: designs a loss function over the obtained domain prediction probabilities, and makes the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discriminability enhancement module: for the obtained image features of the source domain, improves the discriminability of the features with a center loss function;
a conditional probability alignment module: performs conditional probability alignment on the obtained image features of the source domain and the target domain according to the obtained class prediction probabilities.
Specifically, the feature extraction module:
inputs the images of the source domain and the images of the target domain into the feature extraction network, and extracts the image features of the source domain and the image features of the target domain;
the feature extraction network is a deep convolutional neural network;
the source-domain images and the target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images carry no label information;
the class prediction module:
predicts the probability of each class with a classification network consisting of a fully connected layer and a softmax layer, according to the obtained image features of the source domain and the target domain, to obtain the class prediction probabilities;
the class refers to the category of the object contained in the image;
the class prediction module includes:
a probability calculation module: denote the feature extraction network by E, so that E(x_i) is the feature of image x_i extracted by the feature extraction network, of dimension N; denote by C the classification network formed by the fully connected layer; with K classes in total, the parameter of the fully connected layer is an N x K matrix, denoted W, and the output of the fully connected layer is:

C(E(x_i)) = W^T E(x_i)

wherein,
E(x_i) is the feature of image x_i extracted by the feature extraction network;
C(E(x_i)) is the output of the fully connected layer of the classification network C for the input E(x_i);
the superscript T denotes transpose;
W is the N x K parameter matrix of the fully connected layer;
the softmax layer then converts the output of the fully connected layer into the probability of image x_i for each class, where the probability that image x_i belongs to class k is:

P_k(x_i) = \frac{e^{[W^T E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^T E(x_i)]_j}}

wherein,
P_k(x_i) is the probability that image x_i belongs to class k;
e is the natural constant;
[W^T E(x_i)]_k is the value of the k-th dimension of W^T E(x_i);
after calculating the probability of each class, the predicted class of image x_i, i.e. the class with the highest predicted probability, is obtained as:

\hat{y}_i = \arg\max_k P_k(x_i)

wherein,
\hat{y}_i is the class prediction label of image x_i;
the classification network learning module: for the labeled source-domain images, comparing the per-class probabilities of image x_i obtained by the probability calculation module with the corresponding label, the following classification network loss function is calculated:

\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} H(P(x_i), y_i)

wherein,
L_s is the class prediction loss function;
\min_{\theta_E, \theta_C} expresses the training target of the network: optimizing the parameters \theta_E of the feature extraction network and \theta_C of the class prediction network to minimize the class prediction loss;
\theta_E are the parameters of the feature extraction network;
\theta_C are the parameters of the class prediction network;
(X_s, Y_s) is the distribution of source-domain images and labels;
P(x_i) is the vector of probabilities of image x_i for each class;
x_i is a source-domain image;
y_i is the class label, in one-hot form, i.e. the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and whose other dimensions are 0;
H is the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and control returns to the probability calculation module to continue.
Specifically, the domain discrimination module:
the feature of image x_i extracted by the feature extraction network E is E(x_i), of dimension N; D denotes the domain discrimination network composed of fully connected layers, whose output D(E(x_i)) has dimension 1 and is converted into the interval [0, 1] by the sigmoid function h; h(D(E(x_i))) is the probability that image x_i comes from the source domain, and 1 - h(D(E(x_i))) is then the probability that the image comes from the target domain, where the sigmoid function is:

h(z) = \frac{1}{1 + e^{-z}}

wherein,
h(D(E(x_i))) is the probability that image x_i comes from the source domain;
D(E(x_i)) is the output of the domain discrimination network composed of fully connected layers;
the adversarial learning module:
according to the obtained domain prediction probabilities, the feature extraction network and the domain discrimination network learn adversarially under an adversarial objective function: the domain discrimination network tries to distinguish images of the source domain from images of the target domain as well as possible, while the feature extraction network tries to extract domain-invariant features, thereby confusing the domain discrimination network into misjudgment, that is, making the domain discrimination network unable to tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to the image semantic information shared by the source domain and the target domain;
the adversarial objective function means that the feature extraction network minimizes, and the domain discrimination network maximizes, the following adversarial loss function:

\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h(D(E(x_i))) + \mathbb{E}_{x_i \sim X_t} \log(1 - h(D(E(x_i))))

wherein,
L_{adv} is the adversarial loss function;
\min_{\theta_E} \max_{\theta_D} expresses that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
X_s is the set of source-domain samples;
X_t is the set of target-domain samples;
\theta_D are the parameters of the domain discrimination network;
the feature discrimination force improving module:
setting a class center point for each class, and adding a center loss function to enable the image characteristics of the source field to be close to the center point of the corresponding class, so that samples scattered in inter-class areas are reduced, and the discrimination of the extracted characteristics is improved;
the center loss function: by calculating the euclidean distance of each feature from the center point of the corresponding class, the following center loss function can be obtained:
wherein,
a central loss function representing a minimized source domain;
Lcsa central loss function representing the source domain, which is associated with the feature extraction network E;
the training targets representing the network are: optimizing a parameter θ of a feature extraction networkEMinimizing the central loss of the source domain;
is class yiA class center point of (1);
during the first iteration, the center points of each category are initialized by using the data of the current batch, and then the center points are updated in the following way:
wherein,
is the center point of the kth class at the t +1 th iteration;
representing the center point of the kth category at the time of the t-th iteration;
γ is the update rate of the class center point;
k is the number of categories;
representing a center pointThe calculation formula of (2) is as follows:
wherein,
Btis the batch data at the t-th iteration;
i (.) represents the indicator function when yiWhen k is true, I (y)iK) 1, otherwise 0;
Nkis the number of class k samples in the batch;
the conditional probability alignment module:
predicting labels according to the obtained categories of the target domain imagesDesigning a central loss function of a minimized target field to align the class conditional probabilities P (X | Y) of the images of the source field and the target field, so that the images of the target field are close to the corresponding class centers, thereby aligning the distribution of the two fields and enabling the image characteristics of the target field without annotation to have discriminative power;
the minimization of the central loss function of the target domain is expressed as follows:
wherein,
a central loss function representing a minimization target area;
Lctrepresenting a center loss function of the target field obtained by calculating Euclidean distances between the sample characteristics of the target field and the corresponding class center points;
the optimization objectives for the representation network are: optimizing a parameter θ of a feature extraction networkEMinimizing the central loss of the target area;
representing categoriesA class center point of (1);
Φ(Xt) Is a subset of the target domain, where the samples satisfy the following condition:
Φ(Xt)={xi∈Xtand max(p(xi))>T}
wherein,
p(xi) Represents a K-dimensional vector whose K-th dimension represents a sample xiProbability of being class k;
t represents a threshold value, and a predictive tag is only trusted if its probability is greater than this threshold value.
According to the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any one of the above unsupervised domain adaptation methods based on adversarial learning.
The present invention will be described more specifically below with reference to preferred examples.
Preferred example 1:
an unsupervised domain adaptation method based on domain-invariant adversarial learning comprises the following steps:
a feature extraction step: extracting features of the source-domain and target-domain images with a deep convolutional neural network to obtain the source-domain image features and the target-domain image features;
a category prediction step: predicting, from the obtained source-domain and target-domain image features, the probability of each category with a classification network composed of a fully connected layer and a softmax layer; a category here refers to the kind of object contained in the image, such as a cat, a dog, or a car;
a domain discrimination step: predicting, for the source-domain and target-domain image features obtained in the feature extraction step, the probability that the features come from the source domain or the target domain through a domain discrimination network;
an adversarial learning step: designing a loss function on the domain prediction probability obtained in the domain discrimination step and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features, i.e., image semantic information shared by the source and target domains;
a feature discrimination boosting step: improving the discriminative power of the source-domain image features obtained in the feature extraction step with a central loss function;
a conditional probability alignment step: performing conditional probability alignment on the source-domain and target-domain image features obtained in the feature extraction step.
The feature extraction step, wherein:
using a deep convolutional neural network model, the source-domain and target-domain images are input into a shared feature extraction network, which extracts the source-domain and target-domain image features, i.e., the semantic information shared by the two domains, while ignoring domain-related information;
the source-domain and target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images have no label information.
The class prediction step, wherein: class prediction is performed on the source-domain and target-domain images with a classification network composed of a fully connected layer and a softmax layer.
The category prediction step specifically comprises the following steps:
we denote the feature extraction network by E; $E(x_i)$ represents the feature of image $x_i$ extracted by the feature extraction network, of dimensionality N; C represents the classification network formed by the fully connected layer. Assuming a total of K classes, the parameters of the fully connected layer form an N×K matrix, denoted W, and the output of the fully connected layer is:

$$C(E(x_i)) = W^{\top} E(x_i)$$

The output of the fully connected layer is converted by the softmax layer into the probability of image $x_i$ belonging to each class, where the probability that image $x_i$ belongs to class k is:

$$P_k(x_i) = \frac{e^{[W^{\top} E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^{\top} E(x_i)]_j}}$$

wherein, $[W^{\top} E(x_i)]_k$ is the value of the k-th dimension of $W^{\top} E(x_i)$. After computing the probability of the image belonging to each class, we obtain the prediction label of image $x_i$, i.e., the class with the highest prediction probability:

$$\hat{y}_i = \arg\max_{k} P_k(x_i)$$
for the labelled source-domain images, comparing the probabilities obtained in the class prediction step with the corresponding labels, the following loss function can be calculated (this loss function drives the learning of the classification network, so that the model learns to predict the class of an image):

$$\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)}\, H\big(P(x_i),\, y_i\big)$$

wherein, $L_s$ represents the class prediction loss function;
$\min_{\theta_E, \theta_C}$ represents the training target of the network: optimize the parameter $\theta_E$ of the feature extraction network and the parameter $\theta_C$ of the class prediction network to minimize the class prediction loss;
$\theta_E$ represents the parameters of the feature extraction network;
$\theta_C$ represents the parameters of the class prediction network;
$(X_s, Y_s)$ represents the distribution of source-domain images and labels;
$x_i$ represents an image of the source domain;
$y_i$ represents the class label, in one-hot form, i.e., the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and all other dimensions are 0;
H denotes the cross-entropy function.
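A minimal PyTorch-style sketch of this class prediction step and its loss follows; the dimensions `N`, `K` and the helper name `class_prediction_loss` are illustrative assumptions (the patent does not fix them):

```python
import torch.nn as nn
import torch.nn.functional as F

N, K = 256, 10                          # feature dim and class count (assumed)
C = nn.Linear(N, K, bias=False)         # fully connected layer; weight acts as W^T

def class_prediction_loss(feats_s, labels_s):
    """Cross-entropy loss L_s on labelled source-domain features E(x_i)."""
    logits = C(feats_s)                 # W^T E(x_i), shape (B, K)
    probs = F.softmax(logits, dim=1)    # P(x_i): per-class probabilities
    pred = probs.argmax(dim=1)          # predicted class \hat{y}_i
    # F.cross_entropy applies log-softmax internally, i.e. H(P(x_i), y_i)
    return F.cross_entropy(logits, labels_s), pred
```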
The domain discrimination step, wherein: the domain discrimination network is composed of fully connected layers (here, three fully connected layers whose learned parameters are their own, and which are therefore distinct from the classification network above); it outputs the probability that a feature belongs to the source domain or the target domain.
The domain discrimination step is specifically as follows:
suppose the feature of image $x_i$ extracted by the feature extraction network E is $E(x_i)$, of dimensionality N; D represents the domain discrimination network composed of 3 fully connected layers, whose output is $D(E(x_i))$. Since there are 2 domains in total (the source domain and the target domain), we let $D(E(x_i))$ have dimension 1 and convert it to the interval [0, 1] with the sigmoid function h; $h(D(E(x_i)))$ then represents the probability that image $x_i$ comes from the source domain, and $1 - h(D(E(x_i)))$ the probability that the image comes from the target domain. The sigmoid function can be expressed as:

$$h(z) = \frac{1}{1 + e^{-z}}$$
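A minimal sketch of such a 3-layer fully connected domain discrimination network in PyTorch; the hidden width and the class name are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DomainDiscriminator(nn.Module):
    """3-layer fully connected domain discrimination network D.

    Outputs a single logit per feature; the sigmoid h maps it to the
    probability that the feature comes from the source domain.
    """
    def __init__(self, n_features=256, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # dimension-1 output D(E(x_i))
        )

    def forward(self, feats):
        # h(D(E(x_i))) in [0, 1]
        return torch.sigmoid(self.net(feats)).squeeze(1)
```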
The adversarial learning step, wherein: the feature extraction network and the domain discrimination network play a min-max game (the feature extraction network minimizes the adversarial loss function while the domain discrimination network maximizes it): the domain discrimination network tries to distinguish source-domain images from target-domain images, while the feature extraction network tries to confuse the domain discrimination network by extracting domain-invariant features, so that the domain discrimination network cannot tell which domain a feature comes from.
The adversarial learning step is specifically as follows:
the feature extraction network and the domain discrimination network play a min-max game, with the following objective function:

$$\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h\big(D(E(x_i))\big) + \mathbb{E}_{x_i \sim X_t} \log\Big(1 - h\big(D(E(x_i))\big)\Big)$$

wherein,
$L_{adv}$ represents the adversarial loss function;
$\min_{\theta_E} \max_{\theta_D}$ indicates that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
$X_s$ represents the set of source-domain samples;
$X_t$ represents the set of target-domain samples;
$\theta_E$ is the parameter of the feature extraction network;
$\theta_D$ is the parameter of the domain discrimination network;
$h(D(E(x_i)))$ is the probability that image $x_i$ comes from the source domain.
Through such adversarial learning, the domain discrimination network tries to make the predicted source-domain probability as large as possible for source-domain images and as small as possible for target-domain images; the feature extraction network acts in the opposite direction, aiming to extract domain-invariant features so as to confuse the domain discrimination network into misjudging.
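One common way to realize this min-max game is to alternate updates of the two networks, as in the hedged PyTorch-style sketch below; the function name, the optimizers, and the small `eps` added for numerical stability are illustrative assumptions (a gradient-reversal layer would be an equivalent realization):

```python
import torch

def adversarial_step(E, D, opt_E, opt_D, x_src, x_tgt, eps=1e-8):
    """One alternating min-max update; E and D are the networks above."""
    # --- max over theta_D: the discriminator separates the two domains ---
    p_src = D(E(x_src).detach())        # h(D(E(x_i))) for source images
    p_tgt = D(E(x_tgt).detach())        # ... and for target images
    loss_D = -(torch.log(p_src + eps).mean()
               + torch.log(1.0 - p_tgt + eps).mean())   # ascend L_adv
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- min over theta_E: the features confuse the discriminator ---
    p_src = D(E(x_src))
    p_tgt = D(E(x_tgt))
    loss_E = (torch.log(p_src + eps).mean()
              + torch.log(1.0 - p_tgt + eps).mean())    # descend L_adv
    opt_E.zero_grad(); loss_E.backward(); opt_E.step()
    return loss_D.item(), loss_E.item()
```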
The feature discrimination boosting step, wherein: assuming each class has a class center point, adding a center loss function draws the source-domain image features close to the center point of their corresponding class, thereby reducing the number of samples scattered in inter-class regions and improving the discriminative power of the extracted features.
The feature discrimination boosting step is specifically as follows:
assuming each class has a class center point, after obtaining the features of the source-domain samples, the following center loss function can be obtained by computing the Euclidean distance between each feature and the corresponding class center point:

$$\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \left\| E(x_i) - c_{y_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{cs}$ represents the minimized central loss function of the source domain;
$L_{cs}$ represents the central loss function of the source domain, which is associated with the feature extraction network E;
the training target of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the source domain;
$c_{y_i}$ is the class center point of class $y_i$.
Usually, the center point of each class would be calculated from the features of all images of that class in the entire dataset; since the batch-wise training of the network makes computation over the entire dataset infeasible, the class center points are instead updated continuously during training by the iterative scheme described below. During the first iteration, the center point of each class is initialized with the data of the current batch; the center points are then updated as follows:

$$c_k^{t+1} = (1 - \gamma)\, c_k^{t} + \gamma\, \tilde{c}_k^{\,t}, \qquad k = 1, \dots, K$$

wherein,
$c_k^{t+1}$ is the center point of the k-th class at the (t+1)-th iteration;
$c_k^{t}$ represents the center point of the k-th class at the t-th iteration;
$\gamma$ is the update rate of the class center points;
$K$ is the number of classes;
$\tilde{c}_k^{\,t}$ represents the center point of class k computed from the current batch; its calculation formula is:

$$\tilde{c}_k^{\,t} = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k)\, E(x_i)$$

wherein,
$B_t$ is the batch data at the t-th iteration;
$I(\cdot)$ represents the indicator function: $I(y_i = k) = 1$ when $y_i = k$ is true, and 0 otherwise;
$N_k$ is the number of class-k samples in the batch.
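As a usage illustration of the initialization-then-update scheme just described, reusing the hypothetical `update_centers` helper sketched earlier (the names `K`, `N`, `features`, `labels` are assumed to be defined as in that sketch):

```python
import torch

centers = torch.zeros(K, N)                  # one center per class, c_k
for k in range(K):
    mask = labels == k
    if mask.any():
        centers[k] = features[mask].mean(0)  # first iteration: init from batch
# subsequent iterations: EMA update with rate gamma
centers = update_centers(centers, features, labels, gamma=0.3)
```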
The conditional probability alignment step, wherein: since the target-domain images lack labels, it is difficult to directly align the conditional probabilities P(Y|X) of the two domains; we therefore approximately align them by aligning the class-conditional probabilities P(X|Y) of the two domains. Using the category prediction label $\hat{y}_i$ of each target-domain image obtained in the category prediction step, the class-conditional probabilities P(X|Y) of the source-domain and target-domain images are aligned so that the target-domain images are drawn close to their corresponding class centers, better aligning the distributions of the two domains and giving the unlabeled target-domain image features better discriminative power.
The conditional probability alignment step is specifically as follows:
besides the marginal probability distributions, the conditional probability distributions of the two domains usually also differ, i.e., P(Y|X) differs. Since the target domain carries no label information, directly aligning the conditional probability distributions of the two domains is difficult; this is therefore approximated by aligning the class-conditional distributions P(X|Y). Using the classes of the target-domain images obtained in the class prediction step, the following loss function is designed to align P(X|Y):

$$\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \left\| E(x_i) - c_{\hat{y}_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{ct}$ represents the minimized central loss function of the target domain;
$L_{ct}$ represents the central loss function of the target domain, obtained by computing the Euclidean distance between each target-domain sample feature and its corresponding class center point;
the optimization objective of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the target domain;
$\hat{y}_i$ is the prediction label of sample $x_i$;
$\Phi(X_t)$ is the subset of the target domain whose samples satisfy the following condition:

$$\Phi(X_t) = \{\, x_i \in X_t \mid \max(p(x_i)) > T \,\}$$

wherein, $p(x_i)$ is a K-dimensional vector whose k-th dimension represents the probability that sample $x_i$ belongs to class k; T is a threshold, and a predicted label is trusted only if its probability is greater than this threshold.
Preferred example 2:
As shown in fig. 1, which is a flowchart of an embodiment of the unsupervised domain adaptation method based on domain-invariant adversarial learning according to the present invention: the method processes the source-domain and target-domain images into source-domain and target-domain image features through the feature extraction step, predicts the classes of the source-domain and target-domain images using the category prediction step, and makes the extracted features domain-invariant through the domain discrimination step and the adversarial learning step, so that the class prediction network optimized on the source-domain images can also be applied to the target domain, realizing domain adaptation. Through the feature discrimination boosting step, the features of same-class images in the source domain become more aggregated and compact, improving the discriminative power of the features. In addition, the conditional probability alignment step aligns the conditional probabilities of the two domains, so that the target-domain features also keep good discriminative power, improving the category prediction accuracy on the target domain.
By sharing the same feature extraction network between the two domains and making the domain discrimination network unable to judge the source of a feature, the invention forces the feature extraction network to extract the features shared between the two domains, i.e., the semantic information of the images, while ignoring the information specific to each domain. Because the same feature extraction network is shared, the image features of the two domains are learned jointly, and at test time it is not necessary to know whether an image comes from the source domain or the target domain.
Specifically, with reference to fig. 1, the method comprises the steps of:
a feature extraction step: extracting features of the source-domain and target-domain images with a deep convolutional neural network to obtain the source-domain image features and the target-domain image features;
a category prediction step: predicting, with a classification network composed of fully connected layers, the probability that the source-domain and target-domain image features obtained in the feature extraction step belong to each category;
a domain discrimination step: predicting, for the source-domain and target-domain image features obtained in the feature extraction step, the probability that the features come from the source domain or the target domain through a domain discrimination network;
an adversarial learning step: designing a loss function on the domain prediction probability obtained in the domain discrimination step and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features, i.e., image semantic information shared by the source and target domains;
a feature discrimination boosting step: improving the discriminative power of the source-domain image features obtained in the feature extraction step with a central loss function;
a conditional probability alignment step: performing conditional probability alignment on the source-domain and target-domain image features obtained in the feature extraction step.
Corresponding to the method, the invention also provides an embodiment of an unsupervised domain adaptation system based on domain-invariant adversarial learning, comprising:
a feature extraction module: extracting features of the source-domain and target-domain images with a deep convolutional neural network to obtain the source-domain image features and the target-domain image features;
a category prediction module: predicting, with a classification network composed of fully connected layers, the probability that the source-domain and target-domain image features obtained by the feature extraction module belong to each category;
a domain discrimination module: predicting, for the source-domain and target-domain image features obtained by the feature extraction module, the probability that the features come from the source domain or the target domain through a domain discrimination network;
an adversarial learning module: designing a loss function on the domain prediction probability obtained by the domain discrimination module and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features, i.e., image semantic information shared by the source and target domains;
a feature discrimination boosting module: improving the discriminative power of the source-domain image features obtained by the feature extraction module with a central loss function;
a conditional probability alignment module: performing conditional probability alignment on the source-domain and target-domain image features obtained by the feature extraction module.
The technical features realized by each module of the unsupervised domain adaptation system based on domain-invariant adversarial learning can be the same as the technical features realized by the corresponding steps in the unsupervised domain adaptation method based on domain-invariant adversarial learning.
Specific implementations of the above steps and modules are described in detail below to facilitate understanding of the technical solutions of the present invention.
In some embodiments of the present invention, the feature extraction step, wherein: using a deep convolutional neural network model, the source-domain and target-domain images are input into a shared feature extraction network, which extracts domain-invariant features, i.e., the semantic information shared by the source and target domains, while ignoring domain-related information. The source-domain and target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images have no label information.
In some embodiments of the present invention, the category predicting step, wherein: and performing class prediction on the image of the source field and the image of the target field by utilizing a classification network formed by full connection layers.
In some embodiments of the present invention, the domain identifying step includes: the domain discrimination network is a network composed of full connection layers and can output the probability that the features belong to the source domain and the target domain.
In some embodiments of the invention, the adversarial learning step, wherein: the feature extraction network and the domain discrimination network play a min-max game: the domain discrimination network tries to distinguish source-domain images from target-domain images, while the feature extraction network tries to confuse the domain discrimination network by extracting domain-invariant features, so that the domain discrimination network cannot tell which domain a feature comes from.
In some embodiments of the present invention, the feature determination force increasing step includes: assuming that each class has a class center point, by adding a center loss function, the image features of the source field will be close to the center point of the corresponding class, thereby reducing samples scattered in the inter-class area and improving the discrimination of the extracted features.
In some embodiments of the present invention, the conditional probability aligning step includes: and aligning the class conditional probabilities P (X | Y) of the images of the source field and the target field by using the class labels obtained in the class prediction step, so that the images of the target field are close to the corresponding class center, and the image characteristics of the unmarked target field have better discrimination.
Specifically, the domain adaptation system network framework, composed of the feature extraction module, the category prediction module, the domain discrimination module, the adversarial learning module, the feature discrimination boosting module and the conditional probability alignment module, is shown in fig. 2; the whole system framework can be trained end to end.
In the system framework of the embodiment shown in fig. 2, the source-domain and target-domain images are input to the feature extraction module, which outputs the features of the source-domain and target-domain images; the feature extraction module is a down-sampling module composed of a series of convolutional layers (plus batch-norm and ReLU layers), and existing network structures such as AlexNet, VGG or ResNet can be used. The features of the source-domain images are input into the category prediction module to predict the classes of the images, yielding the following loss function:

$$\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)}\, H\big(C(E(x_i)),\, y_i\big)$$

wherein, $\theta_E$ is the parameter of the feature extraction network, $\theta_C$ is the parameter of the class prediction network, $(X_s, Y_s)$ represents the distribution of source-domain images and labels, $x_i$ represents an image of the source domain, $y_i$ is its class label, E represents the feature extraction network, C represents the class prediction network, and H represents the cross-entropy function.
In order to extract domain-invariant features, as shown in fig. 2, the extracted source-domain and target-domain image features pass through the domain discrimination module; the domain discrimination network is composed of fully connected layers and outputs the probability that a feature belongs to the source domain or the target domain. The feature extraction network and the domain discrimination network play a min-max game: the domain discrimination network tries to distinguish source-domain images from target-domain images, while the feature extraction network tries to confuse it by extracting domain-invariant features, so that the domain discrimination network cannot tell which domain a feature comes from. The specific objective function is as follows:

$$\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h\big(D(E(x_i))\big) + \mathbb{E}_{x_i \sim X_t} \log\Big(1 - h\big(D(E(x_i))\big)\Big)$$

wherein, $\theta_E$ is the parameter of the feature extraction network, $\theta_D$ is the parameter of the domain discrimination network, and $h(D(E(x_i)))$ is the probability that image $x_i$ comes from the source domain. Through this objective, the domain discrimination network tries to make the predicted source-domain probability as large as possible for source-domain images and as small as possible for target-domain images; the feature extraction network acts in the opposite direction, aiming to extract domain-invariant features so as to confuse the domain discrimination network into misjudging.
In order to make the extracted source-domain features more discriminative, as shown in fig. 2, the extracted source-domain image features pass through the feature discrimination boosting module. Assuming each class has a class center, the features of source-domain images need to be close to the center of their corresponding class so as to reduce the number of samples in inter-class regions; the following center loss function is therefore defined:

$$\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \left\| E(x_i) - c_{y_i} \right\|_2^2$$

wherein, $c_{y_i}$ is the class center point of class $y_i$. Usually, the center point of each class would be calculated from the features of all images of that class in the entire dataset; since the batch-wise training of the network makes computation over the entire dataset infeasible, the class center points are updated continuously during training by the following iterative scheme. At the first iteration, the center point of each class is initialized with the data of the current batch, and the center points are then updated as:

$$c_k^{t+1} = (1 - \gamma)\, c_k^{t} + \gamma\, \tilde{c}_k^{\,t}, \qquad k = 1, \dots, K$$

wherein, $c_k^{t+1}$ is the center point of the k-th class at the (t+1)-th iteration, $\gamma$ is the update rate of the class center points, K is the number of classes, and the batch center $\tilde{c}_k^{\,t}$ is computed as:

$$\tilde{c}_k^{\,t} = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k)\, E(x_i)$$

wherein, $B_t$ is the batch data at the t-th iteration; $I(\cdot)$ represents the indicator function, with $I(y_i = k) = 1$ when $y_i = k$ is true and 0 otherwise; $N_k$ is the number of class-k samples in the batch.
through the center loss function, the features of the images of the same category obtained by the feature extraction module can be gathered together, so that the discrimination of the features is increased.
In addition to the marginal probability distributions, the conditional probability distributions of the two domains also differ, i.e., P(Y|X) differs. Since the target domain has no label information, it is difficult to align the conditional probability distributions of the two domains directly, so they are approximately aligned through the class-conditional distributions P(X|Y). Using the classes of the target-domain images obtained in the category prediction step, the following loss function is designed:

$$\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \left\| E(x_i) - c_{\hat{y}_i} \right\|_2^2$$

wherein, $\hat{y}_i$ is the prediction label of sample $x_i$, and $\Phi(X_t)$ is the subset of the target domain whose samples satisfy the following condition:

$$\Phi(X_t) = \{\, x_i \in X_t \mid \max(p(x_i)) > T \,\}$$

wherein, $p(x_i)$ is a K-dimensional vector whose k-th dimension represents the probability that sample $x_i$ belongs to class k; T is a threshold, and a predicted label is trusted only if its probability is greater than this threshold.
In summary, by sharing the same feature extraction network between the two domains and making the domain discrimination network unable to judge the source of a feature, the feature extraction network is forced to extract the features shared between the two domains, i.e., the semantic information of the images, while ignoring the information specific to each domain. Since the extracted features are domain-invariant, the class prediction network trained on source-domain images can also be applied to the target domain, realizing domain adaptation. The introduction of the central loss function makes the features of same-class images in the source domain more aggregated and compact, improving the discriminative power of the features. In addition, aligning the conditional probability distributions of the two domains lets the target-domain features keep good discriminative power, improving the category prediction accuracy on the target domain. Finally, because the same feature extraction network is shared, the image features of the two domains are learned jointly, and at test time it is not necessary to know whether an image comes from the source domain or the target domain.
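Pulling the pieces together, the following hedged PyTorch-style sketch shows one joint training step over the four objectives, reusing the hypothetical helpers sketched earlier (`target_center_loss`, `update_centers`, the networks E, C, D); the loss weights `lam_adv`, `lam_cs`, `lam_ct` are hypothetical hyper-parameters, since the patent does not specify how the loss terms are balanced, and the discriminator's own maximization step, shown earlier, would alternate with this one:

```python
import torch
import torch.nn.functional as F

def train_step(E, C, D, centers, opt, x_src, y_src, x_tgt,
               lam_adv=0.1, lam_cs=0.01, lam_ct=0.01, T=0.9):
    """One joint step over L_s, L_adv, L_cs and L_ct for E and C."""
    f_src, f_tgt = E(x_src), E(x_tgt)

    loss_cls = F.cross_entropy(C(f_src), y_src)               # L_s
    p_src, p_tgt = D(f_src), D(f_tgt)
    loss_adv = (torch.log(p_src + 1e-8).mean()
                + torch.log(1.0 - p_tgt + 1e-8).mean())       # L_adv, minimized by E
    loss_cs = ((f_src - centers[y_src]) ** 2).sum(1).mean()   # L_cs, source centers
    probs_t = F.softmax(C(f_tgt), dim=1)                      # p(x_i) on the target
    loss_ct = target_center_loss(f_tgt, probs_t, centers, T)  # L_ct over Phi(X_t)

    loss = loss_cls + lam_adv * loss_adv + lam_cs * loss_cs + lam_ct * loss_ct
    opt.zero_grad(); loss.backward(); opt.step()
    return update_centers(centers, f_src.detach(), y_src)     # refresh c_k
```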
Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus and modules thereof provided by the present invention purely as computer-readable program code, the same functions can be implemented entirely by logically programming the method steps, so that the systems, apparatus and modules are realized in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. The systems, apparatus and modules provided by the present invention can therefore be regarded as hardware components; the modules they contain for realizing various programs can also be regarded as structures within the hardware components, and modules for realizing various functions can be regarded both as software programs implementing the methods and as structures within the hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Claims (10)
1. An unsupervised domain adaptation method based on adversarial learning, comprising:
a feature extraction step: extracting features of the source-domain and target-domain images with a feature extraction network to obtain the source-domain image features and the target-domain image features;
a category prediction step: predicting, from the obtained source-domain and target-domain image features, the probability that each image belongs to each category, obtaining the category prediction probability;
a domain discrimination step: predicting, from the obtained source-domain and target-domain image features, the probability that the features come from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probability;
an adversarial learning step: designing a loss function on the obtained domain prediction probability and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discrimination boosting step: improving the discriminative power of the obtained source-domain image features with a central loss function;
a conditional probability alignment step: performing conditional probability alignment on the obtained source-domain and target-domain image features according to the obtained category prediction probability.
2. The unsupervised domain adaptation method based on adversarial learning according to claim 1, characterized in that said feature extraction step:
inputs the source-domain and target-domain images into the feature extraction network and extracts the source-domain image features and the target-domain image features;
the feature extraction network is a deep convolutional neural network;
the source-domain and target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images have no label information.
3. The unsupervised domain adaptation method based on adversarial learning according to claim 2, characterized in that said class prediction step:
predicts, from the obtained source-domain and target-domain image features, the probability of each category with a classification network composed of a fully connected layer and a softmax layer, obtaining the category prediction probability;
a category refers to the kind of object contained in the image;
the category prediction step comprises:
a probability calculation step: denote the feature extraction network by E; $E(x_i)$ represents the feature of image $x_i$ extracted by the feature extraction network, of dimensionality N; C represents the classification network formed by the fully connected layer; assuming a total of K classes, the parameters of the fully connected layer form an N×K matrix, denoted W, and the output of the fully connected layer is:

$$C(E(x_i)) = W^{\top} E(x_i)$$
wherein,
$E(x_i)$ represents the feature of image $x_i$ extracted by the feature extraction network;
$C(E(x_i))$ represents the output obtained after the input $E(x_i)$ passes through the fully connected layer of the classification network C;
the superscript ⊤ denotes transposition;
W represents the N×K parameter matrix of the fully connected layer;
the output of the fully connected layer is converted by the softmax layer into the probability of image $x_i$ belonging to each class, where the probability that image $x_i$ belongs to class k is:

$$P_k(x_i) = \frac{e^{[W^{\top} E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^{\top} E(x_i)]_j}}$$

wherein,
$P_k(x_i)$ represents the probability that image $x_i$ belongs to class k;
e represents the natural constant;
$[W^{\top} E(x_i)]_k$ is the value of the k-th dimension of $W^{\top} E(x_i)$;
after calculating the probability of the image belonging to each category, the prediction label of image $x_i$, i.e., the class with the highest prediction probability, is obtained as follows:

$$\hat{y}_i = \arg\max_{k} P_k(x_i)$$

wherein,
$\hat{y}_i$ represents the category prediction label of image $x_i$;
a classification network learning step: for the labelled source-domain images, comparing the per-class probabilities of image $x_i$ obtained in the probability calculation step with the corresponding labels, the following classification network loss function can be calculated:

$$\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)}\, H\big(P(x_i),\, y_i\big)$$

wherein, $L_s$ represents the class prediction loss function;
$\min_{\theta_E, \theta_C}$ represents the training target of the network: optimize the parameter $\theta_E$ of the feature extraction network and the parameter $\theta_C$ of the class prediction network to minimize the class prediction loss;
$\theta_E$ represents the parameters of the feature extraction network;
$\theta_C$ represents the parameters of the class prediction network;
$(X_s, Y_s)$ represents the distribution of source-domain images and labels;
$P(x_i)$ represents the probabilities of image $x_i$ for each category;
$x_i$ represents an image of the source domain;
$y_i$ represents the class label, in one-hot form, i.e., the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and all other dimensions are 0;
H represents the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and the procedure returns to the probability calculation step to continue execution.
4. The unsupervised domain adaptation method based on adversarial learning according to claim 3, characterized in that said domain discrimination step:
the feature of image $x_i$ extracted by the feature extraction network E is $E(x_i)$, of dimensionality N; D represents the domain discrimination network composed of fully connected layers, whose output is $D(E(x_i))$; let $D(E(x_i))$ have dimension 1 and convert it to the interval [0, 1] with the sigmoid function h; $h(D(E(x_i)))$ represents the probability that image $x_i$ comes from the source domain, and $1 - h(D(E(x_i)))$ represents the probability that the image comes from the target domain, where the sigmoid function can be expressed as:

$$h(z) = \frac{1}{1 + e^{-z}}$$
wherein,
$h(D(E(x_i)))$ represents the probability that image $x_i$ comes from the source domain;
$D(E(x_i))$ represents the output of the domain discrimination network composed of fully connected layers.
The adversarial learning step:
according to the obtained domain prediction probability, an adversarial learning objective function is adopted to make the feature extraction network and the domain discrimination network learn adversarially: the domain discrimination network tries to distinguish source-domain images from target-domain images, while the feature extraction network extracts domain-invariant features so as to confuse the domain discrimination network and cause it to misjudge, so that the domain discrimination network cannot tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to image semantic information shared by the source and target domains;
the adversarial learning objective function means that the feature extraction network minimizes the adversarial loss function while the domain discrimination network maximizes it, as follows:

$$\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h\big(D(E(x_i))\big) + \mathbb{E}_{x_i \sim X_t} \log\Big(1 - h\big(D(E(x_i))\big)\Big)$$

wherein,
$L_{adv}$ represents the adversarial loss function;
$\min_{\theta_E} \max_{\theta_D}$ indicates that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
$X_s$ represents the set of source-domain samples;
$X_t$ represents the set of target-domain samples;
$\theta_D$ is the parameter of the domain discrimination network.
5. The unsupervised domain adaptation method based on adversarial learning according to claim 4, characterized in that said feature discrimination boosting step:
sets a class center point for each class and adds a center loss function so that the source-domain image features are drawn close to the center point of their corresponding class, reducing the number of samples scattered in inter-class regions and improving the discriminative power of the extracted features;
the center loss function: by calculating the Euclidean distance between each feature and the corresponding class center point, the following center loss function can be obtained:

$$\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \left\| E(x_i) - c_{y_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{cs}$ represents the minimized central loss function of the source domain;
$L_{cs}$ represents the central loss function of the source domain, which is associated with the feature extraction network E;
the training target of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the source domain;
$c_{y_i}$ is the class center point of class $y_i$;
during the first iteration, the center point of each class is initialized with the data of the current batch; the center points are then updated as follows:

$$c_k^{t+1} = (1 - \gamma)\, c_k^{t} + \gamma\, \tilde{c}_k^{\,t}, \qquad k = 1, \dots, K$$

wherein,
$c_k^{t+1}$ is the center point of the k-th class at the (t+1)-th iteration;
$c_k^{t}$ represents the center point of the k-th class at the t-th iteration;
$\gamma$ is the update rate of the class center points;
$K$ is the number of classes;
$\tilde{c}_k^{\,t}$ represents the center point of class k computed from the current batch; its calculation formula is:

$$\tilde{c}_k^{\,t} = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k)\, E(x_i)$$

wherein,
$B_t$ is the batch data at the t-th iteration;
$I(\cdot)$ represents the indicator function: $I(y_i = k) = 1$ when $y_i = k$ is true, and 0 otherwise;
$N_k$ is the number of class-k samples in the batch.
6. The unsupervised domain adaptation method based on adversarial learning according to claim 5, characterized in that said conditional probability alignment step:
according to the obtained category prediction labels $\hat{y}_i$ of the target-domain images, a central loss function of the target domain is designed and minimized so as to align the class-conditional probabilities P(X|Y) of the source-domain and target-domain images; the target-domain images are thereby drawn close to their corresponding class centers, aligning the distributions of the two domains and giving the unlabeled target-domain image features discriminative power;
the minimized central loss function of the target domain is expressed as follows:

$$\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \left\| E(x_i) - c_{\hat{y}_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{ct}$ represents the minimized central loss function of the target domain;
$L_{ct}$ represents the central loss function of the target domain, obtained by computing the Euclidean distance between each target-domain sample feature and its corresponding class center point;
the optimization objective of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the target domain;
$c_{\hat{y}_i}$ represents the class center point of category $\hat{y}_i$;
$\Phi(X_t)$ is the subset of the target domain whose samples satisfy the following condition:

$$\Phi(X_t) = \{\, x_i \in X_t \mid \max(p(x_i)) > T \,\}$$
wherein,
$p(x_i)$ represents a K-dimensional vector whose k-th dimension represents the probability that sample $x_i$ belongs to class k;
$T$ represents a threshold: a predicted label is trusted only if its probability is greater than this threshold.
7. An unsupervised domain adaptation system based on adversarial learning, comprising:
a feature extraction module: extracting features of the source-domain and target-domain images with a feature extraction network to obtain the source-domain image features and the target-domain image features;
a category prediction module: predicting, from the obtained source-domain and target-domain image features, the probability that each image belongs to each category, obtaining the category prediction probability;
a domain discrimination module: predicting, from the obtained source-domain and target-domain image features, the probability that the features come from the source domain or the target domain through a domain discrimination network, obtaining the domain prediction probability;
an adversarial learning module: designing a loss function on the obtained domain prediction probability and making the feature extraction network and the domain discrimination network learn adversarially, so that the feature extraction network can extract domain-invariant features;
a feature discrimination boosting module: improving the discriminative power of the obtained source-domain image features with a central loss function;
a conditional probability alignment module: performing conditional probability alignment on the obtained source-domain and target-domain image features according to the obtained category prediction probability.
8. The unsupervised domain adaptation system based on adversarial learning according to claim 7, characterized in that the feature extraction module:
inputs the source-domain and target-domain images into the feature extraction network and extracts the source-domain image features and the target-domain image features;
the feature extraction network is a deep convolutional neural network;
the source-domain and target-domain images come from two differently distributed image sets for the same classification task; the source-domain images carry corresponding labels, while the target-domain images have no label information;
the category prediction module:
predicts, from the obtained source-domain and target-domain image features, the probability of each category with a classification network composed of a fully connected layer and a softmax layer, obtaining the category prediction probability;
a category refers to the kind of object contained in the image;
the category prediction module comprises:
a probability calculation module: denote the feature extraction network by E; $E(x_i)$ represents the feature of image $x_i$ extracted by the feature extraction network, of dimensionality N; C represents the classification network formed by the fully connected layer; assuming a total of K classes, the parameters of the fully connected layer form an N×K matrix, denoted W, and the output of the fully connected layer is:

$$C(E(x_i)) = W^{\top} E(x_i)$$
wherein,
$E(x_i)$ represents the feature of image $x_i$ extracted by the feature extraction network;
$C(E(x_i))$ represents the output obtained after the input $E(x_i)$ passes through the fully connected layer of the classification network C;
the superscript ⊤ denotes transposition;
W represents the N×K parameter matrix of the fully connected layer;
the output of the fully connected layer is converted by the softmax layer into the probability of image $x_i$ belonging to each class, where the probability that image $x_i$ belongs to class k is:

$$P_k(x_i) = \frac{e^{[W^{\top} E(x_i)]_k}}{\sum_{j=1}^{K} e^{[W^{\top} E(x_i)]_j}}$$

wherein,
$P_k(x_i)$ represents the probability that image $x_i$ belongs to class k;
e represents the natural constant;
$[W^{\top} E(x_i)]_k$ is the value of the k-th dimension of $W^{\top} E(x_i)$;
after calculating the probability of the image belonging to each category, the prediction label of image $x_i$, i.e., the class with the highest prediction probability, is obtained as follows:

$$\hat{y}_i = \arg\max_{k} P_k(x_i)$$

wherein,
$\hat{y}_i$ represents the category prediction label of image $x_i$;
a classification network learning module: for the labelled source-domain images, comparing the per-class probabilities of image $x_i$ obtained by the probability calculation module with the corresponding labels, the following classification network loss function can be calculated:

$$\min_{\theta_E, \theta_C} L_s = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)}\, H\big(P(x_i),\, y_i\big)$$

wherein, $L_s$ represents the class prediction loss function;
$\min_{\theta_E, \theta_C}$ represents the training target of the network: optimize the parameter $\theta_E$ of the feature extraction network and the parameter $\theta_C$ of the class prediction network to minimize the class prediction loss;
$\theta_E$ represents the parameters of the feature extraction network;
$\theta_C$ represents the parameters of the class prediction network;
$(X_s, Y_s)$ represents the distribution of source-domain images and labels;
$P(x_i)$ represents the probabilities of image $x_i$ for each category;
$x_i$ represents an image of the source domain;
$y_i$ represents the class label, in one-hot form, i.e., the label of the k-th class is a K-dimensional vector whose k-th dimension is 1 and all other dimensions are 0;
H represents the cross-entropy function;
the classification network C and the feature extraction network E are learned according to the obtained classification network loss function, yielding the learned classification network C and feature extraction network E, and the procedure returns to the probability calculation module to continue execution.
9. The unsupervised domain adaptation system based on adversarial learning according to claim 8, characterized in that the domain discrimination module:
the feature of image $x_i$ extracted by the feature extraction network E is $E(x_i)$, of dimensionality N; D represents the domain discrimination network composed of fully connected layers, whose output is $D(E(x_i))$; let $D(E(x_i))$ have dimension 1 and convert it to the interval [0, 1] with the sigmoid function h; $h(D(E(x_i)))$ represents the probability that image $x_i$ comes from the source domain, and $1 - h(D(E(x_i)))$ represents the probability that the image comes from the target domain, where the sigmoid function can be expressed as:

$$h(z) = \frac{1}{1 + e^{-z}}$$
wherein,
$h(D(E(x_i)))$ represents the probability that image $x_i$ comes from the source domain;
$D(E(x_i))$ represents the output of the domain discrimination network composed of fully connected layers;
the adversarial learning module:
according to the obtained domain prediction probability, an adversarial learning objective function is adopted to make the feature extraction network and the domain discrimination network learn adversarially: the domain discrimination network tries to distinguish source-domain images from target-domain images, while the feature extraction network extracts domain-invariant features so as to confuse the domain discrimination network and cause it to misjudge, so that the domain discrimination network cannot tell whether an image feature comes from the source domain or the target domain;
the domain-invariant features refer to image semantic information shared by the source and target domains;
the adversarial learning objective function means that the feature extraction network minimizes the adversarial loss function while the domain discrimination network maximizes it, as follows:

$$\min_{\theta_E} \max_{\theta_D} L_{adv} = \mathbb{E}_{x_i \sim X_s} \log h\big(D(E(x_i))\big) + \mathbb{E}_{x_i \sim X_t} \log\Big(1 - h\big(D(E(x_i))\big)\Big)$$

wherein,
$L_{adv}$ represents the adversarial loss function;
$\min_{\theta_E} \max_{\theta_D}$ indicates that the optimization target of the feature extraction network E is to minimize the adversarial loss function, while the optimization target of the domain discrimination network D is to maximize it;
$X_s$ represents the set of source-domain samples;
$X_t$ represents the set of target-domain samples;
$\theta_D$ is the parameter of the domain discrimination network;
the feature discrimination boosting module:
sets a class center point for each class and adds a center loss function so that the source-domain image features are drawn close to the center point of their corresponding class, reducing the number of samples scattered in inter-class regions and improving the discriminative power of the extracted features;
the center loss function: by calculating the Euclidean distance between each feature and the corresponding class center point, the following center loss function can be obtained:

$$\min_{\theta_E} L_{cs} = \mathbb{E}_{(x_i, y_i) \sim (X_s, Y_s)} \left\| E(x_i) - c_{y_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{cs}$ represents the minimized central loss function of the source domain;
$L_{cs}$ represents the central loss function of the source domain, which is associated with the feature extraction network E;
the training target of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the source domain;
$c_{y_i}$ is the class center point of class $y_i$;
during the first iteration, the center point of each class is initialized with the data of the current batch; the center points are then updated as follows:

$$c_k^{t+1} = (1 - \gamma)\, c_k^{t} + \gamma\, \tilde{c}_k^{\,t}, \qquad k = 1, \dots, K$$

wherein,
$c_k^{t+1}$ is the center point of the k-th class at the (t+1)-th iteration;
$c_k^{t}$ represents the center point of the k-th class at the t-th iteration;
$\gamma$ is the update rate of the class center points;
$K$ is the number of classes;
$\tilde{c}_k^{\,t}$ represents the center point of class k computed from the current batch; its calculation formula is:

$$\tilde{c}_k^{\,t} = \frac{1}{N_k} \sum_{x_i \in B_t} I(y_i = k)\, E(x_i)$$

wherein,
$B_t$ is the batch data at the t-th iteration;
$I(\cdot)$ represents the indicator function: $I(y_i = k) = 1$ when $y_i = k$ is true, and 0 otherwise;
$N_k$ is the number of class-k samples in the batch;
the conditional probability alignment module:
according to the obtained category prediction labels $\hat{y}_i$ of the target-domain images, a central loss function of the target domain is designed and minimized so as to align the class-conditional probabilities P(X|Y) of the source-domain and target-domain images; the target-domain images are thereby drawn close to their corresponding class centers, aligning the distributions of the two domains and giving the unlabeled target-domain image features discriminative power;
the minimized central loss function of the target domain is expressed as follows:

$$\min_{\theta_E} L_{ct} = \mathbb{E}_{x_i \sim \Phi(X_t)} \left\| E(x_i) - c_{\hat{y}_i} \right\|_2^2$$

wherein,
$\min_{\theta_E} L_{ct}$ represents the minimized central loss function of the target domain;
$L_{ct}$ represents the central loss function of the target domain, obtained by computing the Euclidean distance between each target-domain sample feature and its corresponding class center point;
the optimization objective of the network is: optimize the parameter $\theta_E$ of the feature extraction network to minimize the central loss of the target domain;
$c_{\hat{y}_i}$ represents the class center point of category $\hat{y}_i$;
$\Phi(X_t)$ is the subset of the target domain whose samples satisfy the following condition:

$$\Phi(X_t) = \{\, x_i \in X_t \mid \max(p(x_i)) > T \,\}$$
wherein,
$p(x_i)$ represents a K-dimensional vector whose k-th dimension represents the probability that sample $x_i$ belongs to class k;
$T$ represents a threshold: a predicted label is trusted only if its probability is greater than this threshold.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the unsupervised domain adaptation method based on adversarial learning of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910276847.5A CN110135579A (en) | 2019-04-08 | 2019-04-08 | Unsupervised field adaptive method, system and medium based on confrontation study |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910276847.5A CN110135579A (en) | 2019-04-08 | 2019-04-08 | Unsupervised field adaptive method, system and medium based on confrontation study |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110135579A true CN110135579A (en) | 2019-08-16 |
Family
ID=67569292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910276847.5A Pending CN110135579A (en) | 2019-04-08 | 2019-04-08 | Unsupervised field adaptive method, system and medium based on confrontation study |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110135579A (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110717426A (en) * | 2019-09-27 | 2020-01-21 | 卓尔智联(武汉)研究院有限公司 | Garbage classification method based on domain adaptive learning, electronic equipment and storage medium |
CN110837850A (en) * | 2019-10-23 | 2020-02-25 | 浙江大学 | Unsupervised domain adaptation method based on counterstudy loss function |
CN111091127A (en) * | 2019-12-16 | 2020-05-01 | 腾讯科技(深圳)有限公司 | Image detection method, network model training method and related device |
CN111179254A (en) * | 2019-12-31 | 2020-05-19 | 复旦大学 | Domain-adaptive medical image segmentation method based on feature function and counterstudy |
CN111259941A (en) * | 2020-01-10 | 2020-06-09 | 中国科学院计算技术研究所 | Cross-domain image classification method and system based on fine-grained domain self-adaption |
CN111368690A (en) * | 2020-02-28 | 2020-07-03 | 珠海大横琴科技发展有限公司 | Deep learning-based video image ship detection method and system under influence of sea waves |
CN111680622A (en) * | 2020-06-05 | 2020-09-18 | 上海一由科技有限公司 | Identity recognition method based on fostering environment |
CN111738315A (en) * | 2020-06-10 | 2020-10-02 | 西安电子科技大学 | Image classification method based on countermeasure fusion multi-source transfer learning |
CN112150407A (en) * | 2019-10-30 | 2020-12-29 | 重庆大学 | Deep learning detection method and system for inclusion defect of aerospace composite material of small sample |
CN112149689A (en) * | 2020-09-28 | 2020-12-29 | 上海交通大学 | Unsupervised domain adaptation method and system based on target domain self-supervised learning |
CN112232293A (en) * | 2020-11-09 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Image processing model training method, image processing method and related equipment |
CN112270367A (en) * | 2020-11-05 | 2021-01-26 | 四川大学 | Semantic information-based method for enhancing robustness of deep learning model |
CN112308158A (en) * | 2020-11-05 | 2021-02-02 | 电子科技大学 | Multi-source field self-adaptive model and method based on partial feature alignment |
CN112446239A (en) * | 2019-08-29 | 2021-03-05 | 株式会社理光 | Neural network training and target detection method, device and storage medium |
CN112990387A (en) * | 2021-05-17 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Model optimization method, related device and storage medium |
CN113065662A (en) * | 2020-01-02 | 2021-07-02 | 阿里巴巴集团控股有限公司 | Data processing method, self-learning system and electronic equipment |
CN113190733A (en) * | 2021-04-27 | 2021-07-30 | 中国科学院计算技术研究所 | Network event popularity prediction method and system based on multiple platforms |
CN113298189A (en) * | 2021-06-30 | 2021-08-24 | 广东工业大学 | Cross-domain image classification method based on unsupervised domain self-adaption |
CN113344044A (en) * | 2021-05-21 | 2021-09-03 | 北京工业大学 | Cross-species medical image classification method based on domain self-adaptation |
CN113361467A (en) * | 2021-06-30 | 2021-09-07 | 电子科技大学 | License plate recognition method based on field adaptation |
CN113688867A (en) * | 2021-07-20 | 2021-11-23 | 广东工业大学 | Cross-domain image classification method |
CN113887534A (en) * | 2021-12-03 | 2022-01-04 | 腾讯科技(深圳)有限公司 | Determination method of object detection model and related device |
WO2022193628A1 (en) * | 2021-03-15 | 2022-09-22 | 华南理工大学 | Colon lesion intelligent recognition method and system based on unsupervised transfer picture classification, and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009633A (en) * | 2017-12-15 | 2018-05-08 | 清华大学 | A kind of Multi net voting towards cross-cutting intellectual analysis resists learning method and system |
CN108053030A (en) * | 2017-12-15 | 2018-05-18 | 清华大学 | A kind of transfer learning method and system of Opening field |
CN108921281A (en) * | 2018-05-08 | 2018-11-30 | 中国矿业大学 | A kind of field adaptation method based on depth network and countermeasure techniques |
-
2019
- 2019-04-08 CN CN201910276847.5A patent/CN110135579A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009633A (en) * | 2017-12-15 | 2018-05-08 | 清华大学 | A kind of Multi net voting towards cross-cutting intellectual analysis resists learning method and system |
CN108053030A (en) * | 2017-12-15 | 2018-05-18 | 清华大学 | A kind of transfer learning method and system of Opening field |
CN108921281A (en) * | 2018-05-08 | 2018-11-30 | 中国矿业大学 | A kind of field adaptation method based on depth network and countermeasure techniques |
Non-Patent Citations (1)
Title |
---|
YEXUN ZHANG 等: "Domain-Invariant Adversarial Learning for Unsupervised Domain Adaption", 《ARXIV:1811.12751V1 [CS.CV]》 * |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112446239A (en) * | 2019-08-29 | 2021-03-05 | 株式会社理光 | Neural network training and target detection method, device and storage medium |
CN110717426A (en) * | 2019-09-27 | 2020-01-21 | 卓尔智联(武汉)研究院有限公司 | Garbage classification method based on domain adaptive learning, electronic equipment and storage medium |
CN110837850A (en) * | 2019-10-23 | 2020-02-25 | 浙江大学 | Unsupervised domain adaptation method based on adversarial learning loss function
CN110837850B (en) * | 2019-10-23 | 2022-06-21 | 浙江大学 | Unsupervised domain adaptation method based on adversarial learning loss function
CN112150407A (en) * | 2019-10-30 | 2020-12-29 | 重庆大学 | Deep learning detection method and system for inclusion defects in small-sample aerospace composite materials
CN112150407B (en) * | 2019-10-30 | 2022-09-30 | 重庆大学 | Deep learning detection method and system for inclusion defects in small-sample aerospace composite materials
CN111091127A (en) * | 2019-12-16 | 2020-05-01 | 腾讯科技(深圳)有限公司 | Image detection method, network model training method and related device |
CN111179254A (en) * | 2019-12-31 | 2020-05-19 | 复旦大学 | Domain-adaptive medical image segmentation method based on feature functions and adversarial learning
CN111179254B (en) * | 2019-12-31 | 2023-05-30 | 复旦大学 | Domain-adaptive medical image segmentation method based on feature functions and adversarial learning
CN113065662A (en) * | 2020-01-02 | 2021-07-02 | 阿里巴巴集团控股有限公司 | Data processing method, self-learning system and electronic equipment |
CN111259941A (en) * | 2020-01-10 | 2020-06-09 | 中国科学院计算技术研究所 | Cross-domain image classification method and system based on fine-grained domain adaptation
CN111259941B (en) * | 2020-01-10 | 2023-09-26 | 中国科学院计算技术研究所 | Cross-domain image classification method and system based on fine-grained domain adaptation
CN111368690A (en) * | 2020-02-28 | 2020-07-03 | 珠海大横琴科技发展有限公司 | Deep learning-based video image ship detection method and system under influence of sea waves |
CN111680622B (en) * | 2020-06-05 | 2023-08-01 | 上海一由科技有限公司 | Identity recognition method based on fostering environment
CN111680622A (en) * | 2020-06-05 | 2020-09-18 | 上海一由科技有限公司 | Identity recognition method based on fostering environment
CN111738315A (en) * | 2020-06-10 | 2020-10-02 | 西安电子科技大学 | Image classification method based on countermeasure fusion multi-source transfer learning |
CN112149689A (en) * | 2020-09-28 | 2020-12-29 | 上海交通大学 | Unsupervised domain adaptation method and system based on target domain self-supervised learning |
CN112149689B (en) * | 2020-09-28 | 2022-12-09 | 上海交通大学 | Unsupervised domain adaptation method and system based on target domain self-supervised learning |
CN112308158A (en) * | 2020-11-05 | 2021-02-02 | 电子科技大学 | Multi-source domain adaptation model and method based on partial feature alignment
CN112308158B (en) * | 2020-11-05 | 2021-09-24 | 电子科技大学 | Multi-source domain adaptation model and method based on partial feature alignment
CN112270367A (en) * | 2020-11-05 | 2021-01-26 | 四川大学 | Semantic information-based method for enhancing robustness of deep learning model |
CN112232293A (en) * | 2020-11-09 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Image processing model training method, image processing method and related equipment |
WO2022193628A1 (en) * | 2021-03-15 | 2022-09-22 | 华南理工大学 | Intelligent colon lesion recognition method and system based on unsupervised transfer image classification, and medium
CN113190733B (en) * | 2021-04-27 | 2023-09-12 | 中国科学院计算技术研究所 | Network event popularity prediction method and system based on multiple platforms |
CN113190733A (en) * | 2021-04-27 | 2021-07-30 | 中国科学院计算技术研究所 | Network event popularity prediction method and system based on multiple platforms |
CN112990387A (en) * | 2021-05-17 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Model optimization method, related device and storage medium |
CN112990387B (en) * | 2021-05-17 | 2021-07-20 | 腾讯科技(深圳)有限公司 | Model optimization method, related device and storage medium |
CN113344044A (en) * | 2021-05-21 | 2021-09-03 | 北京工业大学 | Cross-species medical image classification method based on domain adaptation
CN113344044B (en) * | 2021-05-21 | 2024-05-28 | 北京工业大学 | Cross-species medical image classification method based on domain adaptation
CN113298189A (en) * | 2021-06-30 | 2021-08-24 | 广东工业大学 | Cross-domain image classification method based on unsupervised domain adaptation
CN113298189B (en) * | 2021-06-30 | 2023-07-07 | 广东工业大学 | Cross-domain image classification method based on unsupervised domain adaptation
CN113361467A (en) * | 2021-06-30 | 2021-09-07 | 电子科技大学 | License plate recognition method based on domain adaptation
CN113688867A (en) * | 2021-07-20 | 2021-11-23 | 广东工业大学 | Cross-domain image classification method |
CN113887534A (en) * | 2021-12-03 | 2022-01-04 | 腾讯科技(深圳)有限公司 | Determination method of object detection model and related device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110135579A (en) | Unsupervised field adaptive method, system and medium based on confrontation study | |
Winkens et al. | Contrastive training for improved out-of-distribution detection | |
CN113076994B (en) | Open-set domain-adaptive image classification method and system |
Dara et al. | Clustering unlabeled data with SOMs improves classification of labeled real-world data | |
Tarawneh et al. | Invoice classification using deep features and machine learning techniques | |
CN113657425A (en) | Multi-label image classification method based on multi-scale and cross-modal attention mechanism | |
Feng et al. | Transductive multi-instance multi-label learning algorithm with application to automatic image annotation | |
Tax et al. | Bag dissimilarities for multiple instance learning | |
CN111898704B (en) | Method and device for clustering content samples | |
Xu et al. | Adaptively denoising proposal collection for weakly supervised object localization | |
CN112734037A (en) | Memory-guidance-based weakly supervised learning method, computer device and storage medium | |
CN115797701A (en) | Target classification method and device, electronic equipment and storage medium | |
Jiang et al. | Dynamic proposal sampling for weakly supervised object detection | |
CN111191033A (en) | Open set classification method based on classification utility | |
Aljundi et al. | Identifying wrongly predicted samples: A method for active learning | |
CN113177554A (en) | Thyroid nodule identification and segmentation method, system, storage medium and equipment | |
CN115565001A (en) | Active learning method based on maximum mean discrepancy adversarial learning |
CN112257787B (en) | Semi-supervised image classification method based on a generative dual-conditional adversarial network structure |
Meena Deshpande | License plate detection and recognition using yolo v4 | |
Lu et al. | Large Class Separation is not what you need for Relational Reasoning-based OOD Detection | |
CN113158878A (en) | Heterogeneous transfer fault diagnosis method, system and model based on subspace |
Jiang et al. | Learning from noisy labels with noise modeling network | |
Li et al. | Learning common and label-specific features for multi-label classification with missing labels | |
Feng et al. | Adaptive all-season image tag ranking by saliency-driven image pre-classification | |
WO2024207311A1 (en) | Method and apparatus for quantifying sample difficulty based on pre-trained models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |