CN113378904A - Image classification method based on adversarial domain adaptation network - Google Patents

Image classification method based on adversarial domain adaptation network

Info

Publication number
CN113378904A
CN113378904A (application CN202110607513.9A)
Authority
CN
China
Prior art keywords
domain
image
network
source domain
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110607513.9A
Other languages
Chinese (zh)
Other versions
CN113378904B (en)
Inventor
Jia Longfei
Li Jingjing
Du Zhekai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinchen (Chongqing) Microelectronics Co.,Ltd.
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202110607513.9A priority Critical patent/CN113378904B/en
Publication of CN113378904A publication Critical patent/CN113378904A/en
Application granted granted Critical
Publication of CN113378904B publication Critical patent/CN113378904B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image classification method based on an adversarial domain adaptation network (ADAN). The method redefines the adversarial loss function L_adv to overcome the generalization problem between the source domain and the target domain. Redefining the adversarial loss is costly, however: optimizing the new ADAN objective function, i.e. the adversarial loss, alone cannot guarantee that the source and target domain distributions become close. The invention therefore additionally minimizes a divergence, the metric loss function L_metric, to align the source domain and the target domain. By adopting both adversarial learning and metric learning, the method addresses the generalization problem of existing ADANs on the one hand, and ensures that the domain divergence is minimized during training on the other. Experiments show that the method is well suited to the unsupervised domain adaptation task and improves the classification performance (accuracy) on target domain images.

Description

Image classification method based on adversarial domain adaptation network
Technical Field
The invention belongs to the technical field of image classification, and particularly relates to an image classification method based on an adversarial domain adaptation network.
Background
One common assumption behind most machine learning models is that the source domain and the target domain have the same data distribution. However, this assumption cannot be guaranteed in real-world applications, which can result in a drastic degradation of classification performance on the target domain data. Domain adaptation has therefore been proposed to solve this problem by reducing the domain differences. The latest domain adaptation methods are based on adversarial learning; these methods are generally called adversarial domain adaptation networks (ADANs).
Fig. 1 is a schematic diagram illustrating the principle of an image classification method based on an adversarial domain adaptation network.
ADAN is similar to a generative adversarial network (GAN): it trains a feature representation network F (analogous to the generator in a GAN) and a domain discriminator D in an adversarial manner. Specifically, in an image classification method based on an adversarial domain adaptation network, images from the source domain and the target domain are each input into the feature representation network F for feature extraction to obtain image features, and the image features are then fed into the domain discriminator D. Once the domain discriminator D cannot distinguish whether the image features come from the source domain or the target domain, the learned image features are considered domain-invariant. Thus, the feature representation network F and the classifier C are trained on source domain images, and the trained F and C can then be applied to the classification of target domain images.
In domain adaptation, the source domain and the target domain have different data distributions, and the goal is to learn a new feature representation under which the two domains are well aligned. ADAN exploits the idea of adversarial learning, which assumes that the two domains are aligned as long as the domain discriminator is confused. However, recent work has shown that this assumption may not be reliable: ADAN inherits the shortcomings of GANs, and even if training succeeds, the learned distribution may be far from the expected distribution, which is referred to as the generalization problem in GANs. Therefore, even if the domain discriminator is successfully confused, there is no guarantee that the learned representation is domain-invariant; that is, the trained feature representation network F and classifier C may not classify target domain images well, and the classification performance needs to be improved.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an image classification method based on an adversarial domain adaptation network, which overcomes the generalization problem between the source domain and the target domain, aligns the two domains, and improves the classification performance (accuracy) on target domain images.
In order to achieve the above object, the present invention provides an image classification method based on an adversarial domain adaptation network, comprising the following steps:
(1) Construction of an adversarial domain adaptation network for image classification
Construct a feature representation network F, a domain discriminator D and a classifier C for the images in a source domain and a target domain;
wherein the feature representation network F performs feature extraction on an image x_s in the source domain to obtain source domain image features F(x_s), and on an image x_t in the target domain to obtain target domain image features F(x_t); the domain discriminator D discriminates the source domain image features F(x_s) and the target domain image features F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain; the classifier C classifies the source domain image features F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, ..., I, and classifies the target domain image features F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, ..., I;
(2) Training of the adversarial domain adaptation network
2.1) Extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}, and a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}, and input them into the feature representation network F to obtain M source domain image features:
F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M});
and N target domain image features:
F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N});
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) and the N target domain image features F(x_{t_1}), ..., F(x_{t_N}) into the domain discriminator D to obtain the probabilities of belonging to the source domain:
D(F(x_{s_1})), D(F(x_{s_2})), ..., D(F(x_{s_M}))
D(F(x_{t_1})), D(F(x_{t_2})), ..., D(F(x_{t_N}))
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) into the classifier C to obtain the image classification probabilities:
p_i(softmax(F(x_{s_1}))), p_i(softmax(F(x_{s_2}))), ..., p_i(softmax(F(x_{s_M}))), i = 1, 2, ..., I;
wherein I is the number of image classes and i is the class index;
2.2) Compute the overall training objective L:
L = min_F max_D L_adv + λ·min_F L_metric + β·min_C L_cls
wherein:
(the redefined adversarial loss L_adv; its equation is rendered as an image in the original document)
L_metric = E_{m≠m'}[k(F(x_{s_m}), F(x_{s_m'}))] + E_{n≠n'}[k(F(x_{t_n}), F(x_{t_n'}))] - 2·E_{m,n}[k(F(x_{s_m}), F(x_{t_n}))], with m, m' = 1, 2, ..., M and n, n' = 1, 2, ..., N
L_cls = -E_m[ Σ_{i=1}^{I} 1{i = y_{s_m}} · log p_i(softmax(F(x_{s_m}))) ], m = 1, 2, ..., M
wherein min_F max_D L_adv means: the network parameters of the feature representation network F are updated so that the adversarial loss function L_adv is minimized, while the network parameters of the discriminator D are updated so that L_adv is maximized, forming adversarial training; min_F L_metric means: the network parameters of F are updated so that the metric loss function L_metric is minimized; min_C L_cls means: the network parameters of the classifier C are updated so that the cross-entropy loss function L_cls is minimized; λ and β are balance parameters greater than 0;
wherein E denotes the expectation over all the element values in the braces, and k(·,·) is a Gaussian kernel function,
1{i = y_{s_m}} is the indicator function, which equals 1 when i = y_{s_m} and 0 otherwise; y_{s_m} is the true class (label value) of the source domain image x_{s_m}; and p_i(softmax(F(x_{s_m}))) denotes the classifier output, i.e. the probability that the source domain image features F(x_{s_m}) belong to class i;
(3) Image classification
An image of unknown class from the source domain or the target domain is input into the feature representation network F for feature extraction to obtain image features; the image features are then sent to the classifier C to obtain the probability of each class, and the class with the maximum probability value is the class of the input image.
The object of the invention is achieved as follows:
the invention discloses an image classification method based on an anti-domain adaptive network, which has the following defects that ADAN inherits the GANS: in the field of image classification technology, even if training is successful, the distribution of the learned target domain may be far from the expected distribution, that is, there is a generalization problem and it cannot be guaranteed that the learned image features are domain-invariant. Aiming at the problem, the invention redefines the resistance loss function LadvOvercoming the problem of source and target domain generalization, but redefining the loss-immunity function is costly, and the source domain cannot be guaranteed by merely optimizing a new ADAN target function, i.e., the loss-immunity functionAnd the target domain distribution is close, the invention therefore uses minimizing the additional divergence, i.e. the metric loss function LmetricTo align the source domain and the target domain. By adopting the antagonistic learning and the metric learning, on one hand, the generalization problem in the prior ADAN is challenged, and on the other hand, the domain divergence is ensured to be minimized in the training process. Experiments show that the method can be well suitable for the unsupervised field self-adaptive task, and the classification performance (accuracy rate) of the target domain image is improved.
Drawings
FIG. 1 is a schematic diagram of a prior-art image classification method based on an adversarial domain adaptation network;
FIG. 2 is a flowchart of an embodiment of the image classification method based on an adversarial domain adaptation network according to the present invention;
FIG. 3 is a schematic diagram of an embodiment of the adversarial domain adaptation network for image classification according to the present invention;
FIG. 4 shows the results of the model analysis of the present invention, where (a) gives the parameter sensitivity of λ, (b) gives the test errors at different iterations, (c) gives the total loss value during training, and (d) gives the distribution difference of different features;
FIG. 5 shows t-SNE visualizations of the learned representations and the results of the ablation study, where (a), (b) and (c) visualize the original (non-adapted) representation, the CDAN representation and the representation of the invention, respectively, and the number near each cluster is the corresponding class label; (d) gives the results of the ablation study, where w/o abbreviates without and adv abbreviates adversarial learning.
Fig. 6 shows qualitative image classification results.
Detailed Description
The following describes embodiments of the present invention with reference to the accompanying drawings so that those skilled in the art can better understand the invention. It is expressly noted that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
Since ADAN is similar to GANs, it also suffers from the generalization problem. To this end, the invention reformulates ADAN so that it is suitable for image classification. But the reformulation is costly: for example, optimizing only the new ADAN adversarial loss function L_adv gives no guarantee that the source and target domain distributions are close. Thus, the invention further aligns the two domains by minimizing an additional divergence. Finally, the supervised classification loss on the source domain is exploited to ensure the discriminability of the learned image features.
FIG. 2 is a flowchart of an embodiment of the image classification method based on an adversarial domain adaptation network according to the present invention.
In this embodiment, as shown in fig. 2, the image classification method based on the adversarial domain adaptation network of the present invention includes the following steps:
step S1: constructing an adaptive network of the anti-domain for image classification
As shown in fig. 3, a feature representation network F, a domain discriminator D and a classifier C are constructed for the images in the source domain and the target domain. The feature representation network F performs feature extraction on an image x_s in the source domain to obtain source domain image features F(x_s), and on an image x_t in the target domain to obtain target domain image features F(x_t); the domain discriminator D discriminates the source domain image features F(x_s) and the target domain image features F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain; the classifier C classifies the source domain image features F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, ..., I, and classifies the target domain image features F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, ..., I.
The feature representation network F, the domain discriminator D and the classifier C in fig. 3 are all conventional image feature extraction networks, discriminators and classifiers; for example, the feature representation network F may use the common ResNet-50 residual neural network, which is not described in detail here.
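As a concrete illustration, the following is a minimal PyTorch sketch of the three networks. Only the overall shapes come from the description (ResNet-50 for F, a three-layer fully connected MLP for D, and one FC layer plus softmax for C, per the implementation details reported below); the feature dimension, hidden width and class count used here are assumed values, not specified by the invention.

```python
import torch
import torch.nn as nn
from torchvision import models

class FeatureNet(nn.Module):
    """Feature representation network F: a ResNet-50 backbone (as in the patent)
    followed by a projection to a feature vector. feat_dim=256 is an assumption."""
    def __init__(self, feat_dim=256):
        super().__init__()
        backbone = models.resnet50(weights="IMAGENET1K_V1")  # pre-trained on ImageNet
        backbone.fc = nn.Identity()                          # drop the 1000-way head
        self.backbone = backbone
        self.proj = nn.Linear(2048, feat_dim)

    def forward(self, x):
        return self.proj(self.backbone(x))

class DomainDiscriminator(nn.Module):
    """Domain discriminator D: a three-layer MLP that outputs the probability
    that a feature comes from the source domain. hidden=1024 is an assumption."""
    def __init__(self, feat_dim=256, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, f):
        return self.net(f).squeeze(1)   # D(F(x)) in [0, 1]

class Classifier(nn.Module):
    """Classifier C: one FC layer; softmax is applied later to obtain p_i."""
    def __init__(self, feat_dim=256, num_classes=31):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, f):
        return self.fc(f)               # logits; p_i = softmax(logits)_i
```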
Step S2: train the adversarial domain adaptation network
Step S2.1: set the batch size to M and extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}; set the batch size to N and extract a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}; input them into the feature representation network F to obtain M source domain image features:
F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M});
and N target domain image features:
F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N}).
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) and the N target domain image features F(x_{t_1}), ..., F(x_{t_N}) into the domain discriminator D to obtain the probabilities of belonging to the source domain:
D(F(x_{s_1})), D(F(x_{s_2})), ..., D(F(x_{s_M}))
D(F(x_{t_1})), D(F(x_{t_2})), ..., D(F(x_{t_N}))
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) into the classifier C to obtain the image classification probabilities:
p_i(softmax(F(x_{s_1}))), p_i(softmax(F(x_{s_2}))), ..., p_i(softmax(F(x_{s_M}))), i = 1, 2, ..., I;
wherein I is the number of image classes and i is the class index.
Step S2.2: compute the overall training objective L:
L = min_F max_D L_adv + λ·min_F L_metric + β·min_C L_cls
wherein:
(the redefined adversarial loss L_adv; its equation is rendered as an image in the original document)
L_metric = E_{m≠m'}[k(F(x_{s_m}), F(x_{s_m'}))] + E_{n≠n'}[k(F(x_{t_n}), F(x_{t_n'}))] - 2·E_{m,n}[k(F(x_{s_m}), F(x_{t_n}))], with m, m' = 1, 2, ..., M and n, n' = 1, 2, ..., N
L_cls = -E_m[ Σ_{i=1}^{I} 1{i = y_{s_m}} · log p_i(softmax(F(x_{s_m}))) ], m = 1, 2, ..., M
wherein min_F max_D L_adv means: the network parameters of the feature representation network F are updated so that the adversarial loss function L_adv is minimized, while the network parameters of the discriminator D are updated so that L_adv is maximized, forming adversarial training; min_F L_metric means: the network parameters of F are updated so that the metric loss function L_metric is minimized; min_C L_cls means: the network parameters of the classifier C are updated so that the cross-entropy loss function L_cls is minimized; λ and β are balance parameters greater than 0.
Here E denotes the expectation over all the element values in the braces, k(·,·) is a Gaussian kernel function,
1{i = y_{s_m}} is the indicator function, which equals 1 when i = y_{s_m} and 0 otherwise; y_{s_m} is the true class (label value) of the source domain image x_{s_m}; and p_i(softmax(F(x_{s_m}))) denotes the classifier output, i.e. the probability that the source domain image features F(x_{s_m}) belong to class i.
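For illustration, here is a minimal PyTorch sketch of the three loss terms and of one possible training step. metric_loss and cls_loss follow the formulas above (the Gaussian kernel bandwidth sigma is an assumed hyper-parameter); because the redefined L_adv appears only as an image in the source document, a standard binary cross-entropy GAN loss is used below as a stand-in, which is an assumption and not the invention's actual L_adv.

```python
import torch
import torch.nn.functional as TF

def gaussian_kernel(a, b, sigma=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)); sigma is an assumed bandwidth."""
    return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))

def metric_loss(fs, ft, sigma=1.0):
    """L_metric: the MMD-style divergence from the formula above, computed on
    source features fs (M x d) and target features ft (N x d), excluding the
    m == m' and n == n' diagonal terms."""
    M, N = fs.size(0), ft.size(0)
    kss, ktt, kst = (gaussian_kernel(a, b, sigma)
                     for a, b in ((fs, fs), (ft, ft), (fs, ft)))
    e_ss = (kss.sum() - kss.diag().sum()) / (M * (M - 1))
    e_tt = (ktt.sum() - ktt.diag().sum()) / (N * (N - 1))
    return e_ss + e_tt - 2 * kst.mean()

def cls_loss(logits_s, labels_s):
    """L_cls: cross-entropy on the labelled source batch, i.e.
    -E[ sum_i 1{i = y_{s_m}} log p_i(softmax(F(x_{s_m}))) ]."""
    return TF.cross_entropy(logits_s, labels_s)

def adv_loss(d_src, d_tgt):
    """Stand-in adversarial term (assumption; the patent's L_adv is not
    recoverable here): E[log D(F(x_s))] + E[log(1 - D(F(x_t)))].
    D is updated to maximize it, F to minimize it."""
    eps = 1e-8
    return torch.log(d_src + eps).mean() + torch.log(1 - d_tgt + eps).mean()
```

Under the same assumptions, one training step realizing L = min_F max_D L_adv + λ·min_F L_metric + β·min_C L_cls with alternating updates might look as follows; the update order and the shared optimizer for F and C are design choices, not fixed by the description.

```python
def train_step(F_net, D_net, C_net, opt_FC, opt_D, xs, ys, xt, lam=1.0, beta=1.0):
    """One iteration over a source batch (xs, ys) and a target batch xt,
    using the networks and losses sketched above."""
    # Update D: ascend on L_adv (i.e. descend on its negative), with F frozen.
    with torch.no_grad():
        fs, ft = F_net(xs), F_net(xt)
    loss_d = -adv_loss(D_net(fs), D_net(ft))
    opt_D.zero_grad()
    loss_d.backward()
    opt_D.step()

    # Update F and C: descend on L_adv + lambda * L_metric + beta * L_cls.
    fs, ft = F_net(xs), F_net(xt)
    loss = (adv_loss(D_net(fs), D_net(ft))
            + lam * metric_loss(fs, ft)
            + beta * cls_loss(C_net(fs), ys))
    opt_FC.zero_grad()
    loss.backward()
    opt_FC.step()
    return loss.item()
```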
In the invention, aiming at the generalization problem of ADAN and inspired by the f-divergence, a new adversarial loss function L_adv is provided, namely:
(the redefined adversarial loss L_adv; its equation is rendered as an image in the original document)
The adversarial loss function L_adv is an F-distance; while it overcomes the generalization problem, it is costly. In particular, the F-distance may be small even if the source domain distribution P and the target domain distribution Q are not very close. Such a cost is unacceptable in a domain adaptation task, since it destroys the goal of domain adaptation, namely aligning the source domain distribution P and the target domain distribution Q. Therefore, in the invention, a metric loss function is additionally optimized to measure the divergence between P and Q, so that both the generalization distance and the distribution divergence can be controlled.
Step S3: image classification
An image of unknown class from the source domain or the target domain is input into the feature representation network F for feature extraction to obtain image features; the image features are then sent to the classifier C to obtain the probability of each class, and the class with the maximum probability value is the class of the input image.
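For concreteness, a minimal inference sketch matching this step, continuing the assumptions above:

```python
@torch.no_grad()
def classify(F_net, C_net, x):
    """Step S3: extract features with F, score with C, and return the class
    with the maximum probability for each input image."""
    probs = torch.softmax(C_net(F_net(x)), dim=1)   # p_i, i = 1, ..., I
    return probs.argmax(dim=1)
```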
Experimental verification
1. Introduction to the datasets
MNIST, USPS and SVHN are three widely used handwritten-digit datasets. Specifically, MNIST consists of 60,000 training images and 10,000 test images of size 28 × 28. USPS contains 7,291 training images and 2,007 test images of size 16 × 16. SVHN is an abbreviation for Street View House Numbers and is more challenging because its images have varying backgrounds; it contains more than 600,000 labelled digits cropped from street-view images. Each digit dataset has 10 classes, i.e. the digits 0 to 9. In this experiment, we used the same settings as existing image classification methods in order to make a fair comparison.
The dataset Office-31 includes 4,652 images of 31 categories that are common in office settings, such as calculators, projectors and printers. The images come from three subsets: Amazon (A), collected from amazon.com; DSLR (D), taken with a digital single-lens reflex camera; and Webcam (W), taken with a webcam.
Office-Home is a recently released large-scale dataset consisting of 15,500 images of 65 categories in office and home environments, such as knives, pens and keyboards. The images in this dataset come from 4 domains, namely Art (Ar), Clipart (Cl), Product (Pr) and Real World (RW). The style of the images is reflected by the domain name: Art includes paintings, sketches and artistic depictions; Clipart consists of clip-art images; Product consists of images without backgrounds; and Real World consists of images taken with a camera.
2. Experimental setup
2.1) Protocol
For fair comparison, we fully follow the unsupervised domain adaptation protocol widely used by existing image classification methods. Specifically, we use labelled source domain images and unlabelled target domain images. As in existing image classification methods, the evaluation metric is the classification accuracy on the target domain. The parameter λ is chosen by importance-weighted cross-validation on the source domain, and β is fixed to 1.
2.2) Implementation details
We implemented the method in PyTorch. The main parts of the implementation are the feature representation network and the domain discriminator. For digit recognition, we use the same feature network reported by Hoffman et al., which is very similar to the convolutional neural network LeNet. For the evaluations on Office-31 and Office-Home, we use the residual network ResNet-50 as our feature representation network, the same as that used by existing image classification methods. As shown in fig. 3, the discriminator is implemented as an MLP (multi-layer perceptron) with three fully connected layers, and the classifier as one FC (fully connected) layer followed by a softmax classifier. The ResNet-50 is pre-trained on the ImageNet dataset; the other networks, i.e. the discriminator, the classifier and the feature network for digit recognition, are trained from scratch on the evaluated datasets. We use mini-batch stochastic gradient descent with momentum 0.95 for optimization. The learning rate is updated dynamically according to the schedule reported in (Ganin and Lempitsky 2014).
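A sketch of this optimizer setup, continuing the networks above; the annealing formula and its constants (mu_0 = 0.01, alpha = 10, gamma = 0.75) follow the schedule commonly attributed to Ganin and Lempitsky (2014) and should be treated as assumptions here.

```python
import torch

# Mini-batch SGD with momentum 0.95, as stated in the text.
params_fc = list(F_net.parameters()) + list(C_net.parameters())
opt_FC = torch.optim.SGD(params_fc, lr=0.01, momentum=0.95)
opt_D = torch.optim.SGD(D_net.parameters(), lr=0.01, momentum=0.95)

def set_lr(optimizer, progress, mu0=0.01, alpha=10.0, gamma=0.75):
    """Dynamic learning rate mu_p = mu0 / (1 + alpha * p)**gamma, where
    p in [0, 1] is the training progress (assumed constants, see above)."""
    for group in optimizer.param_groups:
        group["lr"] = mu0 / (1.0 + alpha * progress) ** gamma
```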
2.3) Existing image classification methods for comparison
To fully validate the effectiveness and superiority of the invention, we compare it with the baseline ResNet-50 without adaptation and several state-of-the-art image classification methods. Specifically, we compare with two traditional two-step methods with up-to-date feature representations, GFK and TCA. Since the method of the invention can be regarded both as an ADAN-based and as a metric-learning-based image classification method, we compare it with the metric-learning-based method DDC and with the ADAN methods RevGrad, MCD, CyCADA and CDAN. Comparisons are also made with typical image classification methods from other paradigms, such as DRCN, CoGAN and UNIT. Since our implementation follows the standard unsupervised domain adaptation protocol and matches the setup of existing image classification methods, the baseline results are cited directly from the original papers.
3. Image classification result comparison
3.1) Digit recognition
First, we evaluate our approach on a relatively simple task: handwritten digit recognition. The digit recognition results are shown in Table 1, expressed as percentages.
TABLE 1 (digit recognition accuracy in percent; the table is rendered as an image in the original document)
In Table 1, the best digit recognition results are highlighted in bold. The same experimental setup was used for all image classification methods, except that UNIT used a larger training set on SVHN → MNIST. The digit recognition results of the compared image classification methods are cited from the corresponding literature; entries are missing where the authors did not run the test. ORACLE denotes the digit recognition result of a supervised model trained on the target domain.
Table 1 shows the digit recognition results for MNIST → USPS, USPS → MNIST and SVHN → MNIST. As can be seen from Table 1, domain adaptation is indeed effective for handling domain differences: on USPS → MNIST, the baseline without adaptation reaches only around 70%, whereas domain adaptation methods can achieve 98% accuracy, very close to that of a model trained directly on the target domain. This shows that for relatively simple domain adaptation tasks, the current state-of-the-art methods provide a practical and near-ideal solution. We can also see that the method of the invention achieves the best digit recognition results on all three evaluations. The improvements on MNIST → USPS and USPS → MNIST are relatively small because those results are already very close to the upper bound. The results on SVHN → MNIST, the most challenging of the three tests, clearly demonstrate that the invention performs much better than existing methods. It is worth noting that the invention is 4.4% higher than CDAN, although both are under the ADAN framework. This shows that the method has better generalization ability and that its learned representations are more transferable. The digit recognition results also verify that addressing the generalization problem in ADAN is reasonable and valuable.
3.2) Object recognition
The object recognition results on the Office-31 dataset are shown in Table 2, expressed as percentages.
TABLE 2 (object recognition accuracy on Office-31 in percent; the table is rendered as an image in the original document)
In Table 2, the best object recognition results are highlighted in bold. Since there are three domains in the dataset, we performed 3 × 2 = 6 evaluations in a pairwise fashion. A first conclusion is that the method of the invention achieves the best performance on all 6 evaluations. Looking more closely, we find that adversarial methods generally perform better than metric-based methods. We can also observe that the 6 evaluations differ markedly: when D and W are used as target domains, most methods achieve good results, but when A is the target domain, accuracy drops significantly. The results on W → A and D → A therefore reflect the ability to handle more challenging domain adaptation tasks, and the invention is clearly superior to the current state-of-the-art methods on both of these evaluations; specifically, the performance improvements on W → A and D → A are 4.2% and 2.7%, respectively. Finally, regarding the average improvement, it is worth noting that most compared methods already reach nearly 100% accuracy on W → D and D → W, so the average is dampened by these two evaluations.
3.3) Large-scale object recognition
Finally, the method is tested on the recently released large-scale dataset Office-Home. The domain-adaptive object recognition results on Office-Home are shown in Table 3, expressed as percentages.
TABLE 3 (object recognition accuracy on Office-Home in percent; the table is rendered as an image in the original document)
This dataset has 4 domains, so Table 3 shows the results of 4 × 3 = 12 cross-domain object recognition evaluations. These results also verify that the invention achieves the best object recognition results across the different evaluations. On average, the invention improves the object recognition result by 2.4% over the best existing result. Notably, the mean is computed over 12 evaluations, so such a gain is difficult to obtain. Compared with CDAN, the object recognition performance of the invention on both the classical dataset and the large-scale dataset shows that our proposal to handle the generalization problem in ADAN is reasonable and practical, which is why it achieves state-of-the-art object recognition results.
4. Model analysis
Taking SVHN → MNIST as an example, we further analyze the method of the invention by presenting parameter sensitivity, convergence, distribution divergence, visualization and qualitative results.
4.1) Parameter sensitivity
The invention involves two hyper-parameters, λ and β. β is fixed to 1 because this performs well in the evaluated tests. We present the digit recognition results of SVHN → MNIST with different λ values in Fig. 4(a). We observe that the method is relatively insensitive to λ when λ is small. The hyper-parameters can be tuned using importance-weighted cross-validation.
Fig. 4(b) and (c) also reflect the convergence of our method, and fig. 4(d) shows the difference in the distribution of the different features.
4.2) Convergence
It is widely accepted that adversarial models are difficult to train. Therefore, in Fig. 4(b) we present the test errors of the method at different iterations on SVHN → MNIST, reflecting its convergence, and in Fig. 4(c) the total loss of the method during training; these figures illustrate the convergence trend of the invention. It is readily observed that the test error and the total loss decrease almost monotonically as the number of iterations grows. The training of the invention is effective, stable and smooth.
4.3) Distribution divergence
While optimizing the neural network distance is beneficial for generalization, it comes at a cost: a small neural network distance does not guarantee a small distribution divergence. Therefore, we additionally optimize the divergence measure in the model.
Assessment was performed using SVHN → MNIST and W → A.
Fig. 5(d) reports the difference between the two domain distributions in terms of the maximum mean discrepancy. From Fig. 5(a), (b), (c), we can also see that the distribution divergence of the representation learned by the invention is the smallest, which demonstrates that the invention handles the shortcoming of the neural network distance well.
4.4) Feature visualization
For better understanding, we use t-SNE to visualize the representations learned by the invention, taking SVHN → MNIST as an example; the results are given in Fig. 5(a), (b), (c). The original representation and the CDAN representation are also visualized for comparison. We visualize the features of both the source and the target domain to show transferability and discriminability. From Fig. 5(a), (b), (c) it can be seen that the invention performs best: the two domains align well, and the different classes are clearly separated.
4.5) Ablation study
The invention consists of an adversarial learning part and a metric learning part. Therefore, we report the image classification results of the invention without adversarial learning and without metric learning, respectively, taking W → A as an example. The results are shown in Fig. 5(d). It can be seen that good overall performance is obtained only when both parts are optimized jointly. The results prove that the formulation of the invention is reasonable and practical.
4.6) Qualitative image classification results
In addition to the visualization results, we further examined samples that CDAN misclassifies but our method handles well. These experimental results verify the effectiveness of the method in handling the generalization problem.
Figure 6 gives qualitative image classification results. It shows 30 images randomly selected from the SVHN → MNIST evaluation, 3 for each category; the label under each image is the erroneous prediction (digit recognition result) of CDAN. These images are misclassified by CDAN but corrected by the invention; the correct label for each column, from left to right, is 0 through 9.
Due to space limitations, only some randomly selected images are given in Fig. 6. We can observe that some easily confused categories, such as handwritten 4 and 9, are misclassified by CDAN. The method of the invention, which considers the generalization property and the distribution divergence simultaneously, correctly recognizes the digits in all of these images and is therefore more robust.
5. Conclusion
Existing image classification methods pay little attention to the generalization problem in adversarial domain adaptation networks. The invention reformulates the conventional ADAN adversarial loss using a neural network distance with good generalization properties, and simultaneously optimizes the distribution divergence to reduce the cost introduced by the neural network distance. Extensive experiments prove that the method significantly improves accuracy compared with existing adversarial adaptive image classification methods.
Although illustrative embodiments of the present invention have been described above to facilitate understanding of the present invention by those skilled in the art, it should be understood that the present invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art, and all inventions utilizing the inventive concept are protected, provided such changes are within the spirit and scope of the present invention as defined by the appended claims.

Claims (2)

1. An image classification method based on an adversarial domain adaptation network, characterized by comprising the following steps:
(1) Construction of an adversarial domain adaptation network for image classification
Construct a feature representation network F, a domain discriminator D and a classifier C for the images in a source domain and a target domain;
wherein the feature representation network F performs feature extraction on an image x_s in the source domain to obtain source domain image features F(x_s), and on an image x_t in the target domain to obtain target domain image features F(x_t); the domain discriminator D discriminates the source domain image features F(x_s) and the target domain image features F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain; the classifier C classifies the source domain image features F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, ..., I, and classifies the target domain image features F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, ..., I;
(2) Training of the adversarial domain adaptation network
2.1) Extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}, and a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}, and input them into the feature representation network F to obtain M source domain image features:
F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M});
and N target domain image features:
F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N});
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) and the N target domain image features F(x_{t_1}), ..., F(x_{t_N}) into the domain discriminator D to obtain the probabilities of belonging to the source domain:
D(F(x_{s_1})), D(F(x_{s_2})), ..., D(F(x_{s_M}))
D(F(x_{t_1})), D(F(x_{t_2})), ..., D(F(x_{t_N}))
Input the M source domain image features F(x_{s_1}), ..., F(x_{s_M}) into the classifier C to obtain the image classification probabilities:
p_i(softmax(F(x_{s_1}))), p_i(softmax(F(x_{s_2}))), ..., p_i(softmax(F(x_{s_M}))), i = 1, 2, ..., I;
wherein I is the number of image classes and i is the class index;
2.2) Compute the overall training objective L:
L = min_F max_D L_adv + λ·min_F L_metric + β·min_C L_cls
wherein:
(the redefined adversarial loss L_adv; its equation is rendered as an image in the original document)
L_metric = E_{m≠m'}[k(F(x_{s_m}), F(x_{s_m'}))] + E_{n≠n'}[k(F(x_{t_n}), F(x_{t_n'}))] - 2·E_{m,n}[k(F(x_{s_m}), F(x_{t_n}))], with m, m' = 1, 2, ..., M and n, n' = 1, 2, ..., N
L_cls = -E_m[ Σ_{i=1}^{I} 1{i = y_{s_m}} · log p_i(softmax(F(x_{s_m}))) ], m = 1, 2, ..., M
wherein min_F max_D L_adv means: the network parameters of the feature representation network F are updated so that the adversarial loss function L_adv is minimized, while the network parameters of the discriminator D are updated so that L_adv is maximized, forming adversarial training; min_F L_metric means: the network parameters of F are updated so that the metric loss function L_metric is minimized; min_C L_cls means: the network parameters of the classifier C are updated so that the cross-entropy loss function L_cls is minimized; λ and β are balance parameters greater than 0;
wherein E denotes the expectation over all the element values in the braces, and k(·,·) is a Gaussian kernel function,
1{i = y_{s_m}} is the indicator function, which equals 1 when i = y_{s_m} and 0 otherwise; y_{s_m} is the true class (label value) of the source domain image x_{s_m}; and p_i(softmax(F(x_{s_m}))) denotes the classifier output, i.e. the probability that the source domain image features F(x_{s_m}) belong to class i;
(3) Image classification
An image of unknown class from the source domain or the target domain is input into the feature representation network F for feature extraction to obtain image features; the image features are then sent to the classifier C to obtain the probability of each class, and the class with the maximum probability value is the class of the input image.
2. The image classification method based on the adversarial domain adaptation network of claim 1, wherein the feature representation network F is the residual network ResNet-50.
CN202110607513.9A 2021-06-01 2021-06-01 Image classification method based on adversarial domain adaptation network Active CN113378904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110607513.9A CN113378904B (en) 2021-06-01 2021-06-01 Image classification method based on adversarial domain adaptation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110607513.9A CN113378904B (en) 2021-06-01 2021-06-01 Image classification method based on adversarial domain adaptation network

Publications (2)

Publication Number Publication Date
CN113378904A true CN113378904A (en) 2021-09-10
CN113378904B CN113378904B (en) 2022-06-14

Family

ID=77575267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110607513.9A Active CN113378904B (en) 2021-06-01 2021-06-01 Image classification method based on adversarial domain adaptation network

Country Status (1)

Country Link
CN (1) CN113378904B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673631A (en) * 2021-10-22 2021-11-19 广东众聚人工智能科技有限公司 Abnormal image detection method and device
CN114511737A (en) * 2022-01-24 2022-05-17 北京建筑大学 Training method of image recognition domain generalization model
CN114693972A (en) * 2022-03-29 2022-07-01 电子科技大学 Reconstruction-based intermediate domain self-adaptive method
CN117253097A (en) * 2023-11-20 2023-12-19 中国科学技术大学 Semi-supervision domain adaptive image classification method, system, equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443293A (en) * 2019-07-25 2019-11-12 天津大学 Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing
CN111368886A (en) * 2020-02-25 2020-07-03 华南理工大学 Sample screening-based label-free vehicle picture classification method
CN111524205A (en) * 2020-04-23 2020-08-11 北京信息科技大学 Image coloring processing method and device based on loop generation countermeasure network
CN111738315A (en) * 2020-06-10 2020-10-02 西安电子科技大学 Image classification method based on countermeasure fusion multi-source transfer learning
CN111814871A (en) * 2020-06-13 2020-10-23 浙江大学 Image classification method based on reliable weight optimal transmission
CN112131967A (en) * 2020-09-01 2020-12-25 河海大学 Remote sensing scene classification method based on multi-classifier anti-transfer learning
US20200411201A1 (en) * 2019-06-27 2020-12-31 Retrace Labs Systems And Method For Artificial-Intelligence-Based Dental Image To Text Generation
CN112183581A (en) * 2020-09-07 2021-01-05 华南理工大学 Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network
US20210073630A1 (en) * 2019-09-10 2021-03-11 Robert Bosch Gmbh Training a class-conditional generative adversarial network

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200411201A1 (en) * 2019-06-27 2020-12-31 Retrace Labs Systems And Method For Artificial-Intelligence-Based Dental Image To Text Generation
CN110443293A (en) * 2019-07-25 2019-11-12 天津大学 Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing
US20210073630A1 (en) * 2019-09-10 2021-03-11 Robert Bosch Gmbh Training a class-conditional generative adversarial network
CN111368886A (en) * 2020-02-25 2020-07-03 华南理工大学 Sample screening-based label-free vehicle picture classification method
CN111524205A (en) * 2020-04-23 2020-08-11 北京信息科技大学 Image coloring processing method and device based on loop generation countermeasure network
CN111738315A (en) * 2020-06-10 2020-10-02 西安电子科技大学 Image classification method based on countermeasure fusion multi-source transfer learning
CN111814871A (en) * 2020-06-13 2020-10-23 浙江大学 Image classification method based on reliable weight optimal transmission
CN112131967A (en) * 2020-09-01 2020-12-25 河海大学 Remote sensing scene classification method based on multi-classifier anti-transfer learning
CN112183581A (en) * 2020-09-07 2021-01-05 华南理工大学 Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AARON CHADHA et al.: "Improved Techniques for Adversarial Discriminative Domain Adaptation", IEEE Transactions on Image Processing *
JINGJING LI et al.: "Structured Domain Adaptation", IEEE Transactions on Circuits and Systems for Video Technology *
HE Hai: "Research on image classification with unsupervised domain adaptation based on sample screening", China Master's Theses Full-text Database, Information Science and Technology *
ZHAO Wencang et al.: "Unsupervised domain adaptation method based on discriminative models and adversarial loss", High Technology Letters *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673631A (en) * 2021-10-22 2021-11-19 广东众聚人工智能科技有限公司 Abnormal image detection method and device
CN113673631B (en) * 2021-10-22 2022-03-29 广东众聚人工智能科技有限公司 Abnormal image detection method and device
CN114511737A (en) * 2022-01-24 2022-05-17 北京建筑大学 Training method of image recognition domain generalization model
CN114511737B (en) * 2022-01-24 2022-09-09 北京建筑大学 Training method of image recognition domain generalization model
CN114693972A (en) * 2022-03-29 2022-07-01 电子科技大学 Reconstruction-based intermediate domain self-adaptive method
CN114693972B (en) * 2022-03-29 2023-08-29 电子科技大学 Intermediate domain field self-adaption method based on reconstruction
CN117253097A (en) * 2023-11-20 2023-12-19 中国科学技术大学 Semi-supervision domain adaptive image classification method, system, equipment and storage medium
CN117253097B (en) * 2023-11-20 2024-02-23 中国科学技术大学 Semi-supervision domain adaptive image classification method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN113378904B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
CN113378904B (en) Image classification method based on adversarial domain adaptation network
Zhang et al. Unsupervised multi-class domain adaptation: Theory, algorithms, and practice
Ruff et al. A unifying review of deep and shallow anomaly detection
Fang et al. Open set domain adaptation: Theoretical bound and algorithm
Shen et al. Wasserstein distance guided representation learning for domain adaptation
Zhu et al. Detecting corrupted labels without training a model to predict
Ding et al. DECODE: Deep confidence network for robust image classification
Chherawala et al. Feature set evaluation for offline handwriting recognition systems: application to the recurrent neural network model
Akbari et al. How does loss function affect generalization performance of deep learning? Application to human age estimation
Yu et al. Simple and effective stochastic neural networks
Littwin et al. The multiverse loss for robust transfer learning
Liu et al. Exploiting web images for fine-grained visual recognition by eliminating open-set noise and utilizing hard examples
Wu et al. Spatial–temporal relation reasoning for action prediction in videos
Li et al. Locality linear fitting one-class SVM with low-rank constraints for outlier detection
Gu et al. Unsupervised and semi-supervised robust spherical space domain adaptation
Wang et al. BP-triplet net for unsupervised domain adaptation: A Bayesian perspective
Li et al. Subspace-based minority oversampling for imbalance classification
Li et al. Robust multi-label semi-supervised classification
Hwang et al. Exploiting transferable knowledge for fairness-aware image classification
Yang et al. A feature learning approach for face recognition with robustness to noisy label based on top-N prediction
Lee et al. Neuralfp: out-of-distribution detection using fingerprints of neural networks
Zhao et al. Domain adaptation with feature and label adversarial networks
Du et al. Learning transferable and discriminative features for unsupervised domain adaptation
He et al. Addressing the Overfitting in Partial Domain Adaptation with Self-Training and Contrastive Learning
Ho et al. Document classification in a non-stationary environment: A one-class svm approach

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231027

Address after: Room 209, Building 1, No. 36 Xiyong Avenue, Shapingba District, Chongqing, 400000

Patentee after: Xinchen (Chongqing) Microelectronics Co.,Ltd.

Address before: 611731, No. 2006, West Avenue, Chengdu hi tech Zone (West District, Sichuan)

Patentee before: University of Electronic Science and Technology of China

TR01 Transfer of patent right