CN113378904A

CN113378904A - Image classification method based on adversarial domain adaptation network

Info

Publication number: CN113378904A
Application number: CN202110607513.9A
Authority: CN
Inventors: 贾龙飞; 李晶晶; 杜哲凯
Original assignee: University of Electronic Science and Technology of China
Current assignee: Xinchen (Chongqing) Microelectronics Co.,Ltd.
Priority date: 2021-06-01
Filing date: 2021-06-01
Publication date: 2021-09-10
Anticipated expiration: 2041-06-01
Also published as: CN113378904B

Abstract

The invention discloses an image classification method based on an adversarial domain adaptation network (ADAN). The method redefines the adversarial loss function L_adv to overcome the generalization problem between the source domain and the target domain. This redefinition comes at a cost: optimizing only the new ADAN objective function, i.e. the adversarial loss function, cannot guarantee that the source-domain and target-domain distributions are close. The invention therefore additionally minimizes a divergence, namely the metric loss function L_metric, to align the source domain and the target domain. By combining adversarial learning and metric learning, the method addresses the generalization problem of existing ADANs on the one hand and ensures that the domain divergence is minimized during training on the other. Experiments show that the method is well suited to unsupervised domain adaptation tasks and improves the classification performance (accuracy) on target-domain images.

Description

Image classification method based on adversarial domain adaptation network

Technical Field

The invention belongs to the technical field of image classification, and particularly relates to an image classification method based on an adversarial domain adaptation network.

Background

A common assumption behind most machine learning models is that the source domain and the target domain share the same data distribution. This assumption often cannot be guaranteed in real-world applications, and violating it can drastically degrade classification performance on target-domain data. Domain adaptation has been proposed to solve this problem by reducing the difference between domains. The latest domain adaptation methods are based on adversarial learning and are generally called Adversarial Domain Adaptation Networks (ADAN).

Fig. 1 is a schematic diagram illustrating the principle of an image classification method based on an adversarial domain adaptation network.

ADAN is similar to generative adversarial networks (GANs): it trains a feature representation network F (analogous to the generator in a GAN) and a domain discriminator D in an adversarial manner. Specifically, in an image classification method based on an adversarial domain adaptation network, images from the source domain and the target domain are each input into the feature representation network F for feature extraction, and the resulting image features are sent to the domain discriminator D. Once the domain discriminator D cannot distinguish whether an image feature comes from the source domain or the target domain, the learned image features are considered domain-invariant. The feature representation network F and the classifier C are therefore trained on source-domain images, and the trained F and C can then be applied to classify target-domain images.

In domain adaptation, the source domain and the target domain have different data distributions, and the goal is to learn a new feature representation under which the two domains are well aligned. ADAN exploits the idea of adversarial learning and assumes that the two domains are aligned as long as the domain discriminator is confused. However, recent advances have shown that this assumption may not be reliable. ADAN inherits a shortcoming of GANs: even if training succeeds, the learned distribution may be far from the expected distribution, which is known as the generalization problem in GANs. Therefore, even if the domain discriminator is successfully confused, there is no guarantee that the learned representation is domain-invariant. In other words, the trained feature representation network F and classifier C may not classify target-domain images well, and the classification performance needs to be improved.

Disclosure of Invention

The object of the invention is to overcome the defects of the prior art by providing an image classification method based on an adversarial domain adaptation network that overcomes the generalization problem between the source domain and the target domain, aligns the two domains, and improves the classification performance (accuracy) on target-domain images.

To achieve the above object, the invention provides an image classification method based on an adversarial domain adaptation network, comprising the following steps:

(1) Constructing an adversarial domain adaptation network for image classification

Constructing a feature representation network F, a domain discriminator D and a classifier C aiming at images in a source domain and a target domain;

wherein the feature representation network F performs feature extraction on an image x_s in the source domain to obtain the source-domain image feature F(x_s), and on an image x_t in the target domain to obtain the target-domain image feature F(x_t); the domain discriminator D discriminates the source-domain image feature F(x_s) and the target-domain image feature F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain; and the classifier C classifies the source-domain image feature F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, …, I, and classifies the target-domain image feature F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, …, I;

(2) Training the adversarial domain adaptation network

2.1) Extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}, and a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}, and input them respectively into the feature representation network F to obtain M source-domain image features:

F(x_{s_1})，F(x_{s_2}),...,F(x_{s_M})；

and N target domain image features:

F(x_{t_1}),F(x_{t_2}),...,F(x_{t_N})；

The M source-domain image features F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M}) and the N target-domain image features F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N}) are respectively input into the domain discriminator D to obtain the probabilities of belonging to the source domain:

D(F(x_{s_1})),D(F(x_{s_2})),...,D(F(x_{s_M}))

D(F(x_{t_1})),D(F(x_{t_2})),...,D(F(x_{t_N}))

The M source-domain image features F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M}) are respectively input into the classifier C to obtain the image classification probabilities:

p_i(softmax(F(x_{s_1})))，p_i(softmax(F(x_{s_2})))，…，p_i(softmax(F(x_{s_M})))，i＝1，2，…，I；

where i is the index of the image class and I is the number of image classes;

2.2) Calculate the overall training objective function L:

L = min_F max_D L_adv + λ min_F L_metric + β min_C L_cls

wherein:

L_metric = E{k(F(x_{s_m}), F(x_{s_m′}))}_{m=1,2,…,M, m′=1,2,…,M, m≠m′} + E{k(F(x_{t_n}), F(x_{t_n′}))}_{n=1,2,…,N, n′=1,2,…,N, n≠n′} − 2E{k(F(x_{s_m}), F(x_{t_n}))}_{m=1,2,…,M, n=1,2,…,N}

where min_F max_D L_adv means that the network parameters of the feature representation network F are updated to minimize the adversarial loss function L_adv while the network parameters of the domain discriminator D are updated to maximize L_adv, which forms adversarial training; min_F L_metric means that the network parameters of the feature representation network F are updated to minimize the metric loss function L_metric; min_C L_cls means that the network parameters of the classifier C are updated to minimize the cross-entropy loss function L_cls; and λ and β are balance parameters greater than 0;

where E{·} denotes the expected value over all elements in the braces and k(·,·) is a Gaussian kernel function;

the indicator function equals 1 when i = y_{s_m} and 0 otherwise, y_{s_m} is the true class (label value) of the source-domain image x_{s_m}, and p_i(softmax(F(x_{s_m}))) is the probability, output by the classifier, that the source-domain image feature F(x_{s_m}) belongs to class i;

(3) Image classification

Inputting an image of an unknown class from a source domain or a target domain into a feature representation network F for feature extraction to obtain image features, and then sending the image features into a classifier C to obtain probabilities belonging to each class, wherein the class corresponding to the maximum probability value is the class of the input image.
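The decision rule of step (3) can be sketched as follows. This is a minimal illustration in plain Python rather than the implementation of the invention: `softmax` and `predict_class` are hypothetical helper names, and the input is assumed to be the raw classifier scores for a single image.

```python
import math

def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """Return the class index i in 1..I with the largest probability p_i."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda idx: probs[idx])
    return best + 1  # classes are numbered from 1, as in the text

# Hypothetical scores for an image with I = 3 classes.
predicted = predict_class([0.1, 2.0, -1.0])
```

Because softmax is monotonic, taking the maximum over the raw scores would select the same class; the probabilities are computed here only to mirror the wording of step (3).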

The object of the invention is achieved as follows:

The image classification method based on an adversarial domain adaptation network addresses a shortcoming that ADAN inherits from GANs: in image classification, even if training succeeds, the learned target-domain distribution may be far from the expected distribution. In other words, there is a generalization problem, and the learned image features cannot be guaranteed to be domain-invariant. To address this problem, the invention redefines the adversarial loss function L_adv to overcome the generalization problem between the source domain and the target domain. This redefinition comes at a cost: optimizing only the new ADAN objective function, i.e. the adversarial loss function, cannot guarantee that the source-domain and target-domain distributions are close. The invention therefore additionally minimizes a divergence, namely the metric loss function L_metric, to align the source domain and the target domain. By combining adversarial learning and metric learning, the method addresses the generalization problem of existing ADANs on the one hand and ensures that the domain divergence is minimized during training on the other. Experiments show that the method is well suited to unsupervised domain adaptation tasks and improves the classification performance (accuracy) on target-domain images.

Drawings

FIG. 1 is a schematic diagram of an image classification method based on an adversarial domain adaptation network in the prior art;

FIG. 2 is a flowchart of an embodiment of the image classification method based on an adversarial domain adaptation network according to the present invention;

FIG. 3 is a schematic diagram of an embodiment of the adversarial domain adaptation network for image classification according to the present invention;

FIG. 4 is the results of the model analysis of the present invention, wherein (a) the parametric sensitivity of λ is given, (b) the test errors in different iterations are given, (c) the total loss value during training is given, and (d) the distribution difference of different features is given;

FIG. 5 shows t-SNE visualizations of the learned representations and the results of the ablation study, where (a), (b) and (c) visualize the original (non-adapted) representation, the CDAN representation and the representation of the invention, respectively, the numbers near each cluster are the corresponding category labels, and (d) gives the results of the ablation study, where "w/o" is an abbreviation for "without" and "adv" is an abbreviation for adversarial learning;

FIG. 6 shows qualitative image classification results.

Detailed Description

Specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the subject matter of the invention.

Since ADAN is similar to GANs, it also has a generalization problem. To this end, the invention reformulates ADAN so that it is suitable for image classification. The reformulation comes at a cost, however: optimizing only the new ADAN adversarial loss function L_adv cannot guarantee that the source-domain and target-domain distributions are close. The invention therefore further aligns the two domains by minimizing an additional divergence. Finally, a supervised classification loss on the source domain is used to ensure that the learned image features are discriminative.

FIG. 2 is a flowchart of an embodiment of the image classification method based on an adversarial domain adaptation network.

In this embodiment, as shown in FIG. 2, the image classification method based on an adversarial domain adaptation network of the present invention includes the following steps:

Step S1: Constructing an adversarial domain adaptation network for image classification

As shown in FIG. 3, a feature representation network F, a domain discriminator D and a classifier C are constructed for the images in the source domain and the target domain. The feature representation network F performs feature extraction on an image x_s in the source domain to obtain the source-domain image feature F(x_s), and on an image x_t in the target domain to obtain the target-domain image feature F(x_t). The domain discriminator D discriminates the source-domain image feature F(x_s) and the target-domain image feature F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain. The classifier C classifies the source-domain image feature F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, …, I, and classifies the target-domain image feature F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, …, I.

The feature representation network F, the domain discriminator D and the classifier C in FIG. 3 are all conventional image feature extraction networks, discriminators and classifiers. For example, the feature representation network F may be the common residual neural network ResNet-50. They are not described further here.

Step S2: Training the adversarial domain adaptation network

Step S2.1: Set the batch size to M and extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}; set the batch size to N and extract a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}. Input them respectively into the feature representation network F to obtain M source-domain image features:

F(x_{s_1}),F(x_{s_2}),...,F(x_{s_M})；

and N target domain image features:

F(x_{t_1}),F(x_{t_2}),...,F(x_{t_N})。

The M source-domain image features F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M}) and the N target-domain image features F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N}) are respectively input into the domain discriminator D to obtain the probabilities of belonging to the source domain:

D(F(x_{s_1})),D(F(x_{s_2})),...,D(F(x_{s_M}))

D(F(x_{t_1}))，D(F(x_{t_2})),...,D(F(x_{t_N}))

where i is the index of the image class and I is the number of image classes.

Step S2.2: Calculate the overall training objective function L:

L = min_F max_D L_adv + λ min_F L_metric + β min_C L_cls

wherein:

where min_F max_D L_adv means that the network parameters of the feature representation network F are updated to minimize the adversarial loss function L_adv while the network parameters of the domain discriminator D are updated to maximize L_adv, which forms adversarial training; min_F L_metric means that the network parameters of the feature representation network F are updated to minimize the metric loss function L_metric; min_C L_cls means that the network parameters of the classifier C are updated to minimize the cross-entropy loss function L_cls; and λ and β are balance parameters greater than 0.

The indicator function equals 1 when i = y_{s_m} and 0 otherwise, y_{s_m} is the true class (label value) of the source-domain image x_{s_m}, and p_i(softmax(F(x_{s_m}))) is the probability, output by the classifier, that the source-domain image feature F(x_{s_m}) belongs to class i.
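The role of the indicator function in the classification loss can be illustrated with a short sketch. The exact expression of L_cls is not reproduced in the text above, so the code assumes the standard cross-entropy averaged over the source batch; `indicator`, `cross_entropy_loss` and the clamping constant `eps` are hypothetical names introduced only for illustration.

```python
import math

def indicator(i, true_class):
    """Equals 1 when class index i is the true class y_{s_m}, otherwise 0."""
    return 1.0 if i == true_class else 0.0

def cross_entropy_loss(class_probs, true_classes, eps=1e-12):
    """Assumed form of L_cls: mean over the batch of -sum_i 1[i = y] * log p_i.

    class_probs  -- one list of probabilities p_1..p_I per source image
    true_classes -- the true class (1..I) of each source image
    eps          -- assumption: clamp probabilities away from 0 before log
    """
    total = 0.0
    for probs, y in zip(class_probs, true_classes):
        for i, p in enumerate(probs, start=1):
            total -= indicator(i, y) * math.log(max(p, eps))
    return total / len(true_classes)

# A confident correct prediction yields a lower loss than a wrong one.
loss_good = cross_entropy_loss([[0.9, 0.1]], [1])
loss_bad = cross_entropy_loss([[0.1, 0.9]], [1])
```

Only the probability assigned to the true class contributes to the sum, which is what the indicator function expresses.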

To address the generalization problem of ADAN, and inspired by the F divergence, the invention provides a new adversarial loss function L_adv, namely:

The adversarial loss function L_adv is an F distance. Although it overcomes the generalization problem, it comes at a cost: the F distance may be small even if the source-domain distribution P and the target-domain distribution Q are not very close. Such a cost is unacceptable in domain adaptation, because it defeats the goal of domain adaptation, i.e. aligning the source-domain distribution P and the target-domain distribution Q. The invention therefore additionally optimizes a metric loss function that measures the divergence between the source-domain distribution P and the target-domain distribution Q, so that both good generalization and a small distribution divergence are ensured.

Step S3: image classification

Experimental verification

1. Introduction to data set

MNIST, USPS and SVHN are three widely used handwritten digit datasets. Specifically, the MNIST dataset consists of 60,000 training images and 10,000 test images, each of size 28 × 28. USPS contains 7,291 training images of size 16 × 16 and 2,007 test images of the same size. SVHN is an abbreviation for Street View House Numbers; it contains more than 600,000 labeled digits cropped from street view images and is more challenging because the backgrounds differ from image to image. Each digit dataset has 10 categories, i.e. the digits 0 to 9. In this experiment, we use the same settings as the existing image classification methods in order to make a fair comparison.

The Office-31 dataset includes 4,652 images of 31 categories that are common in office settings, such as calculators, projectors and printers. The images come from three different subsets, namely Amazon (A), DSLR (D) and Webcam (W), which were downloaded from amazon.com, taken by a digital single-lens reflex camera and taken by a webcam, respectively.

Office-Home is a recently released large-scale dataset consisting of 15,500 images of 65 categories in office and home environments, such as knives, pens and keyboards. The images in this dataset come from 4 domains, namely Art (Ar), Clipart (Cl), Product (Pr) and Real World (RW). The domain name reflects the style of the images. Specifically, Art includes paintings, sketches and/or artistic depictions; Clipart consists of clip art images; Product consists of images without background; and Real World consists of images taken by a camera.

2. Experimental setup

2.1), protocol

For a fair comparison, we fully follow the unsupervised domain adaptation protocol widely used by existing image classification methods. Specifically, we use labeled source-domain images and unlabeled target-domain images. The reported metric is the classification accuracy on the target domain. The parameter λ is chosen by importance-weighted cross-validation on the source domain, and β is fixed to 1.

2.2) Implementation details

Our implementation is based on PyTorch. The main parts of the implementation are the feature representation network and the domain discriminator. For digit recognition, we use the same feature network reported by Hoffman et al., which is very similar to the convolutional neural network LeNet. For the evaluations on Office-31 and Office-Home, we use the residual network ResNet-50 as the feature representation network, the same as the existing image classification methods. As shown in FIG. 3, the discriminator is implemented as an MLP (multi-layer perceptron) with three fully connected layers. The classifier is implemented as one FC (fully connected) layer followed by a softmax classifier. The ResNet-50 used is pre-trained on the ImageNet dataset. The other networks, such as the discriminator, the classifier and the feature representation network for digit recognition, are trained from scratch on the evaluated dataset. We use mini-batch stochastic gradient descent with a momentum of 0.95 for optimization. The learning rate is updated dynamically according to the strategy reported in (Ganin and Lempitsky 2014).
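The learning-rate strategy cited above anneals the rate as training progresses. A minimal sketch follows, assuming the schedule lr = lr_0 / (1 + a·p)^b with p the training progress in [0, 1]. The default constants are the ones we understand that reference to report; the text above does not state which values were actually used, so they are assumptions, and `annealed_learning_rate` is a hypothetical name.

```python
def annealed_learning_rate(progress, base_lr=0.01, a=10.0, b=0.75):
    """Annealed learning rate for a training progress value in [0, 1].

    progress -- fraction of training completed (0 = start, 1 = end)
    base_lr  -- initial learning rate lr_0 (assumed value)
    a, b     -- decay constants (assumed values)
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return base_lr / (1.0 + a * progress) ** b

# Learning rates at the start, middle and end of training.
schedule = [annealed_learning_rate(p) for p in (0.0, 0.5, 1.0)]
```

The rate starts at `base_lr` and decreases monotonically, which tends to stabilize adversarial training late in the run.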

2.3) Existing image classification methods for comparison

To fully validate the effectiveness and superiority of the present invention, we compare it with the baseline ResNet-50 without adaptation and with several recent image classification methods. Specifically, we compare with two conventional two-step methods, GFK and TCA, applied to the latest feature representations. Since the method of the present invention can be regarded both as an ADAN-based method and as a metric-learning-based method, we compare it with the metric-learning-based method DDC and with the ADAN methods RevGrad, MCD, CyCADA and CDAN. We also compare with typical image classification methods from other protocols, such as DRCN, CoGAN and UNIT. Since our implementation follows the standard unsupervised domain adaptation protocol and uses the same setup as the existing image classification methods, the baseline results are cited directly from the original papers.

3. Image classification result comparison

3.1) Digit recognition

First, we evaluate our approach on a relatively simple task: handwritten digit recognition. The digit recognition results are shown in Table 1 and are expressed as percentages.

TABLE 1

In Table 1, the best digit recognition results are highlighted in bold. All image classification methods use the same experimental setup, except that UNIT uses a larger training set on SVHN → MNIST. The results of the compared methods are cited from the corresponding literature; a result is missing where the authors did not run that test. ORACLE denotes the result of a supervised model trained on the target domain.

Table 1 shows the digit recognition results on MNIST → USPS, USPS → MNIST and SVHN → MNIST. As Table 1 shows, domain adaptation is effective for handling domain differences. On USPS → MNIST, the baseline without adaptation only reaches around 70%, whereas domain adaptation methods can achieve 98% accuracy, which is very close to the accuracy of a model trained directly on the target domain. This shows that for relatively simple domain adaptation tasks, the current state-of-the-art approaches provide a practical and near-ideal solution. The method of the invention achieves the best digit recognition results on all three evaluations. The improvements on MNIST → USPS and USPS → MNIST are relatively small because the results are already very close to the upper bound. On SVHN → MNIST, the most challenging of the three evaluations, the results clearly demonstrate that the present invention is much better than the existing methods. It is worth noting that the present invention outperforms CDAN by 4.4%, even though both fall under the ADAN framework. This shows that the method has better generalization ability and that its learned representation is more transferable. The digit recognition results also confirm that solving the generalization problem in ADAN is reasonable and valuable.

3.2) object recognition

The object recognition results for the Office-31 data set are shown in table 2, with the object recognition results being expressed in percentage.

TABLE 2

In Table 2, the best object recognition results are highlighted in bold. Since there are three domains in the dataset, we performed 3 × 2 = 6 evaluations in a pairwise fashion. A simple conclusion is that the method of the invention achieves the best performance in all 6 evaluations. Looking deeper, the adversarial approaches generally perform better than the metric-based approaches. The 6 evaluations also differ clearly in difficulty. When D and W are used as target domains, most methods achieve good results. When A is used as the target domain, however, the accuracy can drop significantly. The object recognition results of W → A and D → A therefore reflect the ability to handle more challenging domain adaptation tasks. The present invention is clearly superior to the current state-of-the-art methods in both of these evaluations: the performance improvements on W → A and D → A are 4.2% and 2.7%, respectively. Finally, regarding the average improvement, note that most compared methods already reach nearly 100% accuracy on W → D and D → W, so the averaging operation dilutes the gain.
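The count of pairwise evaluations follows from taking every ordered (source, target) pair of distinct domains. A small sketch, using the domain abbreviations from the dataset description; `transfer_tasks` is a hypothetical helper name.

```python
from itertools import permutations

def transfer_tasks(domains):
    """List every ordered source -> target pair of distinct domains."""
    return [f"{source} -> {target}" for source, target in permutations(domains, 2)]

# Office-31 has three domains, giving 3 x 2 = 6 tasks.
office31_tasks = transfer_tasks(["A", "D", "W"])

# Office-Home has four domains, giving 4 x 3 = 12 tasks.
office_home_tasks = transfer_tasks(["Ar", "Cl", "Pr", "RW"])
```

Order matters here because adapting from W to A is a different task from adapting from A to W.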

3.3) Large Scale object recognition

Finally, the method is tested on the recently released large-scale dataset Office-Home. The domain adaptation object recognition results on the Office-Home dataset are shown in Table 3 and are expressed as percentages.

TABLE 3

This dataset has 4 domains. Therefore, Table 3 shows the results of 4 × 3 = 12 cross-domain object recognition evaluations. These results also confirm that the present invention achieves the best object recognition results across the different evaluations. On average, the present invention improves on the best existing object recognition result by 2.4%. Notably, this mean is calculated over 12 evaluations, so such an improvement is difficult to obtain. The object recognition performance of the present invention relative to CDAN, on both the classical dataset and the large-scale dataset, shows that our proposal for dealing with the generalization problem in ADAN is reasonable and practical, and that it achieves state-of-the-art object recognition results.

4. Model analysis

Taking SVHN → MNIST as an example, we further analyze the method of the present invention by giving results of parameter sensitivity, convergence, distribution divergence, visualization and qualitative understanding.

4.1) parameter sensitivity

The invention involves two hyper-parameters, λ and β. β is fixed to 1 because this value performs well in the evaluations. FIG. 4(a) gives the digit recognition results on SVHN → MNIST for different values of λ. We observe that the result is relatively insensitive to λ when λ is small. The hyper-parameters may be tuned using importance-weighted cross-validation.

FIG. 4(b) and (c) reflect the convergence of our method, and FIG. 4(d) shows the difference in the distribution of the different features.

4.2) Convergence

It is widely accepted that adversarial models are difficult to train. We therefore present in FIG. 4(b) the test error of the method at different iterations on SVHN → MNIST, and in FIG. 4(c) the total loss during training. Both figures illustrate the convergence trend of the invention. It is readily observed that the test error and the total loss decrease almost monotonically as the number of optimization iterations increases. Training of the present invention is effective, stable and smooth.

4.3) distribution divergence

While optimizing neural network distance is beneficial for generalization, it comes at the cost that small neural network distances do not guarantee small distribution divergence. Therefore, we further optimize the divergence measure in the model.

Assessment was performed using SVHN → MNIST and W → A.

FIG. 4(d) reports the difference between the two domain distributions in terms of the maximum mean discrepancy (MMD). From FIG. 5(a), (b) and (c), we can also see that the distribution divergence of the representation learned by the present invention is the smallest, which demonstrates that the present invention deals well with the drawback of the neural network distance.
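The metric loss L_metric of step 2.2 has the same form as a common sample estimate of the squared maximum mean discrepancy with a Gaussian kernel. The sketch below computes it in plain Python for small lists of feature vectors. The kernel bandwidth `sigma` is an assumption, since its value is not stated, and `gaussian_kernel` and `metric_loss` are hypothetical names.

```python
import math

def gaussian_kernel(a, b, sigma=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)); sigma is an assumed bandwidth."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def metric_loss(source_feats, target_feats, sigma=1.0):
    """L_metric: mean kernel over distinct source pairs, plus mean kernel over
    distinct target pairs, minus twice the mean kernel over cross-domain pairs.
    Requires at least two features per domain."""
    M, N = len(source_feats), len(target_feats)
    within_source = sum(
        gaussian_kernel(source_feats[m], source_feats[k], sigma)
        for m in range(M) for k in range(M) if m != k
    ) / (M * (M - 1))
    within_target = sum(
        gaussian_kernel(target_feats[n], target_feats[k], sigma)
        for n in range(N) for k in range(N) if n != k
    ) / (N * (N - 1))
    cross = sum(
        gaussian_kernel(s, t, sigma) for s in source_feats for t in target_feats
    ) / (M * N)
    return within_source + within_target - 2.0 * cross

# Toy features: a target batch far from the source gives a larger loss
# than a target batch that coincides with the source.
source = [[0.0, 0.0], [1.0, 0.0]]
loss_far = metric_loss(source, [[10.0, 0.0], [11.0, 0.0]])
loss_same = metric_loss(source, source)
```

Because pairs with m = m′ are excluded, this estimate can be slightly negative on small batches even when the two batches coincide; it is the relative size across feature representations that is informative.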

4.4), feature visualization

For a better understanding, we use t-SNE to visualize the representation learned by the present invention and give the results in FIG. 5(a), (b) and (c), taking SVHN → MNIST as an example. The original representation and the CDAN representation are also visualized for comparison. We visualize the features of both the source and target domains to show transferability and discriminability. FIG. 5(a), (b) and (c) show that the present invention performs best: the two domains are well aligned and the different classes are clearly separated.

4.5), ablation study

The invention consists of an adversarial learning part and a metric learning part. We therefore present the image classification results of the present invention without adversarial learning and without metric learning, respectively, taking W → A as an example. The results are shown in FIG. 5(d). Good overall performance is obtained only when both parts are optimized jointly. The results show that the formulation of the invention is reasonable and practical.

4.6) Qualitative image classification results

In addition to the visualization results, we further examine samples that CDAN misclassifies but that our method handles correctly. The experimental results confirm the effectiveness of the method in dealing with the generalization problem.

FIG. 6 gives the qualitative image classification results. It shows 30 images randomly selected from the SVHN → MNIST evaluation, 3 for each category; the label under each image is the erroneous prediction (digit recognition result) of CDAN. These images are misclassified by CDAN but classified correctly by the present invention. The correct labels of the columns are 0 to 9 from left to right.

Due to space limitations, only some randomly selected images are given in FIG. 6. We can observe that some easily confused categories, such as handwritten 4 and 9, are misclassified by CDAN. The method of the invention, which considers both the generalization property and the distribution divergence, correctly recognizes the digit in all of these images and is more robust.

5. Conclusion

Existing image classification methods have paid little attention to the generalization problem in adversarial domain adaptation networks. The invention reformulates the conventional ADAN adversarial loss using the neural network distance, which has good generalization properties, and at the same time optimizes the distribution divergence to reduce the cost incurred by the neural network distance. Extensive experiments show that the method significantly improves accuracy compared with existing adversarial domain adaptation image classification methods.

Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art, and as long as such changes fall within the spirit and scope of the invention as defined by the appended claims, all inventions that make use of the inventive concept are protected.

Claims

1. An image classification method based on an adversarial domain adaptation network, characterized by comprising the following steps:

wherein the feature representation network F performs feature extraction on an image x_s in the source domain to obtain the source-domain image feature F(x_s), and on an image x_t in the target domain to obtain the target-domain image feature F(x_t); the domain discriminator D discriminates the source-domain image feature F(x_s) and the target-domain image feature F(x_t) to obtain the probabilities D(F(x_s)) and D(F(x_t)) of belonging to the source domain; and the classifier C classifies the source-domain image feature F(x_s) to obtain the image classification probabilities p_i(softmax(F(x_s))), i = 1, 2, …, I, and classifies the target-domain image feature F(x_t) to obtain the image classification probabilities p_i(softmax(F(x_t))), i = 1, 2, …, I;

(2) Training the adversarial domain adaptation network

2.1) Extract a batch of M images from the source domain, denoted x_{s_1}, x_{s_2}, ..., x_{s_M}, and a batch of N images from the target domain, denoted x_{t_1}, x_{t_2}, ..., x_{t_N}, and input them respectively into the feature representation network F to obtain M source-domain image features:

F(x_{s_1}),F(x_{s_2}),...,F(x_{s_M})；

and N target domain image features:

F(x_{t_1}),F(x_{t_2}),...，F(x_{t_N})；

The M source-domain image features F(x_{s_1}), F(x_{s_2}), ..., F(x_{s_M}) and the N target-domain image features F(x_{t_1}), F(x_{t_2}), ..., F(x_{t_N}) are respectively input into the domain discriminator D to obtain the probabilities of belonging to the source domain:

D(F(x_{s_1}))，D(F(x_{s_2})),...,D(F(x_{s_M}))

D(F(x_{t_1})),D(F(x_{t_2})),...,D(F(x_{t_N}))

p_i(softmax(F(x_{s_1}))),p_i(softmax(F(x_{s_2}))),...,p_i(softmax(F(x_{s_M}))),i＝1,2,...,I；

2.2) Calculate the overall training objective function L:

L = min_F max_D L_adv + λ min_F L_metric + β min_C L_cls

wherein:

L_metric = E{k(F(x_{s_m}), F(x_{s_m′}))}_{m=1,2,…,M, m′=1,2,…,M, m≠m′} + E{k(F(x_{t_n}), F(x_{t_n′}))}_{n=1,2,…,N, n′=1,2,…,N, n≠n′} − 2E{k(F(x_{s_m}), F(x_{t_n}))}_{m=1,2,…,M, n=1,2,…,N}

where min_F max_D L_adv means that the network parameters of the feature representation network F are updated to minimize the adversarial loss function L_adv while the network parameters of the domain discriminator D are updated to maximize L_adv, which forms adversarial training; min_F L_metric means that the network parameters of the feature representation network F are updated to minimize the metric loss function L_metric; min_C L_cls means that the network parameters of the classifier C are updated to minimize the cross-entropy loss function L_cls; and λ and β are balance parameters greater than 0;

(3) Image classification

2. The image classification method based on an adversarial domain adaptation network of claim 1, wherein the feature representation network F is a residual network ResNet-50.