CN113076994B - Open-set domain self-adaptive image classification method and system - Google Patents

Open-set domain self-adaptive image classification method and system

Info

Publication number
CN113076994B
CN113076994B (application CN202110349864.4A)
Authority
CN
China
Prior art keywords
domain
channel
source
samples
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110349864.4A
Other languages
Chinese (zh)
Other versions
CN113076994A (en)
Inventor
张庆亮 (Zhang Qingliang)
朱松豪 (Zhu Songhao)
梁志伟 (Liang Zhiwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110349864.4A priority Critical patent/CN113076994B/en
Publication of CN113076994A publication Critical patent/CN113076994A/en
Application granted granted Critical
Publication of CN113076994B publication Critical patent/CN113076994B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/24 — Classification techniques
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses an open-set domain self-adaptive image classification method and system in the technical field of image recognition, which use a channel attention mechanism so that the network acquires domain-invariant features better, the features transfer more easily, and the model becomes easier to train. Labeled samples acquired from a source domain and unlabeled samples acquired from a target domain are respectively input into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; the weighted multi-channel feature map is sent into a label classifier, which divides the labeled samples into K known classes and divides the unlabeled samples into the K known classes visible in the source domain and an unknown class invisible in the source domain; the known categories from the source domain and the target domain are sent into a domain discriminator, and domain-invariant feature extraction is strengthened based on a generative adversarial network; based on covariance matching, the inter-domain difference between the source domain and the target domain is narrowed.

Description

Open-set domain self-adaptive image classification method and system
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to an open set domain self-adaptive image classification method and system.
Background
In recent years, with the development of deep learning, computer vision has received wide attention and made great progress. Image classification is a classic problem in computer vision and is widely applied in daily production and life, for example in medical image recognition, face recognition, license plate recognition, and remote sensing image classification. Traditional deep models are mainly obtained through extensive learning on data from a specific scene. However, building such a well-performing deep neural network usually requires a large amount of labeled training data, which is difficult to obtain in fields where annotation is scarce and requires strong expertise, such as medical imaging.
Closed-set domain adaptation is the most basic and most studied setting; its core assumption is that the source domain and the target domain share exactly the same categories. The performance of these methods degrades greatly once the target domain contains classes that are not present in the source domain. Existing open-set domain adaptation algorithms also have problems, such as insufficient alignment of source-domain and target-domain features and easily coupled training; in addition, they suffer from long model training time and difficult convergence.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an open set domain self-adaptive image classification method and system, which use a channel attention mechanism so that the network acquires domain-invariant features better, facilitating the transfer of the features and at the same time making the model easier to train.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
In a first aspect, an open set domain self-adaptive image classification method is provided, in which an acquired image is input into a trained open set domain self-adaptive image classification model to obtain the image category.
Further, the training method of the open set domain self-adaptive image classification model comprises the following steps: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in the source domain and an unknown class which is invisible in the source domain; sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on a generative adversarial network; and, based on covariance matching, narrowing the inter-domain difference between the source domain and the target domain.
Further, the step of respectively inputting the labeled samples obtained from the source domain and the unlabeled samples obtained from the target domain into a feature extractor based on a channel attention module to obtain a weighted multi-channel feature map includes: inputting the labeled samples and the unlabeled samples into a convolutional neural network, and obtaining an original multi-channel feature map through a convolution operation; inputting the original multi-channel feature map into the channel attention module, and obtaining channel information through maximum pooling, average pooling, and self-adjusting pooling; inputting the channel information into an autonomous learning layer, and acquiring the weight of each channel through a Sigmoid operation:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein ω represents the channel weights, Sig represents the Sigmoid function, Conv1D represents a one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling;
based on the weight of each channel, obtaining a weighted multi-channel feature map y' as follows:
y'=ω*y (1-1)。
further, the self-adjusting pooling transforms the two-dimensional feature map into a 1 × 1 feature using two-dimensional convolution and the ReLU activation function.
Further, in the process of dividing the labeled samples into K known classes, the classification loss function is:

L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples.
Further, in the process of dividing the unlabeled samples into K known classes visible in the source domain and one unknown class invisible in the source domain, a binary cross-entropy loss is used:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class.
Further, the sending of the known categories from the source domain and the target domain into the domain discriminator and the strengthening of domain-invariant feature extraction based on the generative adversarial network comprise: obtaining the probability of the target sample x_t being predicted into each category; if the category with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known category and the probability value is greater than a threshold P, marking the sample as a known category in the target domain, the corresponding formula being:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents a known class in the target domain, and x_unkt represents an unknown class in the target domain; the threshold P is dynamically varied based on the relative entropy of the known-class probabilities of the target-domain samples:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class;
inputting the samples of the source domain and the samples of the target domain identified as known classes into a feature generator G, and inputting the generated features into a domain label discriminator C_d; to obtain domain-invariant features, the feature generator G maximizes the domain discrimination error L_d, while the domain label discriminator C_d minimizes L_d, the formulas being:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, 1 or 0;

updating the network parameters θ_g of the feature generator and θ_d of the domain label discriminator based on the gradient inversion layer and equation (5).
Further, the reducing of the inter-domain difference between the source domain and the target domain based on covariance matching includes: obtaining the covariance matrices of the source domain and the target domain respectively:

Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples;

using the Frobenius norm as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T and d represents the network output dimension.
In a second aspect, an open set domain self-adaptive image classification system is provided, which includes a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
(1) according to the method, a channel attention mechanism is utilized, so that the network can better acquire domain invariant features, the features are convenient to migrate, and the features are easier to train;
(2) the present invention uses a domain discriminator that combines features with image label information, so as to capture the multi-modal data structure of each domain;
(3) the distribution is matched by aligning second-order statistics (namely covariance) output by the network, so that the difference between the source domain and the target domain is further reduced, and the inter-domain migration is better realized.
Drawings
FIG. 1 is a block diagram of an open set domain adaptive image classification method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of an attention mechanism in an embodiment of the present invention;
FIG. 3 is a diagram illustrating the confrontation training of the domain discriminator according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The first embodiment is as follows:
As shown in FIG. 1 to FIG. 3, an open set domain self-adaptive image classification method inputs an acquired image into a trained open set domain self-adaptive image classification model to obtain the image category.
The training method of the open set domain self-adaptive image classification model comprises the following steps: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in the source domain and an unknown class which is invisible in the source domain; sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on the generative adversarial network; and, based on covariance matching, narrowing the inter-domain difference between the source domain and the target domain.
First step, channel attention-enhancing features: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; inputting labeled samples and unlabeled samples into a convolutional neural network, and obtaining an original multichannel characteristic diagram through convolution operation; inputting the original multi-channel feature map into a channel attention module, and obtaining channel information through maximum pooling, average pooling and self-adjusting pooling; inputting channel information into an autonomous learning layer, and obtaining the weight of each channel through Sigmoid operation; and obtaining the weighted multi-channel feature map based on the weight of each channel.
In the open-set domain adaptation setting, n_s labeled samples belonging to K categories are obtained from the source domain D_s = {(x_si, y_si)}; at the same time, n_t unlabeled samples belonging to K+1 categories (the extra one being the unknown category) are obtained from the target domain D_t = {x_ti}. Here x_si and x_ti respectively represent the i-th labeled image in the source domain and the i-th unlabeled image in the target domain, and y_si is the label information corresponding to the source-domain image x_si. The classes of the target domain D_t include classes that do not appear in the source domain D_s. The present application refers to the classes that occur in both the source domain and the target domain as "known classes"; classes that appear only in the target domain and are invisible in the source domain are called "unknown classes".
The samples first enter a feature extractor consisting of a Convolutional Neural Network (CNN) and a fully connected layer. In the process of extracting the sample features, the embodiment adds a multi-information self-adaptive pooling channel attention module in the network. After the features of the image are extracted through the convolutional network, the weights of all feature channels can be automatically learned through the channel attention module, and then all the feature channels are weighted, so that key category information is strengthened, and interference information irrelevant to categories is suppressed.
The channel attention module proposed by this embodiment combines self-adjusting pooling with maximum pooling and average pooling to obtain more information. The samples yield multi-channel feature maps after the convolution operation, and these feature maps are sent into the channel attention module. First, the channel attention module pools the multi-channel feature map. To obtain more valid information, this embodiment divides the pooling operation into three parts: maximum pooling is used to extract key features, and average pooling is used to extract global features. In addition, we add self-adjusting pooling for extracting detail features; it transforms a two-dimensional feature map into a 1 × 1 feature using two-dimensional convolution and the ReLU activation function.
The channel information acquired through the pooling operations is sent to an autonomous learning layer, which adaptively acquires the weight of each channel. Here, this embodiment does not use a conventional fully connected layer but a one-dimensional convolutional layer to learn the weights automatically. This greatly reduces the network parameters and the complexity of the model. Meanwhile, thanks to the small number of parameters of the one-dimensional convolutional layers, the three pooled features can each be trained and weighted independently, avoiding negative interference.
Finally, the values obtained by the three one-dimensional convolutions are added, and channel weights between 0 and 1 are generated through the Sigmoid operation; the formula is:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein, ω represents the channel weight, Sig represents Sigmoid function, Conv1D represents one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling.
The weighted multi-channel feature map y' is:
y'=ω*y (1-1)。
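For illustration, the following is a minimal PyTorch sketch of such a channel attention module. The module name ChannelAttention, the kernel size k, and the exact form of the self-adjusting pooling (here a depthwise two-dimensional convolution over the full H × W map followed by ReLU, assuming a fixed spatial size) are illustrative assumptions, not the patented implementation itself.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Multi-information channel attention of equation (1): max, average
    # and self-adjusting pooling branches, each passed through its own
    # one-dimensional convolution, summed and squashed by a Sigmoid.
    def __init__(self, channels: int, spatial: int, k: int = 3):
        super().__init__()
        self.conv_avg = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.conv_max = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.conv_ada = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        # Assumed form of self-adjusting pooling: 2-D convolution + ReLU
        # collapsing each channel's spatial map to a single value.
        self.ada_pool = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=spatial, groups=channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = y.shape
        avg = torch.mean(y, dim=(2, 3)).view(b, 1, c)   # AVGPool(y)
        mx = torch.amax(y, dim=(2, 3)).view(b, 1, c)    # MAXPool(y)
        ada = self.ada_pool(y).view(b, 1, c)            # AdaPool(y)
        w = torch.sigmoid(self.conv_avg(avg) + self.conv_max(mx)
                          + self.conv_ada(ada)).view(b, c, 1, 1)
        return w * y                                    # y' = ω * y, eq. (1-1)

Because each pooling branch has its own Conv1d, the three pooled statistics are trained and weighted independently, as described above.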
the second step is that: identifying unknown classes: and feeding the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in a source domain and an unknown class which is not visible in the source domain.
After the feature extractor of the first step, valuable features are obtained. These features are fed into the label predictor, which classifies samples into (K+1) classes, where K represents the number of known classes and the (K+1)-th class represents the unknown class existing only in the target domain; the prediction is p(x_si|y_si) = C_y{G(θ_g, x_si), θ_c}, where θ_g represents the parameters of the feature generator G and θ_c the parameters of the classifier C_y.
First, the labeled samples in the source domain are correctly classified, at which point the classification loss function L_S on the source domain is minimal; the corresponding formula is:

L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples. Since the training samples come from the source domain, the trained model can identify the K known categories. Furthermore, since the source-domain sample classes are all known, the probability of the (K+1)-th class should be zero for them.
Thereafter, a boundary is established between the known classes and the unknown class in the target domain. The probability of the unknown class is set to β for training the classifier, and the performance of the classifier is improved by training the feature generator. The final output can be regarded as a binary classification task: C_y outputs the summed probability of the K known classes on one side and the probability of the (K+1)-th, unknown, class on the other. The two play against each other through adversarial training. Specifically, the generator increases the classifier's error by pushing the probability of the unknown class away from β, while the classifier tries to keep the output probability of the (K+1)-th class close to β. If the generator chooses to increase the unknown-class probability above β, the sample is identified as an unknown class; otherwise it is identified as a known class. Through the source-domain training of the previous step the network already has some discriminative ability, and after multiple iterations the unknown classes can be correctly identified. L_adv therefore uses the binary cross-entropy loss:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class; here β is set to 0.5 in order to separate the unknown-class samples in the target domain.
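A minimal sketch of this loss in PyTorch follows; the function name adv_unknown_loss is illustrative, and the classifier is assumed to output K+1 logits with the unknown class last.

import torch
import torch.nn.functional as F

def adv_unknown_loss(logits_t: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    # Binary cross-entropy between p(y = K+1 | x_t) and the fixed
    # margin beta, i.e. equation (3); logits_t has shape (n, K+1).
    p_unk = F.softmax(logits_t, dim=1)[:, -1]
    p_unk = p_unk.clamp(1e-6, 1 - 1e-6)          # numerical safety
    return (-beta * torch.log(p_unk)
            - (1 - beta) * torch.log(1 - p_unk)).mean()

The classifier minimizes this loss, while the feature generator receives its gradient with flipped sign (for instance through the gradient inversion layer described in the third step), which realizes the adversarial game around β.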
Third step, extracting domain-invariant features with an adversarial domain discriminator: the known classes from the source domain and the target domain are fed into the domain discriminator, and domain-invariant feature extraction is strengthened based on the generative adversarial network.
Since the distribution of sample features differs between domains, extracting features common to the two domains becomes crucial; it is the key to realizing domain adaptation. This embodiment uses the idea of adversarial learning, embeds a domain discriminator into the deep network, and realizes inter-domain transfer by learning domain-invariant features. In a generative adversarial network, the discriminator is trained to recognize whether an input sample is real, while the generator tries to fool the discriminator with generated fake samples. For the domain adaptation problem, images of two different domains naturally exist, so the step of generating fake samples is omitted: the discriminator distinguishes whether a picture comes from the source domain or the target domain, while the generator performs feature extraction. The final purpose is for the feature extractor to obtain domain-invariant features that the domain discriminator cannot reliably tell apart, thereby realizing inter-domain transfer. Some conventional adversarial domain adaptation methods underutilize image label information, align features only, and may fail to capture multi-modal data structures. This embodiment therefore further aligns the image classes through a joint distribution of images and features, using a multilinear map to capture the multi-modal structure of the samples:

h = T(f, c) = f ⊗ c

wherein f represents the sample feature extracted by the feature extractor and c represents the sample label or pseudo label (the classifier's prediction); their multilinear map yields a joint multi-modal structure of features and labels. Because the classifier output c contains latent discriminative information, aligning the domain features f jointly with c helps the network capture the multi-modal structure of the samples.
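For illustration, a minimal PyTorch sketch of this multilinear map (the function name is an assumption):

import torch

def multilinear_map(f: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    # Joint feature-label input for the domain discriminator: the outer
    # product of features f (n, d) and class predictions c (n, K+1),
    # flattened to shape (n, d * (K+1)).
    return torch.bmm(c.unsqueeze(2), f.unsqueeze(1)).flatten(1)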
If the unknown classes in the target domain are not considered and domain-invariant features are extracted directly, forcing the feature distributions of the two domains to align will cause a mismatch between known and unknown classes, degrade the performance of the model, and produce negative transfer. Therefore, the target samples are screened to remove the unknown class. Specifically, the probability of the target sample x_t being predicted into each of the K+1 classes is obtained first. If the class with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known class and its probability value is greater than the threshold P, the sample is marked as a known class in the target domain; the corresponding formula is:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents the known classes in the target domain, and x_unkt represents the unknown classes in the target domain. The threshold P changes dynamically, based on the relative entropy of the known-class probabilities of the target-domain samples, adapting to changes in the data set and the model:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class; -log p(ŷ_t|x_t) is the relative entropy between the one-hot pseudo label and the predicted distribution over the K known classes.
In the early stage of training, because the classification performance of the model is weak and the certainty of sample identification is low, the relative entropy of the known-class probability in the target domain is large, and the requirement for being identified as a known class is high. As the model is continuously optimized, the accuracy and certainty of prediction increase, the relative entropy of the known-class probability in the target domain decreases, and more target-domain samples participate in the training of the domain discriminator. This threshold is iteratively optimized throughout the training process.
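A minimal PyTorch sketch of this screening step follows; the function name, the batch-mean threshold, and the -log p reading of the relative entropy are illustrative assumptions.

import torch
import torch.nn.functional as F

def select_known_targets(logits_t: torch.Tensor, K: int):
    # Pseudo-label target samples (equation (4)) and keep those whose
    # top class is known and whose probability exceeds the dynamic
    # threshold P of equation (4-1).
    probs = F.softmax(logits_t, dim=1)       # (n, K+1)
    conf, pseudo = probs.max(dim=1)          # pseudo label and its probability
    known = pseudo < K                       # indices 0..K-1 are the known classes
    if known.any():
        # P = Sig(mean of -log p over known-class pseudo labels), assumed form
        P = torch.sigmoid((-torch.log(conf[known].clamp_min(1e-6))).mean())
        keep = known & (conf > P)
    else:
        keep = known                         # no known-class candidates in batch
    return keep, pseudo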
Then the samples of the source domain and the target-domain samples identified as known classes are input into the feature generator G, and the generated features are input into the domain label discriminator C_d. To obtain domain-invariant features, the feature generator G needs to maximize the domain discrimination error L_d, while the domain discriminator C_d minimizes L_d; the formulas are:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, y_d = 1 or 0.

The network parameters θ_g of the feature generator and θ_d of the domain label discriminator are updated based on the gradient inversion layer and equation (5): the last two formulas in (5) update θ_g and θ_d respectively. For efficient computation of the gradients and updating of the parameters, a gradient inversion layer is used: during back-propagation, the loss gradient of the domain discriminator is automatically negated before being propagated back into the parameters of the feature extractor, while forward propagation is unaffected, thereby realizing the adversarial training.
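A standard implementation of such a gradient inversion layer in PyTorch is sketched below (names are illustrative; the mechanism itself is the one described above).

import torch

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; multiplies the incoming gradient by
    # -lam in the backward pass, so the feature extractor maximizes the
    # domain discrimination error that the discriminator minimizes.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient w.r.t. lam

def grad_reverse(x: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    return GradReverse.apply(x, lam)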
The fourth step: reducing inter-domain differences between source and target domains based on covariance matching
Samples from different domains usually have different covariance distributions, and the covariance matrix represents the correlations among the feature variables. Therefore, this embodiment matches the distributions by aligning the second-order statistics (i.e. covariance) of the network outputs, further reducing the difference between the source domain and the target domain and better realizing inter-domain transfer.
Firstly, the covariance matrices of the two domains are obtained respectively:

Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples.
Secondly, the Frobenius norm is used as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T (defined as the square root of the sum of squares of the matrix elements) and d represents the network output dimension. Minimizing this loss reduces the second-order statistical difference between the two domains and thus the inter-domain difference, realizing inter-domain transfer.
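A minimal sketch of this covariance-matching loss in PyTorch (the 1/(4d^2) scaling and the function name follow the deep-CORAL convention and are assumptions consistent with equations (6)-(8)):

import torch

def coral_loss(fs: torch.Tensor, ft: torch.Tensor) -> torch.Tensor:
    # Squared Frobenius distance between source and target feature
    # covariances, equations (6)-(8); fs is (n_S, d), ft is (n_T, d).
    d = fs.size(1)
    cs = fs - fs.mean(dim=0, keepdim=True)   # centre source features
    ct = ft - ft.mean(dim=0, keepdim=True)   # centre target features
    cov_s = cs.t() @ cs / (fs.size(0) - 1)   # Cov_S
    cov_t = ct.t() @ ct / (ft.size(0) - 1)   # Cov_T
    return ((cov_s - cov_t) ** 2).sum() / (4 * d * d)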
During the training process, this loss is optimized together with the classification loss in equation (2). Reducing the inter-domain difference avoids overfitting of the model on the source-domain data set, while the classification loss keeps the learned features strongly discriminative, which facilitates the training of the classifier.
The final loss function of the network can be expressed as:

L = L_S(x_i, y_i) + L_adv(x_t) + L_cov + L_d(x_i, y_d)   (13)

and the training objective for the network parameters can be written as the minimax problem, consistent with equation (5):

(θ̂_g, θ̂_c) = argmin_{θ_g, θ_c} [L_S + L_adv + L_cov - L_d],  θ̂_d = argmin_{θ_d} L_d   (14)
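Putting the pieces together, one schematic training step for equation (13) might look as follows; it reuses the sketches above (adv_unknown_loss, select_known_targets, multilinear_map, grad_reverse, coral_loss), and the single shared optimizer, the detached class predictions, and all names are illustrative assumptions rather than the patented implementation.

import torch
import torch.nn.functional as F

def train_step(G, C_y, C_d, optimizer, xs, ys, xt, K, lam=1.0):
    fs, ft = G(xs), G(xt)
    logit_s, logit_t = C_y(fs), C_y(ft)
    L_s = F.cross_entropy(logit_s, ys)                    # eq. (2)
    # eq. (3); the generator-side sign flip for L_adv is elided here
    L_adv = adv_unknown_loss(logit_t)
    L_cov = coral_loss(fs, ft)                            # eq. (8)
    keep, _ = select_known_targets(logit_t, K)            # eq. (4)
    f_all = torch.cat([fs, ft[keep]])
    c_all = torch.cat([F.softmax(logit_s, dim=1),
                       F.softmax(logit_t, dim=1)[keep]]).detach()
    y_d = torch.cat([torch.ones(fs.size(0), device=fs.device),
                     torch.zeros(int(keep.sum()), device=fs.device)])
    h = grad_reverse(multilinear_map(f_all, c_all), lam)  # adversarial via GRL
    L_d = F.binary_cross_entropy_with_logits(C_d(h).squeeze(1), y_d)
    loss = L_s + L_adv + L_cov + L_d                      # eq. (13)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)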
in the embodiment, a channel attention mechanism is utilized, so that the network can better acquire domain invariant features, the migration of the features is facilitated, and the features are easier to train; capturing a multi-modal data structure of a domain using a domain discriminator that combines images and features, utilizing image tag information; the distribution is matched by aligning second-order statistics (namely covariance) output by the network, so that the difference between the source domain and the target domain is further reduced, and the inter-domain migration is better realized.
Example two:
Based on the open set domain self-adaptive image classification method of the first embodiment, this embodiment provides an open set domain self-adaptive image classification system, which includes a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method of the first embodiment.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (7)

1. An open set domain self-adaptive image classification method is characterized in that collected images are input into a trained open set domain self-adaptive image classification model to obtain image classes;
the training method of the open set domain self-adaptive image classification model comprises the following steps:
respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map;
sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in a source domain and an unknown class which is invisible in the source domain;
sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on a generative adversarial network;
reducing inter-domain differences between the source domain and the target domain based on covariance matching;
the sending of the known categories from the source domain and the target domain into the domain discriminator and the strengthening of domain-invariant feature extraction based on the generative adversarial network comprise the following steps:
obtaining the probability of the target sample x_t being predicted into each category; if the category with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known category and the probability value is greater than a threshold P, marking the sample as a known category in the target domain, the corresponding formula being:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents a known class in the target domain, and x_unkt represents an unknown class in the target domain; the threshold P is dynamically varied based on the relative entropy of the known-class probabilities of the target-domain samples:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class;
inputting the samples of the source domain and the samples of the target domain identified as known classes into a feature generator G, and inputting the generated features into a domain label discriminator C_d; to obtain domain-invariant features, the feature generator G maximizes the domain discrimination error L_d, while the domain label discriminator C_d minimizes L_d; the formulas are:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, 1 or 0;

updating the network parameters θ_g of the feature generator and θ_d of the domain label discriminator based on the gradient inversion layer and equation (5).
2. The open set domain adaptive image classification method of claim 1, wherein the step of inputting the labeled samples obtained from the source domain and the unlabeled samples obtained from the target domain into the feature extractor based on the channel attention module to obtain the weighted multi-channel feature map comprises:
inputting the labeled samples and the unlabeled samples into a convolutional neural network, and performing a convolution operation to obtain an original multi-channel feature map;
inputting the original multi-channel feature map into a channel attention module, and obtaining channel information through maximum pooling, average pooling and self-adjusting pooling;
inputting channel information into an autonomous learning layer, and acquiring the weight of each channel through Sigmoid operation:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein ω represents the channel weights, Sig represents the Sigmoid function, Conv1D represents a one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling;
based on the weight of each channel, obtaining a weighted multi-channel feature map y' as follows:
y'=ω*y (1-1)。
3. the open-domain adaptive image classification method according to claim 2, characterized in that the self-adjusting pooling transforms a two-dimensional feature map into a 1 x 1 feature using two-dimensional convolution and a ReLU activation function.
4. The open-domain adaptive image classification method according to claim 1, characterized in that in the process of classifying the labeled samples into K known classes, the classification loss function is:
L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples.
5. The open set domain adaptive image classification method of claim 1, characterized in that in the process of dividing the unlabeled samples into K known classes visible in the source domain and one unknown class invisible in the source domain, a binary cross-entropy loss is used:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class.
6. The open-set domain adaptive image classification method of claim 1, wherein the reducing inter-domain differences between the source domain and the target domain based on covariance matching comprises:
covariance matrices of a source domain and a target domain are respectively obtained:
Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples;

using the Frobenius norm as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T and d represents the network output dimension.
7. An open domain adaptive image classification system comprising a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method according to any one of claims 1 to 6.
CN202110349864.4A 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system Active CN113076994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110349864.4A CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110349864.4A CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Publications (2)

Publication Number Publication Date
CN113076994A CN113076994A (en) 2021-07-06
CN113076994B (en) 2022-08-05

Family

ID=76614188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110349864.4A Active CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Country Status (1)

Country Link
CN (1) CN113076994B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673570A (en) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 Training method, device and equipment for electronic device picture classification model
CN113283404B (en) * 2021-07-22 2021-11-30 新石器慧通(北京)科技有限公司 Pedestrian attribute identification method and device, electronic equipment and storage medium
CN113807243B (en) * 2021-09-16 2023-12-05 上海交通大学 Water obstacle detection system and method based on attention to unknown target
CN114581334A (en) * 2022-03-17 2022-06-03 湖南大学 Self-adjusting text image generation method based on generation of confrontation network
CN114781554B (en) * 2022-06-21 2022-09-20 华东交通大学 Open set identification method and system based on small sample condition
CN115035463B (en) * 2022-08-09 2023-01-17 阿里巴巴(中国)有限公司 Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN115392326B (en) * 2022-10-27 2024-03-19 中国人民解放军国防科技大学 Modulation identification method based on joint multi-modal information and domain countermeasure neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750665A (en) * 2019-10-12 2020-02-04 南京邮电大学 Open set domain adaptation method and system based on entropy minimization
CN112307914A (en) * 2020-10-20 2021-02-02 西北工业大学 Open domain image content identification method based on text information guidance


Also Published As

Publication number Publication date
CN113076994A (en) 2021-07-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant