CN113076994B - Open-set domain self-adaptive image classification method and system - Google Patents

Open-set domain self-adaptive image classification method and system

Info

Publication number
CN113076994B
CN113076994B (application CN202110349864.4A)
Authority
CN
China
Prior art keywords
domain
channel
source
samples
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110349864.4A
Other languages
Chinese (zh)
Other versions
CN113076994A (en)
Inventor
张庆亮 (Zhang Qingliang)
朱松豪 (Zhu Songhao)
梁志伟 (Liang Zhiwei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110349864.4A priority Critical patent/CN113076994B/en
Publication of CN113076994A publication Critical patent/CN113076994A/en
Application granted granted Critical
Publication of CN113076994B publication Critical patent/CN113076994B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/24 — Classification techniques
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses an open-set domain self-adaptive image classification method and system in the technical field of image recognition, which use a channel attention mechanism so that the network acquires domain-invariant features better, the features transfer more easily, and the model becomes easier to train. Labeled samples acquired from a source domain and unlabeled samples acquired from a target domain are respectively input into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; the weighted multi-channel feature map is sent into a label classifier, which divides the labeled samples into K known classes and divides the unlabeled samples into the K known classes visible in the source domain and an unknown class invisible in the source domain; the known categories from the source domain and the target domain are sent into a domain discriminator, and domain-invariant feature extraction is strengthened based on a generative adversarial network; based on covariance matching, the inter-domain difference between the source domain and the target domain is narrowed.

Description

Open-set domain self-adaptive image classification method and system
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to an open set domain self-adaptive image classification method and system.
Background
In recent years, with the development of deep learning, computer vision has received wide attention and made great progress. Image classification is a classic problem in computer vision and is widely applied in daily production and life, for example in medical image recognition, face recognition, license plate recognition, and remote sensing image classification. Traditional deep models are mainly obtained through extensive learning on data from a specific scene. However, building such a well-performing deep neural network usually requires a large amount of labeled training data, which is difficult to obtain in fields where annotation is scarce and requires strong expertise, such as medical imaging.
Closed-set domain adaptation is the most basic and most studied setting; its core assumption is that the source domain and the target domain share exactly the same categories. The performance of these methods degrades greatly once the target domain contains classes that are not present in the source domain. Existing open-set domain adaptation algorithms also have problems, such as insufficient alignment of source-domain and target-domain features and easily coupled training; in addition, they suffer from long model training time and difficult convergence.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an open set domain self-adaptive image classification method and system, which use a channel attention mechanism so that the network acquires domain-invariant features better, facilitating the transfer of the features and at the same time making the model easier to train.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
In a first aspect, an open set domain self-adaptive image classification method is provided, in which an acquired image is input into a trained open set domain self-adaptive image classification model to obtain the image category.
Further, the training method of the open set domain self-adaptive image classification model comprises the following steps: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in the source domain and an unknown class which is invisible in the source domain; sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on a generative adversarial network; and, based on covariance matching, narrowing the inter-domain difference between the source domain and the target domain.
Further, the step of respectively inputting the labeled samples obtained from the source domain and the unlabeled samples obtained from the target domain into a feature extractor based on a channel attention module to obtain a weighted multi-channel feature map includes: inputting the labeled samples and the unlabeled samples into a convolutional neural network, and obtaining an original multi-channel feature map through a convolution operation; inputting the original multi-channel feature map into the channel attention module, and obtaining channel information through maximum pooling, average pooling, and self-adjusting pooling; inputting the channel information into an autonomous learning layer, and acquiring the weight of each channel through a Sigmoid operation:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein ω represents the channel weights, Sig represents the Sigmoid function, Conv1D represents a one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling;
based on the weight of each channel, obtaining a weighted multi-channel feature map y' as follows:
y'=ω*y (1-1)。
further, the self-adjusting pooling transforms the two-dimensional feature map into a 1 × 1 feature using two-dimensional convolution and the ReLU activation function.
Further, in the process of dividing the labeled samples into K known classes, the classification loss function is:

L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples.
Further, in the process of dividing the unlabeled samples into K known classes visible in the source domain and one unknown class invisible in the source domain, a binary cross-entropy loss is used:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class.
Further, the sending of the known categories from the source domain and the target domain into the domain discriminator and the strengthening of domain-invariant feature extraction based on the generative adversarial network comprise: obtaining the probability of the target sample x_t being predicted into each category; if the category with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known category and the probability value is greater than a threshold P, marking the sample as a known category in the target domain, the corresponding formula being:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents a known class in the target domain, and x_unkt represents an unknown class in the target domain; the threshold P is dynamically varied based on the relative entropy of the known-class probabilities of the target-domain samples:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class;
inputting the samples of the source domain and the samples of the target domain identified as known classes into a feature generator G, and inputting the generated features into a domain label discriminator C_d; to obtain domain-invariant features, the feature generator G maximizes the domain discrimination error L_d, while the domain label discriminator C_d minimizes L_d, the formulas being:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, 1 or 0;

updating the network parameters θ_g of the feature generator and θ_d of the domain label discriminator based on the gradient inversion layer and equation (5).
Further, the reducing of the inter-domain difference between the source domain and the target domain based on covariance matching includes: obtaining the covariance matrices of the source domain and the target domain respectively:

Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples;

using the Frobenius norm as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T and d represents the network output dimension.
In a second aspect, an open set domain self-adaptive image classification system is provided, which includes a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
(1) according to the method, a channel attention mechanism is utilized, so that the network can better acquire domain invariant features, the features are convenient to migrate, and the features are easier to train;
(2) the present invention uses a domain discriminator that combines features with image label information, so as to capture the multi-modal data structure of each domain;
(3) the distribution is matched by aligning second-order statistics (namely covariance) output by the network, so that the difference between the source domain and the target domain is further reduced, and the inter-domain migration is better realized.
Drawings
FIG. 1 is a block diagram of an open set domain adaptive image classification method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of an attention mechanism in an embodiment of the present invention;
FIG. 3 is a diagram illustrating the confrontation training of the domain discriminator according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The first embodiment is as follows:
As shown in FIG. 1 to FIG. 3, an open set domain self-adaptive image classification method inputs an acquired image into a trained open set domain self-adaptive image classification model to obtain the image category.
The training method of the open set domain self-adaptive image classification model comprises the following steps: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in the source domain and an unknown class which is invisible in the source domain; sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on the generative adversarial network; and, based on covariance matching, narrowing the inter-domain difference between the source domain and the target domain.
First step, channel attention-enhancing features: respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map; inputting labeled samples and unlabeled samples into a convolutional neural network, and obtaining an original multichannel characteristic diagram through convolution operation; inputting the original multi-channel feature map into a channel attention module, and obtaining channel information through maximum pooling, average pooling and self-adjusting pooling; inputting channel information into an autonomous learning layer, and obtaining the weight of each channel through Sigmoid operation; and obtaining the weighted multi-channel feature map based on the weight of each channel.
In the open-set domain adaptation setting, n_s labeled samples belonging to K categories are obtained from the source domain D_s = {(x_si, y_si)}; at the same time, n_t unlabeled samples belonging to K+1 categories (the extra one being the unknown category) are obtained from the target domain D_t = {x_ti}. Here x_si and x_ti respectively represent the i-th labeled image in the source domain and the i-th unlabeled image in the target domain, and y_si is the label information corresponding to the source-domain image x_si. The classes of the target domain D_t include classes that do not appear in the source domain D_s. The present application refers to the classes that occur in both the source domain and the target domain as "known classes"; classes that appear only in the target domain and are invisible in the source domain are called "unknown classes".
The samples first enter a feature extractor consisting of a Convolutional Neural Network (CNN) and a fully connected layer. In the process of extracting the sample features, the embodiment adds a multi-information self-adaptive pooling channel attention module in the network. After the features of the image are extracted through the convolutional network, the weights of all feature channels can be automatically learned through the channel attention module, and then all the feature channels are weighted, so that key category information is strengthened, and interference information irrelevant to categories is suppressed.
The channel attention module proposed by this embodiment combines self-adjusting pooling with maximum pooling and average pooling to obtain more information. The samples yield multi-channel feature maps after the convolution operation, and these feature maps are sent into the channel attention module. First, the channel attention module pools the multi-channel feature map. To obtain more valid information, this embodiment divides the pooling operation into three parts: maximum pooling is used to extract key features, and average pooling is used to extract global features. In addition, we add self-adjusting pooling for extracting detail features; it transforms a two-dimensional feature map into a 1 × 1 feature using two-dimensional convolution and the ReLU activation function.
The channel information acquired through the pooling operations is sent to an autonomous learning layer, which adaptively acquires the weight of each channel. Here, this embodiment does not use a conventional fully connected layer but a one-dimensional convolutional layer to learn the weights automatically. This greatly reduces the network parameters and the complexity of the model. Meanwhile, thanks to the small number of parameters of the one-dimensional convolutional layers, the three pooled features can each be trained and weighted independently, avoiding negative interference.
Finally, the values obtained by the three one-dimensional convolutions are added, and channel weights between 0 and 1 are generated through the Sigmoid operation; the formula is:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein, ω represents the channel weight, Sig represents Sigmoid function, Conv1D represents one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling.
The weighted multi-channel feature map y' is:
y'=ω*y (1-1)。
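For illustration, the following is a minimal PyTorch sketch of such a channel attention module. The module name ChannelAttention, the kernel size k, and the exact form of the self-adjusting pooling (here a depthwise two-dimensional convolution over the full H × W map followed by ReLU, assuming a fixed spatial size) are illustrative assumptions, not the patented implementation itself.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Multi-information channel attention of equation (1): max, average
    # and self-adjusting pooling branches, each passed through its own
    # one-dimensional convolution, summed and squashed by a Sigmoid.
    def __init__(self, channels: int, spatial: int, k: int = 3):
        super().__init__()
        self.conv_avg = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.conv_max = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.conv_ada = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        # Assumed form of self-adjusting pooling: 2-D convolution + ReLU
        # collapsing each channel's spatial map to a single value.
        self.ada_pool = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=spatial, groups=channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = y.shape
        avg = torch.mean(y, dim=(2, 3)).view(b, 1, c)   # AVGPool(y)
        mx = torch.amax(y, dim=(2, 3)).view(b, 1, c)    # MAXPool(y)
        ada = self.ada_pool(y).view(b, 1, c)            # AdaPool(y)
        w = torch.sigmoid(self.conv_avg(avg) + self.conv_max(mx)
                          + self.conv_ada(ada)).view(b, c, 1, 1)
        return w * y                                    # y' = ω * y, eq. (1-1)

Because each pooling branch has its own Conv1d, the three pooled statistics are trained and weighted independently, as described above.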
the second step is that: identifying unknown classes: and feeding the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in a source domain and an unknown class which is not visible in the source domain.
After the feature extractor of the first step, valuable features are obtained. These features are fed into the label predictor, which classifies samples into (K+1) classes, where K represents the number of known classes and the (K+1)-th class represents the unknown class existing only in the target domain; the prediction is p(x_si|y_si) = C_y{G(θ_g, x_si), θ_c}, where θ_g represents the parameters of the feature generator G and θ_c the parameters of the classifier C_y.
First, the labeled samples in the source domain are correctly classified, at which point the classification loss function L_S on the source domain is minimal; the corresponding formula is:

L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples. Since the training samples come from the source domain, the trained model can identify the K known categories. Furthermore, since the source-domain sample classes are all known, the probability of the (K+1)-th class should be zero for them.
Thereafter, a boundary is established between the known classes and the unknown class in the target domain. The probability of the unknown class is set to β for training the classifier, and the performance of the classifier is improved by training the feature generator. The final output can be regarded as a binary classification task: C_y outputs the summed probability of the K known classes on one side and the probability of the (K+1)-th, unknown, class on the other. The two play against each other through adversarial training. Specifically, the generator increases the classifier's error by pushing the probability of the unknown class away from β, while the classifier tries to keep the output probability of the (K+1)-th class close to β. If the generator chooses to increase the unknown-class probability above β, the sample is identified as an unknown class; otherwise it is identified as a known class. Through the source-domain training of the previous step the network already has some discriminative ability, and after multiple iterations the unknown classes can be correctly identified. L_adv therefore uses the binary cross-entropy loss:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class; here β is set to 0.5 in order to separate the unknown-class samples in the target domain.
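A minimal sketch of this loss in PyTorch follows; the function name adv_unknown_loss is illustrative, and the classifier is assumed to output K+1 logits with the unknown class last.

import torch
import torch.nn.functional as F

def adv_unknown_loss(logits_t: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    # Binary cross-entropy between p(y = K+1 | x_t) and the fixed
    # margin beta, i.e. equation (3); logits_t has shape (n, K+1).
    p_unk = F.softmax(logits_t, dim=1)[:, -1]
    p_unk = p_unk.clamp(1e-6, 1 - 1e-6)          # numerical safety
    return (-beta * torch.log(p_unk)
            - (1 - beta) * torch.log(1 - p_unk)).mean()

The classifier minimizes this loss, while the feature generator receives its gradient with flipped sign (for instance through the gradient inversion layer described in the third step), which realizes the adversarial game around β.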
Third step, extracting domain-invariant features with an adversarial domain discriminator: the known classes from the source domain and the target domain are fed into the domain discriminator, and domain-invariant feature extraction is strengthened based on the generative adversarial network.
Since the distribution of sample features differs between domains, extracting features common to the two domains becomes crucial; it is the key to realizing domain adaptation. This embodiment uses the idea of adversarial learning, embeds a domain discriminator into the deep network, and realizes inter-domain transfer by learning domain-invariant features. In a generative adversarial network, the discriminator is trained to recognize whether an input sample is real, while the generator tries to fool the discriminator with generated fake samples. For the domain adaptation problem, images of two different domains naturally exist, so the step of generating fake samples is omitted: the discriminator distinguishes whether a picture comes from the source domain or the target domain, while the generator performs feature extraction. The final purpose is for the feature extractor to obtain domain-invariant features that the domain discriminator cannot reliably tell apart, thereby realizing inter-domain transfer. Some conventional adversarial domain adaptation methods underutilize image label information, align features only, and may fail to capture multi-modal data structures. This embodiment therefore further aligns the image classes through a joint distribution of images and features, using a multilinear map to capture the multi-modal structure of the samples:

h = T(f, c) = f ⊗ c

wherein f represents the sample feature extracted by the feature extractor and c represents the sample label or pseudo label (the classifier's prediction); their multilinear map yields a joint multi-modal structure of features and labels. Because the classifier output c contains latent discriminative information, aligning the domain features f jointly with c helps the network capture the multi-modal structure of the samples.
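For illustration, a minimal PyTorch sketch of this multilinear map (the function name is an assumption):

import torch

def multilinear_map(f: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    # Joint feature-label input for the domain discriminator: the outer
    # product of features f (n, d) and class predictions c (n, K+1),
    # flattened to shape (n, d * (K+1)).
    return torch.bmm(c.unsqueeze(2), f.unsqueeze(1)).flatten(1)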
If the unknown classes in the target domain are not considered and domain-invariant features are extracted directly, forcing the feature distributions of the two domains to align will cause a mismatch between known and unknown classes, degrade the performance of the model, and produce negative transfer. Therefore, the target samples are screened to remove the unknown class. Specifically, the probability of the target sample x_t being predicted into each of the K+1 classes is obtained first. If the class with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known class and its probability value is greater than the threshold P, the sample is marked as a known class in the target domain; the corresponding formula is:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents the known classes in the target domain, and x_unkt represents the unknown classes in the target domain. The threshold P changes dynamically, based on the relative entropy of the known-class probabilities of the target-domain samples, adapting to changes in the data set and the model:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class; -log p(ŷ_t|x_t) is the relative entropy between the one-hot pseudo label and the predicted distribution over the K known classes.
In the early stage of training, because the classification performance of the model is weak and the certainty of sample identification is low, the relative entropy of the known-class probability in the target domain is large, and the requirement for being identified as a known class is high. As the model is continuously optimized, the accuracy and certainty of prediction increase, the relative entropy of the known-class probability in the target domain decreases, and more target-domain samples participate in the training of the domain discriminator. This threshold is iteratively optimized throughout the training process.
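A minimal PyTorch sketch of this screening step follows; the function name, the batch-mean threshold, and the -log p reading of the relative entropy are illustrative assumptions.

import torch
import torch.nn.functional as F

def select_known_targets(logits_t: torch.Tensor, K: int):
    # Pseudo-label target samples (equation (4)) and keep those whose
    # top class is known and whose probability exceeds the dynamic
    # threshold P of equation (4-1).
    probs = F.softmax(logits_t, dim=1)       # (n, K+1)
    conf, pseudo = probs.max(dim=1)          # pseudo label and its probability
    known = pseudo < K                       # indices 0..K-1 are the known classes
    if known.any():
        # P = Sig(mean of -log p over known-class pseudo labels), assumed form
        P = torch.sigmoid((-torch.log(conf[known].clamp_min(1e-6))).mean())
        keep = known & (conf > P)
    else:
        keep = known                         # no known-class candidates in batch
    return keep, pseudo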
Then the samples of the source domain and the target-domain samples identified as known classes are input into the feature generator G, and the generated features are input into the domain label discriminator C_d. To obtain domain-invariant features, the feature generator G needs to maximize the domain discrimination error L_d, while the domain discriminator C_d minimizes L_d; the formulas are:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, y_d = 1 or 0.

The network parameters θ_g of the feature generator and θ_d of the domain label discriminator are updated based on the gradient inversion layer and equation (5): the last two formulas in (5) update θ_g and θ_d respectively. For efficient computation of the gradients and updating of the parameters, a gradient inversion layer is used: during back-propagation, the loss gradient of the domain discriminator is automatically negated before being propagated back into the parameters of the feature extractor, while forward propagation is unaffected, thereby realizing the adversarial training.
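A standard implementation of such a gradient inversion layer in PyTorch is sketched below (names are illustrative; the mechanism itself is the one described above).

import torch

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; multiplies the incoming gradient by
    # -lam in the backward pass, so the feature extractor maximizes the
    # domain discrimination error that the discriminator minimizes.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient w.r.t. lam

def grad_reverse(x: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    return GradReverse.apply(x, lam)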
The fourth step: reducing inter-domain differences between source and target domains based on covariance matching
Samples from different domains usually have different covariance distributions, and the covariance matrix represents the correlations among the feature variables. Therefore, this embodiment matches the distributions by aligning the second-order statistics (i.e. covariance) of the network outputs, further reducing the difference between the source domain and the target domain and better realizing inter-domain transfer.
Firstly, the covariance matrices of the two domains are obtained respectively:

Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples.
Secondly, the Frobenius norm is used as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T (defined as the square root of the sum of squares of the matrix elements) and d represents the network output dimension. Minimizing this loss reduces the second-order statistical difference between the two domains and thus the inter-domain difference, realizing inter-domain transfer.
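A minimal sketch of this covariance-matching loss in PyTorch (the 1/(4d^2) scaling and the function name follow the deep-CORAL convention and are assumptions consistent with equations (6)-(8)):

import torch

def coral_loss(fs: torch.Tensor, ft: torch.Tensor) -> torch.Tensor:
    # Squared Frobenius distance between source and target feature
    # covariances, equations (6)-(8); fs is (n_S, d), ft is (n_T, d).
    d = fs.size(1)
    cs = fs - fs.mean(dim=0, keepdim=True)   # centre source features
    ct = ft - ft.mean(dim=0, keepdim=True)   # centre target features
    cov_s = cs.t() @ cs / (fs.size(0) - 1)   # Cov_S
    cov_t = ct.t() @ ct / (ft.size(0) - 1)   # Cov_T
    return ((cov_s - cov_t) ** 2).sum() / (4 * d * d)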
During the training process, this loss is optimized together with the classification loss in equation (2). Reducing the inter-domain difference avoids overfitting of the model on the source-domain data set, while the classification loss keeps the learned features strongly discriminative, which facilitates the training of the classifier.
The final loss function of the network can be expressed as:

L = L_S(x_i, y_i) + L_adv(x_t) + L_cov + L_d(x_i, y_d)   (13)

and the training objective for the network parameters can be written as the minimax problem, consistent with equation (5):

(θ̂_g, θ̂_c) = argmin_{θ_g, θ_c} [L_S + L_adv + L_cov - L_d],  θ̂_d = argmin_{θ_d} L_d   (14)
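Putting the pieces together, one schematic training step for equation (13) might look as follows; it reuses the sketches above (adv_unknown_loss, select_known_targets, multilinear_map, grad_reverse, coral_loss), and the single shared optimizer, the detached class predictions, and all names are illustrative assumptions rather than the patented implementation.

import torch
import torch.nn.functional as F

def train_step(G, C_y, C_d, optimizer, xs, ys, xt, K, lam=1.0):
    fs, ft = G(xs), G(xt)
    logit_s, logit_t = C_y(fs), C_y(ft)
    L_s = F.cross_entropy(logit_s, ys)                    # eq. (2)
    # eq. (3); the generator-side sign flip for L_adv is elided here
    L_adv = adv_unknown_loss(logit_t)
    L_cov = coral_loss(fs, ft)                            # eq. (8)
    keep, _ = select_known_targets(logit_t, K)            # eq. (4)
    f_all = torch.cat([fs, ft[keep]])
    c_all = torch.cat([F.softmax(logit_s, dim=1),
                       F.softmax(logit_t, dim=1)[keep]]).detach()
    y_d = torch.cat([torch.ones(fs.size(0), device=fs.device),
                     torch.zeros(int(keep.sum()), device=fs.device)])
    h = grad_reverse(multilinear_map(f_all, c_all), lam)  # adversarial via GRL
    L_d = F.binary_cross_entropy_with_logits(C_d(h).squeeze(1), y_d)
    loss = L_s + L_adv + L_cov + L_d                      # eq. (13)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)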
in the embodiment, a channel attention mechanism is utilized, so that the network can better acquire domain invariant features, the migration of the features is facilitated, and the features are easier to train; capturing a multi-modal data structure of a domain using a domain discriminator that combines images and features, utilizing image tag information; the distribution is matched by aligning second-order statistics (namely covariance) output by the network, so that the difference between the source domain and the target domain is further reduced, and the inter-domain migration is better realized.
Example two:
Based on the open set domain self-adaptive image classification method of the first embodiment, this embodiment provides an open set domain self-adaptive image classification system, which includes a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method of the first embodiment.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (7)

1. An open set domain self-adaptive image classification method is characterized in that collected images are input into a trained open set domain self-adaptive image classification model to obtain image classes;
the training method of the open set domain self-adaptive image classification model comprises the following steps:
respectively inputting labeled samples acquired from a source domain and unlabeled samples acquired from a target domain into a feature extractor based on a channel attention module to acquire a weighted multi-channel feature map;
sending the weighted multi-channel feature map into a label classifier, dividing the labeled samples into K known classes, and dividing the unlabeled samples into K known classes which are visible in a source domain and an unknown class which is invisible in the source domain;
sending the known categories from the source domain and the target domain into a domain discriminator, and strengthening domain-invariant feature extraction based on a generative adversarial network;
reducing inter-domain differences between the source domain and the target domain based on covariance matching;
the sending of the known categories from the source domain and the target domain into the domain discriminator and the strengthening of domain-invariant feature extraction based on the generative adversarial network comprise the following steps:
obtaining the probability of the target sample x_t being predicted into each category; if the category with the highest probability, ŷ_t = argmax_k p(y=k|x_t), is a known category and the probability value is greater than a threshold P, marking the sample as a known category in the target domain, the corresponding formula being:

x_t ∈ x_kt, if ŷ_t ≤ K and p(ŷ_t|x_t) > P; otherwise x_t ∈ x_unkt   (4)

wherein ŷ_t represents the pseudo label predicted for the target-domain sample x_t, K represents the number of classes in the source domain, x_kt represents a known class in the target domain, and x_unkt represents an unknown class in the target domain; the threshold P is dynamically varied based on the relative entropy of the known-class probabilities of the target-domain samples:

P = Sig(-log p(ŷ_t|x_t))   (4-1)

wherein Sig represents the Sigmoid function and p(ŷ_t|x_t) represents the probability of the pseudo label corresponding to a known class;
inputting the samples of the source domain and the samples of the target domain identified as known classes into a feature generator G, and inputting the generated features into a domain label discriminator C_d; to obtain domain-invariant features, the feature generator G maximizes the domain discrimination error L_d, while the domain label discriminator C_d minimizes L_d; the formulas are:

L_d = (1/n) Σ_i L_bce(C_d(G(θ_g, x_i), θ_d), y_d)
θ̂_g = argmax_{θ_g} L_d(θ_g, θ_d)
θ̂_d = argmin_{θ_d} L_d(θ_g, θ_d)   (5)

wherein L_bce represents the binary cross-entropy loss, θ_g represents the network parameters of the feature generator G, θ_d represents the network parameters of the domain label discriminator C_d, the training sample x_i comes from the source domain or from a known class x_kt of the target domain, and y_d represents the corresponding domain category label, 1 or 0;

updating the network parameters θ_g of the feature generator and θ_d of the domain label discriminator based on the gradient inversion layer and equation (5).
2. The open set domain adaptive image classification method of claim 1, wherein the step of inputting the labeled samples obtained from the source domain and the unlabeled samples obtained from the target domain into the feature extractor based on the channel attention module to obtain the weighted multi-channel feature map comprises:
inputting the labeled samples and the unlabeled samples into a convolutional neural network, and performing a convolution operation to obtain an original multi-channel feature map;
inputting the original multi-channel feature map into a channel attention module, and obtaining channel information through maximum pooling, average pooling and self-adjusting pooling;
inputting channel information into an autonomous learning layer, and acquiring the weight of each channel through Sigmoid operation:
ω=Sig{Conv1D[AVGPool(y)]+Conv1D[MAXPool(y)]+Conv1D[AdaPool(y)]} (1)
wherein ω represents the channel weights, Sig represents the Sigmoid function, Conv1D represents a one-dimensional convolution operation, y represents the original multi-channel feature map, AVGPool represents average pooling, MAXPool represents maximum pooling, and AdaPool represents self-adjusting pooling;
based on the weight of each channel, obtaining a weighted multi-channel feature map y' as follows:
y'=ω*y (1-1)。
3. the open-domain adaptive image classification method according to claim 2, characterized in that the self-adjusting pooling transforms a two-dimensional feature map into a 1 x 1 feature using two-dimensional convolution and a ReLU activation function.
4. The open-domain adaptive image classification method according to claim 1, characterized in that in the process of classifying the labeled samples into K known classes, the classification loss function is:
L_S = (1/|D_S|) Σ_{(x_si, y_si)∈D_S} L_y(C_y(G(x_si)), y_si)   (2)

wherein L_y represents the standard cross-entropy loss, C_y represents the classifier, and |D_S| represents the total number of source-domain samples.
5. The open set domain adaptive image classification method of claim 1, characterized in that in the process of dividing the unlabeled samples into K known classes visible in the source domain and one unknown class invisible in the source domain, a binary cross-entropy loss is used:

L_adv(x_t) = -β log p(y=K+1|x_t) - (1-β) log(1 - p(y=K+1|x_t))   (3)

wherein x_t represents a sample of the target domain, p(y=K+1|x_t) represents the probability that the sample x_t belongs to the (K+1)-th class, and β represents the probability of the unknown class.
6. The open-set domain adaptive image classification method of claim 1, wherein the reducing inter-domain differences between the source domain and the target domain based on covariance matching comprises:
covariance matrices of a source domain and a target domain are respectively obtained:
Cov_S = (1/(n_S - 1)) (F_S - F̄_S)^T (F_S - F̄_S)   (6)
Cov_T = (1/(n_T - 1)) (F_T - F̄_T)^T (F_T - F̄_T)   (7)

wherein F_S represents the original source-domain feature matrix, F_T represents the original target-domain feature matrix, F̄_S and F̄_T represent the corresponding column-mean matrices of F_S and F_T, n_S represents the number of source-domain samples, and n_T represents the number of target-domain samples;

using the Frobenius norm as the distance metric between the covariance matrices of the source domain and the target domain, and as the loss function aligning the second-order statistics of the two distributions:

L_cov = (1/(4d^2)) ||Cov_S - Cov_T||_F^2   (8)

wherein ||Cov_S - Cov_T||_F represents the Frobenius norm of the matrix Cov_S - Cov_T and d represents the network output dimension.
7. An open domain adaptive image classification system comprising a processor and a storage device, wherein the storage device stores a plurality of instructions for the processor to load and execute the steps of the method according to any one of claims 1 to 6.
CN202110349864.4A 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system Active CN113076994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110349864.4A CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110349864.4A CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Publications (2)

Publication Number Publication Date
CN113076994A CN113076994A (en) 2021-07-06
CN113076994B (en) 2022-08-05

Family

ID=76614188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110349864.4A Active CN113076994B (en) 2021-03-31 2021-03-31 Open-set domain self-adaptive image classification method and system

Country Status (1)

Country Link
CN (1) CN113076994B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673570A (en) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 Training method, device and equipment for electronic device picture classification model
CN113283404B (en) * 2021-07-22 2021-11-30 新石器慧通(北京)科技有限公司 Pedestrian attribute identification method and device, electronic equipment and storage medium
CN113807243B (en) * 2021-09-16 2023-12-05 上海交通大学 Water obstacle detection system and method based on attention to unknown target
CN114581334A (en) * 2022-03-17 2022-06-03 湖南大学 Self-adjusting text image generation method based on generation of confrontation network
CN114781554B (en) * 2022-06-21 2022-09-20 华东交通大学 Open set identification method and system based on small sample condition
CN115035463B (en) * 2022-08-09 2023-01-17 阿里巴巴(中国)有限公司 Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN115392326B (en) * 2022-10-27 2024-03-19 中国人民解放军国防科技大学 Modulation identification method based on joint multi-modal information and domain countermeasure neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750665A (en) * 2019-10-12 2020-02-04 南京邮电大学 Open set domain adaptation method and system based on entropy minimization
CN112307914A (en) * 2020-10-20 2021-02-02 西北工业大学 Open domain image content identification method based on text information guidance


Also Published As

Publication number Publication date
CN113076994A (en) 2021-07-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant