CN115546567B - Unsupervised domain adaptive classification method, system, equipment and storage medium - Google Patents

Unsupervised domain adaptive classification method, system, equipment and storage medium Download PDF

Info

Publication number
CN115546567B
CN115546567B (application CN202211528066.9A)
Authority
CN
China
Prior art keywords
data
target domain
classification
loss
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211528066.9A
Other languages
Chinese (zh)
Other versions
CN115546567A
Inventor
徐行
田加林
沈复民
申恒涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Koala Youran Technology Co ltd
Original Assignee
Chengdu Koala Youran Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Koala Youran Technology Co ltd filed Critical Chengdu Koala Youran Technology Co ltd
Priority to CN202211528066.9A priority Critical patent/CN115546567B/en
Publication of CN115546567A publication Critical patent/CN115546567A/en
Application granted granted Critical
Publication of CN115546567B publication Critical patent/CN115546567B/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 — … using classification, e.g. of video objects
    • G06V10/77 — Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 — … using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of deep learning, and in particular to an unsupervised domain adaptive classification method, system, device and storage medium. The method comprises: constructing a deep convolutional network model and training a source domain model on labeled source domain data; constructing a target domain model from the source domain model and training it to perform domain adaptation; performing unsupervised learning on unlabeled target domain data and extracting the features and classification probabilities of the target domain data; and iteratively improving the image classification ability of the target domain model by computing the neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss, thereby eliminating the misclassification caused by domain shift and class imbalance and realizing unsupervised domain adaptive image classification.

Description

Unsupervised domain adaptive classification method, system, equipment and storage medium
Technical Field
The invention relates to the field of deep learning, and in particular to an unsupervised domain adaptive classification method, system, device and storage medium.
Background
Deep neural networks have become the benchmark models for a variety of tasks, at the cost of extensive data annotation that is time-consuming and labor-intensive. However, with the rapid development of digital devices and online applications, manually annotating continuously expanding multimedia data has become impractical. To avoid high labeling costs, unsupervised domain adaptation has been developed to exploit previously labeled source domain data to improve model performance on unlabeled target domain data.
Mainstream unsupervised domain adaptation methods aim to learn domain-invariant features through moment matching or adversarial training. Although unsupervised domain adaptation has recently made encouraging progress, in realistic scenarios the assumption of simultaneous access to source domain data and target domain data may not hold. Access may be restricted by storage limits and data privacy constraints; in particular, as the amount of stored multimedia data increases, its transmission is limited by law and by the privacy policies of data providers. Accordingly, this invention addresses passive (source-free) domain adaptation, which adapts a pre-trained source domain model using only target domain data. Passive domain adaptation methods do not require access to the source domain data once source model training is complete, which eliminates the storage and transmission cost of large-scale data and does not violate data constraints. However, existing passive domain adaptation methods ignore two major characteristics of unlabeled target domain data: first, whether or not the target domain data is aligned with the classifier, the data forms clusters in the feature space; second, target domain samples with higher confidence are more reliable, and their classification confidence varies less during domain adaptation. In addition, existing passive domain adaptation methods struggle with class-imbalanced data, i.e., data with large differences in sample counts between classes within a domain and for the same class across domains. For example, source domain classes 1 and 2 may have 1000 and 10 samples, respectively, while target domain classes 1 and 2 have 10 and 1000 samples, respectively.
In summary, existing domain adaptive classification methods do not account for these two characteristics of unlabeled target domain data, and they struggle to handle imbalanced data.
Disclosure of Invention
To address the unrealistic application setting of existing unsupervised domain adaptation methods and their difficulty with imbalanced data, the invention provides an unsupervised domain adaptive classification method, system, device and storage medium.
The invention has the following specific implementation contents:
an unsupervised domain adaptation classification method comprises the following steps:
Step S1: acquiring training data, collecting a batch of labeled source field image data, and collecting a batch of unlabeled target field image data;
step S2: constructing a deep convolutional network model and training it on the labeled source domain data for an image classification task, wherein the deep convolutional network consists of a feature extractor and a linear classifier; the objective function of the image classification task is to minimize the cross entropy, and the trained model is called the source domain model;
step S3: based on the source domain model, a deep convolution network model, called a target domain model, is constructed, the target domain model is initialized by parameters of the source domain model, and the target domain model is trained to execute domain adaptation;
step S4: performing unsupervised learning based on the unlabeled target domain data; extracting the features and classification probabilities of the target domain data; calculating the neighbor alignment loss based on the features of the target domain data; calculating the regular loss and dispersion loss based on the classification probabilities of the target domain data; dividing the target domain data set based on the classification probabilities, assigning high-confidence samples to a reliable set and low-confidence samples to a weakly reliable set; and calculating the cross-view alignment loss and cross-view neighbor alignment loss based on the classification probabilities of the high-confidence and low-confidence samples;
Step S5: and iteratively updating the target domain model based on the neighbor alignment loss, the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss, and taking the trained target domain model as an image classification model of the label-free target domain data.
In order to better realize the invention, further, the constructed source domain and target domain convolutional networks each consist of a feature extractor and a linear classifier; the features extracted by the feature extractor are L2-normalized, as are the parameter vectors of the linear classifier. The computed classification probabilities therefore cannot be biased toward particular classes, class boundaries are divided more evenly, and the influence of label distribution differences within a domain, as well as between the source and target domains, is reduced.
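The normalization scheme above can be sketched as follows. This is an illustrative NumPy implementation, not the patent's exact network: the temperature `tau` and the softmax-over-cosine-similarity form of the classifier are assumptions added for the sketch.

```python
import numpy as np

def l2_normalize(v, axis=-1, eps=1e-12):
    """Project vectors onto the unit sphere (L2 normalization)."""
    return v / (np.linalg.norm(v, axis=axis, keepdims=True) + eps)

def classify(features, class_weights, tau=0.05):
    """Cosine-similarity classifier: both the extracted features and the
    per-class weight vectors are L2-normalized, so no class can dominate
    the logits simply by having a larger weight norm."""
    f = l2_normalize(features)        # (n, d) unit-norm features
    w = l2_normalize(class_weights)   # (c, d) unit-norm class vectors
    logits = f @ w.T / tau            # temperature-scaled cosine similarity
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)  # classification probabilities
```

Because every class weight vector has the same norm, the decision boundaries depend only on angles, which is what makes the class partition more balanced under label-distribution shift.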
In order to better realize the invention, further, based on the source domain model, a deep convolutional network is constructed as the target domain model; the target domain model is initialized with the parameters of the source domain model and trained to perform domain adaptation. Unlike previous unsupervised domain adaptive image classification methods, the source domain model and the target domain model are trained on data from different domains, and the source domain model serves only to initialize the target domain model; previous methods require simultaneous access to labeled source domain data and unlabeled target domain data and train the target domain model on both.
In order to better realize the invention, further, based on the unlabeled target domain data, unsupervised learning is performed, the features and classification probabilities of the target domain data are extracted, and two fixed-capacity queues are established: a feature queue and a classification probability vector queue. The two queues are updated as follows: in each training iteration of the target domain model, a batch of data is randomly sampled, new features and classification probabilities are computed for the batch, and they are written into the queue positions corresponding to the batch's indices.
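A minimal sketch of the two fixed-capacity queues, assuming one slot per target-domain sample and NumPy arrays standing in for the network's tensors; the uniform initialization of the probability queue is an assumption of this sketch.

```python
import numpy as np

class MemoryQueues:
    """Fixed-capacity feature queue F and probability queue P, one slot per
    target-domain sample; each training iteration overwrites the slots of
    the sampled batch, addressed by the samples' dataset indices."""
    def __init__(self, num_samples, feat_dim, num_classes):
        self.F = np.zeros((num_samples, feat_dim))
        self.P = np.full((num_samples, num_classes), 1.0 / num_classes)

    def update(self, indices, feats, probs):
        # Write the batch's new features/probabilities into its own slots.
        self.F[indices] = feats
        self.P[indices] = probs
```

Indexing by dataset position (rather than FIFO) is what lets later losses look up a specific sample's most recent feature and prediction.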
In order to better realize the invention, further, based on the characteristics of the target domain data, searching the neighbor characteristics of each sample, taking the classification probability of the neighbor as a supervision signal, and calculating the neighbor alignment loss; the aim of the neighbor alignment loss is to keep the classification probability of each sample and the neighbor thereof consistent by utilizing the phenomenon that similar samples tend to be clustered in a feature space, so as to maintain and improve the aggregation degree and classification accuracy of the similar samples.
To better implement the present invention, further, a canonical penalty is calculated based on the classification probability of the target domain data; the regular loss is aimed at eliminating the influence of potential noise in the neighbor alignment loss and strengthening the certainty degree of the classification probability of each sample itself.
To better implement the present invention, further, a dispersion loss is calculated based on the classification probability of the target domain data; the dispersion loss aims to prevent the classifier from simply classifying all samples into fixed classes, avoid certain classes with a small number of samples from being completely ignored, and help to improve the classification accuracy of the classes conforming to long tail distribution.
In order to better realize the invention, further, the target domain data set is divided based on the classification probabilities of the target domain data: high-confidence samples are assigned to a reliable set and low-confidence samples to a weakly reliable set. The purpose of this division is to exploit the value of the reliable samples so that they play a leading role in the domain adaptation process. Samples with high confidence are defined as reliable because their classification results are more accurate and their confidence varies less during domain adaptation. Before each traversal of the whole target domain data set, a classification probability vector and a pseudo label are computed for each sample by the target domain model; the samples are divided into per-class sets according to their pseudo labels; the information entropy of each classification probability vector is computed; and since lower entropy indicates higher confidence, the low-entropy samples in each class set are selected as high-confidence samples. The set of high-confidence samples is the reliable sample set, and the set of remaining samples is the weakly reliable sample set.
To better implement the present invention, further, the cross-view alignment loss is calculated based on the classification probabilities of the reliable and weakly reliable samples. It is implemented by sampling a batch of reliable samples from the reliable sample set and applying two different random data augmentations: one, called weak augmentation, comprises random image flipping, translation and cropping; the other, called strong augmentation, comprises random image flipping, translation, cropping, brightness changes, partial occlusion, Gaussian blur, and the like. The weakly and strongly augmented versions of a sample are called view 1 and view 2, respectively. The two views are naturally linked: they share the same category, so their classification probability vectors should tend to agree. A strongly augmented sample does not necessarily lie near its weakly augmented counterpart but may fall anywhere in the distribution of its class, so the cross-view alignment loss considers the distribution of the whole class globally and is not affected by local noise. Minimizing the cross-view alignment loss lets reliable samples take the lead, gathering scattered samples near reliable samples of the same class and thereby improving the classification accuracy of the weakly reliable samples.
To better implement the present invention, further, the cross-view neighbor alignment loss is calculated based on the classification probabilities of the reliable and weakly reliable samples; its purpose is to further strengthen the leading role of reliable samples. It is implemented by searching the feature queue for the neighbors of the reliable samples; since these neighbors are very likely to share the reliable sample's class, even neighbors that are not in the current batch can be exploited, reinforcing the leading role of reliable samples in the domain adaptation process.
In order to better realize the invention, further, the neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss are linearly weighted to obtain the final objective function; the target domain model parameters are iteratively updated by back-propagation of this objective function; and the finally trained target domain model serves as the image classification model for the unlabeled target domain data.
In order to better implement the present invention, further, using the target domain model as the image classification model for unlabeled target domain data comprises: the target domain model extracts the classification probability vector of a target domain image to be classified and takes the category corresponding to the maximum value of that vector as the predicted category of the image.
Based on the above unsupervised domain adaptive classification method, in order to better realize the invention, an unsupervised domain adaptive classification system is further provided, comprising an imaging unit, a data storage unit, a neural network unit and a data processing unit;
the imaging unit is used for acquiring image samples in different fields;
the data storage unit is used for storing image samples in different fields;
the neural network unit comprises a source domain model trained by labeled source domain data and a target domain model trained by unlabeled target domain data;
the data processing unit trains the source domain model on the labeled source domain data, builds the target domain model from the source domain model, extracts the features and classification probability vectors of the unlabeled target domain data with the target domain model, calculates the neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss from these features and probability vectors, combines all the losses into the final objective function, iteratively updates the target domain model parameters by back-propagation of that objective, and uses the finally trained target domain model as the image classification model for the unlabeled target domain data.
Based on the above-mentioned non-supervision domain adaptation classification method, in order to better implement the present invention, a device is further provided, which includes a processor and a memory; the memory is used for storing a computer program;
the processor is configured to implement the above-described unsupervised domain adaptation classification method when executing the computer program.
Based on the above-mentioned method for classifying the unsupervised domain adaptation, in order to better implement the present invention, a computer readable storage medium is further provided, on which a computer program is stored, which when executed by a processor, implements the above-mentioned method for classifying the unsupervised domain adaptation.
The invention has the following beneficial effects:
(1) According to the method, the image classification capability of the target domain model is iteratively improved by calculating the neighbor alignment loss, the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss, the problem of misclassification caused by domain offset and class imbalance is solved, and the unsupervised field adaptive image classification is realized.
(2) When data from a new target domain is encountered, the invention does not need to re-acquire the source domain data and retrain, as previous methods do; training can be completed based on the unlabeled target domain data alone. This training scheme suits the current practical requirements of limited data transmission and data privacy protection, which is of great importance for image classification methods in practical applications.
(3) According to the invention, the classification probability of each sample and the neighbor thereof is kept consistent by calculating the neighbor alignment loss, so that the aggregation degree and the classification accuracy of the similar samples are maintained and improved; the influence of potential noise in the neighbor alignment loss is eliminated by calculating the regular loss, and the certainty degree of the classification probability of each sample is enhanced; by calculating the dispersion loss, the classifier is prevented from simply classifying all samples into a plurality of fixed classes, so that certain classes with a small number of samples are prevented from being completely ignored, and the classification accuracy of the classes conforming to long tail distribution is improved; the cross-view neighbor alignment loss is calculated, further enhancing the leadership of reliable samples.
Drawings
FIG. 1 is a simplified flow diagram of the method of the present invention;
FIG. 2 is a simplified block diagram of the connection of units of the system of the present invention;
FIG. 3 is a simplified flow chart of a source domain model of a preferred embodiment of the present invention;
fig. 4 is a simplified flow chart of a target domain model of a preferred embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, they are described below completely with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present invention, not all of them, and therefore should not be considered as limiting the scope of protection. All other embodiments obtained by a person of ordinary skill in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that, unless explicitly stated and limited otherwise, the terms "disposed," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; or may be directly connected, or may be indirectly connected through an intermediate medium, or may be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1:
the embodiment provides an unsupervised field adaptation classification method, which comprises the following steps:
step S1: collecting labeled source domain image data and unlabeled target domain image data, and taking the collected image data as training data;
step S2: establishing a deep convolution network model comprising a feature extractor and a linear classifier, performing image classification on the labeled source domain image data, and training according to training data to obtain a source domain model;
step S3: establishing a target domain model according to the source domain model, and initializing the target domain model by using parameters of the source domain model;
Step S4: extracting features of image data of a target domain, searching for neighbor features of each image data according to the extracted features, and calculating neighbor alignment loss by taking classification probability of neighbors as a supervision signal;
step S5: extracting the classification probabilities of the target domain image data; calculating the regular loss and dispersion loss from these classification probabilities; dividing the target domain image data set according to the classification probabilities, assigning high-confidence image data to a reliable set and low-confidence image data to a weakly reliable set; and calculating the cross-view alignment loss and cross-view neighbor alignment loss from the classification probabilities of the high-confidence and low-confidence samples;
step S6: and (3) iteratively updating the target domain model according to the neighbor alignment loss obtained in the step (S4) and the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss obtained in the step (S5), and taking the target domain model obtained through training as an image classification model of the label-free target domain image data.
Working principle: according to the embodiment, a deep convolution network model is built, a source domain model is trained according to the labeled source domain data, a target domain model is built according to the source domain model, the target domain model is trained to execute domain adaptation, unsupervised learning is conducted according to the unlabeled target domain data, characteristics and classification probability of the target domain data are extracted, and neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss are calculated, so that the image classification capability of the target domain model is iteratively improved, and the problem of misclassification caused by domain offset and class imbalance is eliminated, so that unsupervised domain adaptation image classification is realized.
Example 2:
this embodiment describes the steps in embodiment 1 on the basis of embodiment 1 described above.
The features and classification probabilities of the target domain image data in steps S4 and S5 are extracted as follows: a feature queue F and a classification probability vector queue P of fixed capacity are established; in each training iteration of the target domain model, image data are randomly sampled, their new features and classification probabilities are computed, and these are written into the queue positions corresponding to the sampled images' indices.
The specific operation of calculating the neighbor alignment loss in step S4 is as follows: randomly select a target domain image data sample x_i^t; using its feature f_i^t, search the feature queue F for the m nearest neighbors of x_i^t under cosine similarity, with N_i^m denoting the index set of these neighbors; then compute the neighbor alignment loss L_nc from the classification probability vector p_i^t of x_i^t, the weighting coefficient λ, the number n_b of randomly sampled target domain samples, and the classification probability vectors p_j, p_k of the neighbors stored in queue P.
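One plausible instantiation of the neighbor alignment loss, as a hedged sketch: the patent does not print the exact formula here, so the dot-product agreement p_i · P_j averaged over the m cosine nearest neighbors (and the omission of the λ weighting) are assumptions of this sketch.

```python
import numpy as np

def neighbor_alignment_loss(batch_probs, batch_feats, F, P, m=3):
    """Pull each sample's classification probability toward those of its m
    cosine nearest neighbors in the feature queue F (neighbor probabilities
    read from queue P). Agreement is maximized, so it enters as a negative
    term in the minimized loss."""
    fn = batch_feats / (np.linalg.norm(batch_feats, axis=1, keepdims=True) + 1e-12)
    Fn = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    sim = fn @ Fn.T                            # cosine similarities to queue
    nn_idx = np.argsort(-sim, axis=1)[:, :m]   # indices of m nearest neighbors
    loss = 0.0
    for i, p in enumerate(batch_probs):
        loss -= (p * P[nn_idx[i]]).sum() / m   # mean agreement with neighbors
    return loss / len(batch_probs)
```

When a sample and all its queue neighbors already agree on a one-hot prediction, the loss reaches its minimum of -1 per sample.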
The specific operation of calculating the regular loss in step S5 is as follows: minimize the discrepancy between the classification probability p_i^t of each randomly selected image data sample x_i^t and its stored classification probability P_i in queue P, averaging over the n_b randomly sampled target domain samples to obtain the regular loss L_self.
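A sketch of the regular (self-alignment) loss under the same hedged dot-product reading as above; treating "minimize the discrepancy" as maximizing the dot product with the stored prediction is an assumption.

```python
import numpy as np

def regular_loss(batch_probs, batch_indices, P):
    """Align each sample's current prediction with its own stored
    prediction in queue P, sharpening per-sample certainty and damping
    noise introduced by the neighbor alignment term."""
    stored = P[batch_indices]                         # (n_b, c) stored probs
    return -np.mean(np.sum(batch_probs * stored, axis=1))
```

As with the neighbor term, perfect one-hot self-agreement gives the minimum value -1, while a uniform prediction is penalized less strongly toward itself.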
The specific operation of calculating the dispersion loss in step S5 is as follows: compute the average classification probability vector p̄ of the randomly selected image data samples x_i^t, and increase the entropy of this vector; the dispersion loss L_div is obtained from this entropy term.
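A sketch of the dispersion loss, under the common reading (an assumption here) that "increasing the entropy of p̄" means minimizing its negative entropy:

```python
import numpy as np

def dispersion_loss(batch_probs, eps=1e-8):
    """Negative entropy of the batch-mean prediction; minimizing it pushes
    the average prediction toward the uniform distribution, so classes with
    few samples are not collapsed away by the classifier."""
    p_bar = batch_probs.mean(axis=0)                  # average prediction
    return float(np.sum(p_bar * np.log(p_bar + eps))) # = -H(p_bar)
```

A batch whose mean prediction is uniform attains the minimum -log C, whereas a batch collapsed onto one class scores near 0, which is exactly the degenerate solution this term penalizes.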
The specific operation of dividing the target domain image data set according to the classification probabilities in step S5 is as follows: before each traversal of the whole target domain image data set, first compute a classification probability vector and a pseudo label for each image with the target domain model; second, divide the images into per-class sets according to their pseudo labels; then compute the information entropy of each classification probability vector and select the r% lowest-entropy images in each class set as high-confidence samples. The set of high-confidence samples is the reliable sample set, and the set of remaining samples is the weakly reliable sample set.
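The entropy-based split can be sketched as follows; `r` corresponds to the patent's r% threshold, expressed here as a fraction, and the "at least one per class" floor is an assumption of this sketch.

```python
import numpy as np

def split_reliable(probs, r=0.3, eps=1e-8):
    """Per pseudo-class, mark the fraction r of lowest-entropy samples as
    reliable; returns a boolean mask (True = reliable sample)."""
    pseudo = probs.argmax(axis=1)                          # pseudo labels
    entropy = -np.sum(probs * np.log(probs + eps), axis=1) # per-sample entropy
    reliable = np.zeros(len(probs), dtype=bool)
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        k = max(1, int(np.ceil(r * len(idx))))             # r% of this class
        reliable[idx[np.argsort(entropy[idx])[:k]]] = True
    return reliable
```

Selecting per class, rather than globally, keeps rare classes represented in the reliable set even when their predictions are on average less confident.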
The specific operation of calculating the cross-view alignment loss in step S5 is as follows: randomly select reliable samples from the reliable sample set and apply weak and strong augmentation to obtain view 1 and view 2 data; then compute the cross-view alignment loss L_cv from the classification probability vector p_i of the target domain data with index i, the classification probability vectors of the view 1 and view 2 data in queue P, and the number of selected reliable samples.
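A probability-level sketch of the cross-view alignment term. The image-space augmentation pipeline is omitted; as with the other terms, the dot-product agreement between the two views' prediction vectors is an assumed instantiation, not the patent's printed formula.

```python
import numpy as np

def cross_view_alignment_loss(probs_weak, probs_strong):
    """Views 1 (weakly augmented) and 2 (strongly augmented) of the same
    reliable sample share a label, so their classification probability
    vectors are pulled together by maximizing their dot product (entered
    as a negative term in the minimized loss)."""
    return -np.mean(np.sum(probs_weak * probs_strong, axis=1))
```

When both views agree on a one-hot prediction the loss is -1 per sample; disagreement or uncertainty raises it toward 0.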
The specific operation of calculating the cross-view neighbor alignment loss in step S5 is as follows: search queue F for the neighbors of the reliable samples, strengthening the leading role of the reliable samples in the domain adaptation process, and compute the cross-view neighbor alignment loss L_cvn from the classification probability vectors of the view 1 and view 2 data in queue P, the weighting coefficient λ, and the number of selected reliable samples.
The specific operation of step S6 is as follows: linearly weight the neighbor alignment loss from step S4 and the regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss from step S5 to obtain the final objective function; iteratively update the target domain model parameters by back-propagating this objective; and use the trained target domain model as the image classification model for the unlabeled target domain data.
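The linear weighting of step S6 can be sketched as follows; the weight values are hypothetical hyperparameters, since the patent does not fix them.

```python
import numpy as np

def total_objective(l_nc, l_self, l_div, l_cv, l_cvn,
                    w=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Final objective: a linear combination of the five losses (neighbor
    alignment, regular, dispersion, cross-view alignment, cross-view
    neighbor alignment). Weights w are hypothetical hyperparameters."""
    return float(np.dot(np.asarray(w),
                        np.array([l_nc, l_self, l_div, l_cv, l_cvn])))
```

In training, this scalar would be the quantity back-propagated through the target domain model at each iteration.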
Other portions of this embodiment are the same as those of embodiment 1 described above, and thus will not be described again.
Example 3:
this embodiment is described in detail with reference to one specific embodiment, as shown in fig. 1, 3 and 4, based on any one of embodiments 1 to 2.
As shown in fig. 1, the method comprises the following steps 1 to 5:

Step 1: acquiring training data. First, a batch of labeled source domain image data is collected, denoted D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s}, where x_i^s represents the i-th data sample, y_i^s represents the label of the i-th sample, the label space of the source domain data contains K categories, and n_s is the total number of source domain samples. A batch of unlabeled target domain image data is then collected, denoted D_t = {x_i^t}_{i=1}^{n_t}.
Deep neural networks have achieved great success in various application fields, at the cost of time-consuming and labor-intensive large-scale data annotation. However, with the explosive growth of multimedia data, manual annotation of all data has become impossible. To avoid high labeling costs, unsupervised domain adaptation methods have been proposed to improve the performance of the model on unlabeled target domain data by utilizing labeled source domain data. However, conventional unsupervised domain adaptation requires simultaneous access to source domain and target domain data, a requirement that cannot always be met in real scenarios. Access may be restricted by storage constraints or data privacy: as the volume of stored multimedia data increases, its transmission is limited by law and by the privacy policies of the data provider.
Therefore, this embodiment targets passive (source-free) domain adaptation: domain adaptation is performed using only target domain data, so as to complete the unlabeled target domain image classification task. The embodiment eliminates the storage and transmission costs of large-scale data, and does not violate data privacy or legal policies. The embodiment is based on two major characteristics of unlabeled target domain data: first, whether or not the target domain data is aligned with the classifier, the target domain data forms clusters in the feature space; second, target domain samples with higher confidence are more reliable, and their classification confidence varies less during domain adaptation. In addition, the embodiment can handle class-imbalanced data that existing methods struggle with, namely data with large differences in sample counts between classes within the same domain and between the same class across different domains.
Step 2: constructing a deep convolution network model and performing the image classification task based on the labeled source domain data; the deep convolution network consists of a feature extractor and a linear classifier; the objective function of the image classification task is to minimize the cross entropy; the trained model is called the source domain model.
The feature extractor of the source domain model is denoted as F_s and the classifier as C_s. The feature of a source domain sample x^s is f^s = F_s(x^s) ∈ R^d, and the classification probability vector of x^s is denoted p^s = δ(C_s(f^s)) ∈ R^K, where δ represents the softmax normalization function, and d and K represent the feature dimension and the number of categories, respectively. The training objective function of the source domain model is to minimize the cross entropy, and can therefore be calculated by the following formula:

L_ce^s = - E_{(x^s, y^s) ∈ D_s} Σ_{k=1}^{K} q_k^{ls} log p_k^s

wherein D_s represents the source domain dataset, x^s and y^s represent a source domain sample and its label, q represents the one-hot vector of label y^s, q^{ls} = (1 - α) q + α / K is the vector obtained by smoothing q with the label-smoothing factor α, and q_k^{ls} and p_k^s represent the k-th elements of the vectors q^{ls} and p^s.
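As a hedged illustration, the label-smoothed cross entropy above can be sketched in Python with numpy; the name of the smoothing factor (`alpha`) is an assumption, since the original symbol is rendered as an image in the patent:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stabilization
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def label_smoothed_ce(logits, labels, num_classes, alpha=0.1):
    """Cross entropy against one-hot labels smoothed by factor alpha."""
    p = softmax(logits)
    q = np.eye(num_classes)[labels]                  # one-hot vectors q
    q_ls = (1.0 - alpha) * q + alpha / num_classes   # smoothed targets q^ls
    return float(-(q_ls * np.log(p + 1e-12)).sum(axis=1).mean())
```

With alpha = 0 this reduces to plain cross entropy; a larger alpha keeps the targets away from exact one-hot vectors.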
Since samples in real-world datasets are not evenly distributed across classes but exhibit varying degrees of class imbalance, the above loss function will inevitably make the model favor classes with large amounts of data and ignore classes with few samples. Since the preference for a class is reflected in the product of the norm of the feature and the norm of the classifier weight vector, we propose a simple approach to solve this class imbalance problem.
The specific solution is: the features extracted by the feature extractor are L2-normalized, and the parameter vectors of the linear classifier are L2-normalized. The calculated classification probability is then not biased toward particular classes, the class boundary division is more balanced, the influence of the distribution difference of class labels within the same domain is reduced, and the influence of the distribution difference of class labels between the source domain and the target domain is also reduced. Taking the source domain data as an example, the classification probability formula is defined as follows:

p_k^s = exp( w_k^T f^s / (τ ||w_k|| ||f^s||) ) / Σ_{j=1}^{K} exp( w_j^T f^s / (τ ||w_j|| ||f^s||) )

where τ is the scaling parameter, w_k is the k-th parameter vector of the linear classifier, w_{y^s} represents the parameter vector corresponding to label y^s, and w_k^T represents the transpose of w_k.
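A minimal numpy sketch of the L2-normalized (cosine) classification probability above; the variable names and the default value of the scaling parameter τ (`tau`) are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

def cosine_classifier_probs(features, weights, tau=0.05):
    """Softmax over cosine similarities between L2-normalized features
    and L2-normalized classifier weight vectors, scaled by 1/tau."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    logits = (f @ w.T) / tau
    logits -= logits.max(axis=1, keepdims=True)  # stabilize softmax
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)
```

Because both features and weights are normalized, rescaling either leaves the probabilities unchanged, which is what removes the magnitude-driven bias toward majority classes.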
Step 3: constructing a deep convolution network model, called a target domain model, based on the source domain model; initializing a target domain model with parameters of the source domain model, and training the target domain model to perform domain adaptation.
Unlike existing unsupervised image classification methods, the present invention targets passive domain adaptation, completing the unlabeled target domain image classification task using only target domain data. The invention eliminates the consumption of large-scale data storage and transmission, and does not violate data privacy or legal policies. Specifically, a new target domain model is constructed with the pre-trained source domain model as its initialization parameters. Likewise, the invention adopts feature L2 normalization and classifier parameter vector L2 normalization to handle class-imbalanced data that existing methods struggle with, namely data with large differences in sample counts between classes within the same domain and between the same class across different domains.
The design of the training objective function of the objective domain model is based on two major characteristics of label-free objective domain data: firstly, whether the target domain data is aligned with the classifier or not, the target domain data forms clusters in a feature space; secondly, the target domain sample with higher confidence is more reliable, and the classification confidence variation is smaller in the domain adaptation process.
Step 4: performing unsupervised learning based on the unlabeled target domain data, and extracting characteristics and classification probability of the target domain data; based on the characteristics of the target domain data, searching for neighbor characteristics for each sample, taking the classification probability of the neighbors as a supervision signal, and calculating neighbor alignment loss to keep the prediction consistency of the neighbors; calculating regular loss and dispersion loss based on the classification probability of the target domain data; dividing a target domain data set based on the classification probability of the target domain data, dividing a high-confidence sample into a reliable set, dividing a low-confidence sample into a weak reliable set, and calculating cross-view alignment loss and cross-view neighbor alignment loss based on the classification probabilities of the high-confidence sample and the low-confidence sample.
Inspired by the above-described cluster structure of the target domain data, a straightforward way to exploit the cluster structure in high-dimensional space is to consider the inherent consistency between points clustered together, meaning that they very likely come from the same class. In order to obtain nearest neighbors from the whole dataset while performing mini-batch stochastic gradient optimization, the invention stores all target features f_i^t and the corresponding classification probability vectors p_i^t in two queues, formulated as follows:

F = [f_1^t, f_2^t, ..., f_{n_t}^t],   P = [p_1^t, p_2^t, ..., p_{n_t}^t]

Notably, the indices of f_i^t, p_i^t, and x_i^t are the same for all target domain samples. At the beginning of each iteration we update the repository, replacing the old feature and classification probability vectors with the new ones calculated for the current mini-batch of data, without any additional computation.
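The two fixed-capacity queues and the by-index update can be sketched as follows; this is a simplified assumption of the mechanism (array-backed rather than a literal FIFO, since entries are replaced in place by sample index):

```python
import numpy as np

class TargetMemory:
    """Stores one feature row and one probability row per target sample."""
    def __init__(self, n_samples, feat_dim, n_classes):
        self.F = np.zeros((n_samples, feat_dim))                   # feature queue F
        self.P = np.full((n_samples, n_classes), 1.0 / n_classes)  # probability queue P

    def update(self, indices, feats, probs):
        # Overwrite only the slots of the current mini-batch: no extra computation.
        self.F[indices] = feats
        self.P[indices] = probs
```

The already-computed mini-batch outputs are simply written back at their sample indices, so maintaining the queues costs no additional forward passes.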
Given an arbitrary target domain sample x_i^t, we use the cosine similarity between f_i^t and the features stored in the queue F to search its nearest m neighbors N_m^i (an index set), where the feature extractor F_t of the target domain model is initialized to the source domain model F_s. To better exploit the manifold of the feature space, we also consider supervision from n neighbors of each neighbor, defined by the expanded index set E_n^i = ∪_{j ∈ N_m^i} N_n^j. In summary, the neighbor consistency loss can be formulated as follows:

L_nc = - (1/n_b) Σ_{i=1}^{n_b} ( Σ_{j ∈ N_m^i} P_j^T p_i^t + λ Σ_{k ∈ E_n^i} P_k^T p_i^t )

wherein n_b is the mini-batch size, p_i^t is the classification probability vector of the current mini-batch data, C_t is the classifier of the target domain model, and P_j and P_k are the classification probability vectors of the neighbors stored in the queue P. λ is a weighting coefficient fixed to 0.1.
To alleviate the influence of potential noise in N_m^i and E_n^i, the regular loss and the dispersion loss are calculated based on the classification probability vectors:

L_self = - (1/n_b) Σ_{i=1}^{n_b} P_i^T p_i^t

L_div = Σ_{k=1}^{K} p̄_k log p̄_k,   with p̄ = (1/n_b) Σ_{i=1}^{n_b} p_i^t

wherein p̄ is the average vector of the classification probability vectors of the mini-batch data, and p̄_k represents the k-th element of the vector p̄. Since P_i in the queue P is numerically identical to the classification probability vector p_i^t of sample x_i^t, the regular loss L_self enhances the certainty of the classification prediction, making it close to a one-hot vector. The dispersion loss L_div avoids all samples being classified into a few specific categories, making the target domain model treat all classes more fairly.
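A sketch of the regular and dispersion losses under the same dot-product/entropy reading (the original formulas are rendered as images, so the exact forms are inferred):

```python
import numpy as np

def regular_loss(p_batch, idx_batch, P):
    """L_self: agreement of each prediction with its own stored copy in P;
    minimizing it sharpens predictions toward one-hot vectors."""
    return -float(np.mean([P[i] @ p for p, i in zip(p_batch, idx_batch)]))

def dispersion_loss(p_batch):
    """L_div: negative entropy of the mean prediction; minimizing it stops
    the model from collapsing all samples onto a few classes."""
    p_bar = p_batch.mean(axis=0)
    return float(np.sum(p_bar * np.log(p_bar + 1e-12)))
```

With a perfectly uniform mean prediction over K classes, L_div reaches its minimum of -log K.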
As previously described, unlabeled target samples with higher classification confidence are more reliable, and their classification confidence varies less during domain adaptation; this is one of the characteristics of unlabeled target domain data. Based on this characteristic, the invention proposes a targeted solution that explores and exploits reliable samples to guide domain adaptation, performed before each training epoch begins. Specifically, our method uses the model trained in the previous epoch (at initialization, the source model) to estimate predictions for the unlabeled target data. The entire dataset is then adaptively divided into a reliable set containing the lowest-entropy r% (0 < r < 100) of samples in each class and a weakly reliable set containing the remaining samples. This division is critical to estimating the true label distribution in the target domain.
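The per-class low-entropy split can be sketched as follows; the per-class rounding of r% is an implementation assumption:

```python
import numpy as np

def split_reliable(probs, r=50.0):
    """Keep the r% lowest-entropy samples of each pseudo-class as the
    reliable set; everything else is the weakly reliable set."""
    pseudo = probs.argmax(axis=1)                          # pseudo labels
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    reliable = []
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        keep = max(1, int(round(r / 100.0 * len(idx))))    # at least one per class
        reliable.extend(idx[np.argsort(entropy[idx])[:keep]])
    reliable = np.array(sorted(reliable))
    weak = np.setdiff1d(np.arange(len(probs)), reliable)
    return reliable, weak
```

Selecting per class, rather than globally, keeps minority classes represented in the reliable set, which is what lets the split track the true label distribution.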
The guiding role of reliable samples is implemented by sampling a batch of reliable samples from the reliable sample set and applying two different random data augmentations to them. One augmentation, called weak augmentation, comprises random image flipping, translation, and cropping; the other, called strong augmentation, comprises random image flipping, translation, cropping, brightness change, partial occlusion, Gaussian blur, and the like. The data obtained from a sample after weak and strong augmentation are called view 1 and view 2, respectively. View 1 and view 2 are naturally related: they should have the same category, and their classification probability vectors should tend to be consistent. Based on this idea, the cross-view alignment loss is calculated as follows:
L_cv = - (1/n_r) Σ_{i=1}^{n_r} (P_i^{v1})^T p_i^{v2}

wherein p_i is as previously defined, P_i^{v1} is the classification probability vector of view 1 stored in the queue, p_i^{v2} represents the classification probability vector of view 2, and n_r indicates the number of reliable samples in the current batch. Although its formula is similar to that of the regular loss, the roles of the two differ critically: strongly augmented samples are not necessarily in the vicinity of the weakly augmented samples, but may be scattered anywhere in the distribution of the category they belong to, so the cross-view alignment loss can globally consider the distribution of the entire category without being affected by local noise. Minimizing the cross-view alignment loss lets reliable samples play a leading role, so that scattered samples gather near reliable samples of the same class, which in turn improves the classification accuracy of the weakly reliable samples.
In addition, the invention further strengthens the guiding role of reliable samples by designing the cross-view neighbor alignment loss. Searching the queue F for the neighbors of a reliable sample is itself reliable, because those neighbors have a very high probability of sharing the reliable sample's class; neighbors that are not in the current batch of data can thus also be utilized, strengthening the leading role of reliable samples in the domain adaptation process. The specific formula is as follows:

L_cvn = - (1/n_r) Σ_{i=1}^{n_r} ( Σ_{j ∈ N_m^i} P_j^T p_i^{v2} + λ Σ_{k ∈ E_n^i} P_k^T p_i^{v2} )

wherein all symbols are as previously defined.
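Reading the cross-view losses as cross-view analogues of the regular and neighbor losses (an inference, since the patent's own formulas are rendered as images), a numpy sketch:

```python
import numpy as np

def cross_view_alignment_loss(P_view1, p_view2):
    """L_cv: agreement between queued view-1 predictions and current
    view-2 predictions of the n_r reliable samples."""
    return -float(np.mean(np.sum(P_view1 * p_view2, axis=1)))

def cross_view_neighbor_loss(p_view2, idx_batch, F, P, m=2, lam=0.1):
    """L_cvn: additionally align each strong view with the queued
    predictions of the reliable sample's m nearest neighbors in F."""
    Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
    sim = Fn @ Fn.T
    np.fill_diagonal(sim, -np.inf)
    nn = np.argsort(-sim, axis=1)[:, :m]
    loss = 0.0
    for row, i in enumerate(idx_batch):
        for j in nn[i]:
            loss -= lam * (P[j] @ p_view2[row])
    return loss / len(idx_batch)
```

Both losses reuse the queues, so neighbors outside the current batch still contribute supervision, as the text above describes.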
Step 5: calculating a final objective function as a linear weighting of the neighbor alignment loss, the regular loss, the dispersion loss, the cross-view alignment loss, and the cross-view neighbor alignment loss:

L_total = L_nc + L_self + L_div + L_cv + L_cvn

The target domain model is iteratively updated based on the objective function, and the target domain model obtained by the final training serves as the image classification model for the unlabeled target domain data.
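The linear weighting described above reduces to a one-line combination; the weight values are hyperparameters (the defaults shown are placeholders, not taken from the source):

```python
def total_objective(l_nc, l_self, l_div, l_cv, l_cvn,
                    weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Final objective: linear combination of the five losses."""
    a, b, c, d, e = weights
    return a * l_nc + b * l_self + c * l_div + d * l_cv + e * l_cvn
```

The gradient of this scalar with respect to the target domain model parameters drives each back-propagation step.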
Other portions of this embodiment are the same as any of embodiments 1-2 described above, and thus will not be described again.
Example 4:
This example was verified experimentally on the basis of any one of examples 1 to 3 described above. The embodiment adopts six mainstream datasets in the domain adaptation field as training and testing datasets, including three mainstream datasets for the passive domain adaptation task: Office, Office-Home, and VisDA-C, and three for the class-imbalanced passive domain adaptation task: Office-Home (RSUT), VisDA-C (RSUT), and DomainNet. Office is a small-scale dataset containing three domains, Amazon (A), Webcam (W), and DSLR (D), with 4652 pictures in 31 classes. Office-Home is a medium-scale dataset containing four domains, Artistic (Ar), Product (Pr), Clipart (Cl), and Real World (Rw), with 15500 pictures in 65 classes. VisDA-C is a challenging large-scale dataset whose source (synthetic) and target (real) domains contain 152k and 55k images, respectively, in 12 classes.
For the class-imbalanced passive domain adaptation task, VisDA-C (RSUT) is a class-imbalanced version of VisDA-C. The severity of class imbalance in VisDA-C (RSUT) is controlled by the imbalance factor ρ = N_max / N_min, where N_max and N_min denote the number of samples of the class with the most samples and of the class with the fewest samples, respectively. The imbalance factor ρ may be 10, 50, or 100. Office-Home (RSUT) is an imbalanced version of Office-Home, in which the Artistic domain is excluded because it has too few images to create an imbalanced subset. DomainNet contains four domains, Real (R), Clipart (C), Painting (P), and Sketch (S), and 40 classes. The data distributions of the three datasets VisDA-C (RSUT), Office-Home (RSUT), and DomainNet all exhibit intra-domain class imbalance and inter-domain label shift.
For fair comparison, for passive domain adaptation tasks, office and Office-Home use traditional classification accuracy as an index, while VisDA-C uses average classification accuracy calculated by class; for the class-unbalanced passive domain adaptation task, the three data sets all use the average classification accuracy calculated by class as an index.
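The class-averaged metric used for VisDA-C and the imbalanced benchmarks differs from plain accuracy in that every class contributes equally; a sketch:

```python
import numpy as np

def per_class_average_accuracy(y_true, y_pred):
    """Mean of per-class accuracies: each class contributes equally,
    regardless of how many samples it has."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accs = [float(np.mean(y_pred[y_true == c] == c)) for c in np.unique(y_true)]
    return float(np.mean(accs))
```

On imbalanced data this metric penalizes a model that ignores minority classes, whereas overall accuracy can still look high.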
Further, the system claimed in this embodiment is denoted ICPR, and the compared domain adaptation methods are widely used state-of-the-art methods. Experimental results are shown in Tables 1 to 8; the ICPR of this example exhibits consistent and significant improvement over all existing methods. Specifically, ICPR outperforms A2Net (73.1% vs. 72.8%) on the medium-scale Office-Home. On the large-scale VisDA-C, ICPR shows a significant improvement over the compared methods, with absolute gains of up to +3.0% over NRC and +1.6% over SHOT++. From the comparison on VisDA-C (RSUT) and DomainNet, we can see that the advantages of ICPR remain even when the training data contain severe intra-domain class imbalance and inter-domain label shift. For example, ICPR achieves +2.87% and +1.96% average classification accuracy improvements over ISFDA on VisDA-C (RSUT) and DomainNet, respectively.
Comparison in passive domain adaptation and unbalanced-like passive domain adaptation tasks demonstrates the effectiveness of ICPR due to the various consistent alignment strategies and reliable sample mining strategies proposed by the present invention.
Experimental results are reported in Tables 1 to 8; the table contents are rendered as images in the original publication.

Table 1: comparison result 1 of classification accuracy on the Office-Home dataset for the passive domain adaptation task
Table 2: comparison result 2 of classification accuracy on the Office-Home dataset for the passive domain adaptation task
Table 3: comparison result 1 on the VisDA-C dataset for the passive domain adaptation task
Table 4: comparison result 2 on the VisDA-C dataset for the passive domain adaptation task
Table 5: comparison result 1 on the VisDA-C (RSUT) dataset for the class-imbalanced passive domain adaptation task
Table 6: comparison result 2 on the VisDA-C (RSUT) dataset for the class-imbalanced passive domain adaptation task
Table 7: comparison result 1 on the DomainNet dataset for the class-imbalanced passive domain adaptation task
Table 8: comparison result 2 on the DomainNet dataset for the class-imbalanced passive domain adaptation task
Other portions of this embodiment are the same as any of embodiments 1 to 3 described above, and thus will not be described again.
Example 5:
this embodiment proposes an unsupervised domain adaptive classification system as shown in fig. 2 on the basis of any one of the above embodiments 1 to 4.
The unsupervised domain adaptation classification system comprises: the imaging unit, used for acquiring image samples in different fields; the data storage unit, used for storing the image samples in the different fields; a neural network unit comprising a source domain model trained with labeled source domain data and a target domain model trained with unlabeled target domain data; and the data processing unit, used for training a source domain model based on the labeled source domain data, constructing a target domain model based on the source domain model, extracting features and classification probability vectors of the unlabeled target domain data based on the target domain model, calculating the neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss based on the features and the classification probability vectors, calculating a final objective function based on all the losses, iteratively updating the target domain model parameters by back propagation based on the objective function, and taking the finally trained target domain model as the image classification model for the unlabeled target domain data.
Further, the data processing unit calculates neighbor alignment loss based on the characteristics of the target domain data, calculates regular loss and scatter loss based on the classification probability of the target domain data, calculates cross-view alignment loss and cross-view neighbor alignment loss based on the classification probability of the high-confidence sample and the low-confidence sample, weights the neighbor alignment loss, the regular loss, the scatter loss, the cross-view alignment loss and the cross-view neighbor alignment loss based on a linear weighting mode, and calculates a final objective function for updating the target domain model.
The invention provides an unsupervised field adaptive classification system. In the description, each embodiment is described in a progressive manner, and each embodiment is mainly described by the differences from other embodiments, so that the same similar parts among the embodiments are mutually referred. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Other portions of this embodiment are the same as any of embodiments 1 to 4 described above, and thus will not be described again.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and any simple modification, equivalent variation, etc. of the above embodiment according to the technical matter of the present invention fall within the scope of the present invention.

Claims (8)

1. An unsupervised domain adaptation classification method is characterized by comprising the following steps:
step S1: collecting labeled source domain image data and unlabeled target domain image data, and taking the collected image data as training data;
step S2: establishing a deep convolution network model comprising a feature extractor and a linear classifier, performing image classification on the labeled source domain image data, and training according to training data to obtain a source domain model;
Step S3: establishing a target domain model according to the source domain model, and initializing the target domain model by using parameters of the source domain model;
step S4: extracting features of image data of a target domain, searching for neighbor features of each image data according to the extracted features, and calculating neighbor alignment loss by taking classification probability of the neighbor features as a supervision signal;
step S5: extracting the classification probability of the target domain image data, calculating regular loss and scattered loss according to the classification probability of the target domain image data, dividing the target domain image data set according to the classification probability, dividing the high-confidence image data into a reliable set, dividing the low-confidence image data into a weak reliable set, and calculating cross-view alignment loss and cross-view neighbor alignment loss according to the classification probability of the high-confidence sample and the low-confidence sample;
step S6: iteratively updating the target domain model according to the neighbor alignment loss obtained in the step S4 and the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss obtained in the step S5, and taking the target domain model obtained through training as an image classification model of the label-free target domain image data;
the extraction mode for extracting the characteristics and the classification probability of the target domain image data is as follows: establishing a feature queue F and a classification probability vector queue P with fixed capacity, randomly sampling image data in each training iteration of a target domain model, calculating new features and new classification probabilities of the sampled image data, and putting the new features and the new classification probabilities into corresponding positions of the queue according to indexes of the sampled image data;
The specific operation of calculating the neighbor alignment loss in step S4 is as follows: randomly selecting an image data sample x_i^t in the target domain, searching the feature queue F for the m cosine nearest neighbors of x_i^t based on its feature f_i^t, with N_m^i representing the index set storing the neighbors, and calculating the neighbor alignment loss L_nc according to the classification probability vector p_i^t of the target domain data sample x_i^t, the weighting coefficient λ, the number n_b of randomly sampled target domain data samples, and the classification probability vectors P_j, P_k in the queue P of the neighbors and of the n neighbors of each neighbor;
The specific operation of calculating the regular loss in step S5 is as follows: minimizing the difference between the classification probability p_i^t of a randomly selected image data sample x_i^t and the corresponding classification probability P_i stored in the queue P, and calculating the regular loss L_self over the number n_b of randomly sampled image data samples;
The specific operation of calculating the dispersion loss in step S5 is as follows: calculating the average classification probability vector p̄ of the randomly selected image data samples, increasing the entropy of the average classification probability vector p̄, and calculating the dispersion loss L_div according to the entropy-increased average classification probability vector p̄;
The specific operation of dividing the target domain image dataset according to the classification probability in the step S5 is as follows: before traversing the whole target domain image data set each time, firstly calculating a classification probability vector and a pseudo tag of each image data according to a target domain model, secondly dividing the image data into sets corresponding to various categories according to the pseudo tag, then calculating information entropy of the classification probability vector, selecting the first r% low-entropy image data in each category set as high-confidence samples, taking a set formed by the high-confidence samples as a reliable sample set, and taking a set formed by the rest samples as a weak reliable sample set.
2. The method of unsupervised domain adaptive classification as claimed in claim 1, wherein the specific operation of calculating the cross-view alignment loss in step S5 is: randomly selecting a reliable sample from the reliable sample set, performing weak enhancement and strong enhancement on the reliable sample to obtain view 1 data and view 2 data, and calculating the cross-view alignment loss L_cv according to the classification probability vector p_i corresponding to the target domain data with index i, the classification probability vector of the view 1 data in the queue P, the classification probability vector of the view 2 data in the queue P, and the number of the selected reliable samples.
3. The unsupervised domain adaptive classification method according to claim 1, wherein the specific operation of calculating the cross-view neighbor alignment loss in step S5 is as follows: searching the queue F for the neighbors of the reliable samples, strengthening the leading role of the reliable samples in the domain adaptation process, and calculating the cross-view neighbor alignment loss L_cvn according to the classification probability vector of the view 1 data in the queue P, the classification probability vector of the view 2 data in the queue P, the weighting coefficient λ, and the number of the selected reliable samples.
4. The method of unsupervised domain adaptive classification as claimed in claim 1, wherein the specific operation of step S6 is as follows: linearly weighting the neighbor alignment loss obtained in step S4 and the regular loss, the dispersion loss, the cross-view alignment loss, and the cross-view neighbor alignment loss obtained in step S5 to obtain a final objective function, iteratively updating the target domain model parameters by back propagation according to the final objective function, and taking the target domain model obtained through training as the image classification model for the unlabeled target domain data.
5. An unsupervised domain adaptive classification method as claimed in claim 1, wherein the features extracted by the feature extractor in the step S2 are subjected to L2 regularization, and the parameter vectors of the linear classifier are subjected to L2 regularization.
6. An unsupervised domain adaptation classification system, characterized by comprising an imaging unit, a data storage unit, a neural network unit and a data processing unit;
the imaging unit is used for acquiring image samples from different domains;
the data storage unit is used for storing the image samples from different domains;
the neural network unit comprises a source domain model trained by labeled source domain data and a target domain model trained by unlabeled target domain data;
the data processing unit trains the source domain model on the labeled source domain data, builds the target domain model from the source domain model, extracts the features and classification probability vectors of the unlabeled target domain data with the target domain model, calculates the neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss from these features and classification probability vectors, combines all the losses into the final objective function, iteratively updates the target domain model parameters by back-propagating the objective function, and uses the finally trained target domain model as the image classification model for the unlabeled target domain data.
7. An electronic device, comprising a processor and a memory; the memory is used for storing a computer program;
the processor is configured, when executing the computer program, to implement the unsupervised domain adaptive classification method according to any one of claims 1-5.
8. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the unsupervised domain adaptive classification method according to any one of claims 1-5.
CN202211528066.9A 2022-12-01 2022-12-01 Unsupervised domain adaptive classification method, system, equipment and storage medium Active CN115546567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211528066.9A CN115546567B (en) 2022-12-01 2022-12-01 Unsupervised domain adaptive classification method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115546567A CN115546567A (en) 2022-12-30
CN115546567B true CN115546567B (en) 2023-04-28

Family

ID=84722064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211528066.9A Active CN115546567B (en) 2022-12-01 2022-12-01 Unsupervised domain adaptive classification method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115546567B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858557B (en) * 2019-02-13 2022-12-27 安徽大学 Novel semi-supervised classification method for hyperspectral image data
CN110717526B (en) * 2019-09-23 2023-06-02 华南理工大学 Unsupervised migration learning method based on graph convolution network
CN110837850B (en) * 2019-10-23 2022-06-21 浙江大学 Unsupervised domain adaptation method based on counterstudy loss function
US11694042B2 (en) * 2020-06-16 2023-07-04 Baidu Usa Llc Cross-lingual unsupervised classification with multi-view transfer learning
CN113553906A (en) * 2021-06-16 2021-10-26 之江实验室 Method for discriminating unsupervised cross-domain pedestrian re-identification based on class center domain alignment
CN114332568B (en) * 2022-03-16 2022-07-15 中国科学技术大学 Training method, system, equipment and storage medium of domain adaptive image classification network
CN114863175A (en) * 2022-05-10 2022-08-05 南京信息工程大学 Unsupervised multi-source partial domain adaptive image classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant