CN115546567A - Unsupervised field adaptive classification method, system, equipment and storage medium
- Publication number: CN115546567A
- Application number: CN202211528066.9A
- Authority
- CN
- China
- Prior art keywords
- data
- target domain
- classification
- loss
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the field of deep learning, and in particular to an unsupervised domain-adaptive classification method, system, device, and storage medium. The method builds a deep convolutional network model and trains a source-domain model on labeled source-domain data; a target-domain model is then built from the source-domain model and trained to perform domain adaptation. Unsupervised learning is carried out on unlabeled target-domain data: the features and classification probabilities of the target-domain data are extracted, and the image classification ability of the target-domain model is iteratively improved by calculating a neighbor alignment loss, a regularization loss, a dispersion loss, a cross-view alignment loss, and a cross-view neighbor alignment loss. This eliminates the misclassification caused by domain shift and class imbalance, thereby realizing unsupervised domain-adaptive image classification.
Description
Technical Field
The invention relates to the field of deep learning, and in particular to an unsupervised domain-adaptive classification method, system, device, and storage medium.
Background
Deep neural networks have become the benchmark models for a variety of tasks, at the expense of time-consuming and labor-intensive large-scale data annotation. However, with the rapid development of digital devices and online applications, manually labeling the ever-expanding volume of multimedia data has become impossible. To avoid high labeling costs, unsupervised domain adaptation was developed to improve a model's performance on unlabeled target-domain data using previously labeled source-domain data.
Mainstream unsupervised domain adaptation methods learn domain-invariant features through moment matching or adversarial training. Although unsupervised domain adaptation has recently made promising progress, in real-world scenarios the assumption of simultaneous access to source-domain and target-domain data may not hold. The access limitation may stem from storage constraints or data-privacy restrictions, especially as the amount of stored multimedia data keeps growing and its transmission is limited by law and by the privacy policies of data providers. Accordingly, this document is directed to passive (source-free) domain adaptation, in which only target-domain data are used to adapt a pre-trained source-domain model. Passive domain adaptation does not need access to the source-domain data once the source model is trained, which eliminates the storage and transmission cost of large-scale data and does not violate data constraints. However, existing passive domain adaptation does not consider two major characteristics of unlabeled target-domain data: first, target-domain data form clusters in the feature space whether or not they are aligned with the classifier; second, target-domain samples with higher confidence are more reliable, and their classification confidence changes less during domain adaptation. In addition, existing passive domain adaptation methods struggle with class-imbalanced data, i.e., data whose per-class sample counts differ greatly within a domain and whose class proportions differ greatly between domains. For example, source-domain classes 1 and 2 may have 1000 and 10 samples, respectively, while in the target domain classes 1 and 2 have 10 and 1000 samples, respectively.
In summary, existing domain-adaptive classification methods neither consider these two characteristics of unlabeled target-domain data nor handle class-imbalanced data well.
Disclosure of Invention
Aiming at the unrealistic application setting and the difficulty in handling class-imbalanced data of existing unsupervised domain adaptation methods, the invention provides an unsupervised domain-adaptive classification method, system, device, and storage medium.
The specific implementation content of the invention is as follows:
an unsupervised domain adaptive classification method comprises the following steps:
step S1: acquiring training data: collect a batch of labeled source-domain image data and a batch of unlabeled target-domain image data;
step S2: constructing a deep convolutional network model and performing an image classification task based on the labeled source-domain data, wherein the deep convolutional network consists of a feature extractor and a linear classifier; the objective function of the image classification task is minimum cross entropy, and the trained model is called the source-domain model;
step S3: constructing a deep convolutional network model, called the target-domain model, based on the source-domain model; initializing the target-domain model with the parameters of the source-domain model, and training the target-domain model to perform domain adaptation;
step S4: performing unsupervised learning based on the unlabeled target-domain data: extract the features and classification probabilities of the target-domain data; calculate the neighbor alignment loss based on the features of the target-domain data; calculate the regularization loss and the dispersion loss based on the classification probabilities of the target-domain data; divide the target-domain data set based on the classification probabilities, placing high-confidence samples into a reliable set and low-confidence samples into a weakly reliable set; and calculate the cross-view alignment loss and the cross-view neighbor alignment loss based on the classification probabilities of the high-confidence and low-confidence samples;
step S5: iteratively updating the target-domain model based on the neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss; the trained target-domain model serves as the image classification model for the unlabeled target-domain data.
In order to better implement the method, the constructed source-domain and target-domain convolutional networks each consist of a feature extractor and a linear classifier. The features extracted by the feature extractor are L2-normalized, and the parameter vectors of the linear classifier are likewise L2-normalized; the computed classification probabilities then cannot be biased toward a specific class, class boundaries are divided more evenly, and the influence of label-distribution differences both within a domain and between the source and target domains is reduced.
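The effect of this normalization can be sketched in a few lines of numpy (an illustrative sketch, not code from the patent; all variable and function names are hypothetical). With both the features and the classifier weight vectors L2-normalized, each logit becomes a cosine similarity, so no class can dominate merely because its weight vector has a large magnitude:

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale each row vector to unit L2 norm."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 16))   # a batch of extracted features
weights = rng.normal(size=(3, 16))    # linear classifier weights, 3 classes
weights[0] *= 10.0                    # class 0 has a much larger weight vector

# With L2 normalization, every logit is a cosine similarity in [-1, 1],
# so the inflated class-0 weight vector no longer biases the probabilities.
logits = l2_normalize(features) @ l2_normalize(weights).T
probs = softmax(logits)
```

Without the normalization, the tenfold-larger class-0 weight vector would dominate every logit; with it, the logits stay bounded regardless of weight magnitude.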
In order to better implement the invention, further, a deep convolutional network is constructed as the target-domain model based on the source-domain model; the target-domain model is initialized with the parameters of the source-domain model, and the target-domain model is trained to perform domain adaptation. Unlike conventional unsupervised domain-adaptive image classification methods, the source-domain model and the target-domain model are trained on data from different domains, and the source-domain model serves only as the initialization of the target-domain model; previous methods need to access the labeled source-domain data and the unlabeled target-domain data at the same time and train the target-domain model on both.
In order to better implement the invention, further, unsupervised learning is performed on the unlabeled target-domain data: the features and classification probabilities of the target-domain data are extracted, and two fixed-capacity queues are established, a feature queue and a classification-probability-vector queue. The two queues are updated by randomly sampling a batch of data in each training iteration of the target-domain model, computing new features and classification probabilities for the batch, and placing them in the queue positions given by the indexes of the batch data.
In order to better implement the invention, further, based on the features of the target-domain data, the neighbor features of each sample are searched, and the neighbor alignment loss is calculated with the neighbors' classification probabilities as the supervision signal. The neighbor alignment loss exploits the phenomenon that similar samples tend to aggregate into clusters in the feature space, keeping the classification probability of each sample consistent with those of its neighbors so as to maintain and improve the aggregation and classification accuracy of similar samples.
In order to better implement the present invention, further, a regularization loss is calculated based on the classification probabilities of the target-domain data; the regularization loss aims to eliminate the influence of potential noise in the neighbor alignment loss and to strengthen the certainty of each sample's classification probability.
To better implement the present invention, further, a dispersion loss is calculated based on the classification probabilities of the target-domain data; the dispersion loss prevents the classifier from trivially assigning all samples to fixed classes, keeps classes with few samples from being ignored entirely, and helps improve classification accuracy on long-tailed class distributions.
In order to better implement the invention, further, the target-domain data set is divided based on the classification probabilities of the target-domain data: high-confidence samples are placed in a reliable set and low-confidence samples in a weakly reliable set. The purpose of this division is to exploit the value of the reliable samples and let them play a leading role in the domain adaptation process; high-confidence samples are defined as reliable because their classification results are more accurate and their confidence varies less during domain adaptation. Before each traversal of the whole target-domain data set, a classification probability vector and a pseudo label are first computed for each sample by the target-domain model; the samples are then divided into per-category sets according to their pseudo labels; next, the information entropy of each classification probability vector is computed, and since lower entropy means higher classification confidence, the low-entropy samples in each category set are selected as high-confidence samples. The set formed by the high-confidence samples is the reliable sample set, and the set formed by the remaining samples is the weakly reliable sample set.
To better implement the present invention, further, the cross-view alignment loss is calculated based on the classification probabilities of the reliable and weakly reliable samples. It is realized by sampling a batch of reliable samples from the reliable sample set and applying two different random data enhancements to them: one, called weak enhancement, comprises random image flipping, translation, and cropping; the other, called strong enhancement, comprises random image flipping, translation, cropping, brightness change, partial occlusion, Gaussian blur, and the like. The data obtained from a sample after weak and strong enhancement are called view 1 and view 2, respectively. View 1 and view 2 have a natural correspondence: they should share the same category, and their classification probability vectors should tend to agree. A strongly enhanced sample is not necessarily in the vicinity of the weakly enhanced sample and may be scattered anywhere in the distribution of its class, so the cross-view alignment loss considers the distribution of the whole class globally and is unaffected by local noise. Minimizing the cross-view alignment loss lets the reliable samples play a leading role, gathering the scattered samples near reliable samples of the same class and improving the classification accuracy of the weakly reliable samples.
To better implement the present invention, further, the cross-view neighbor alignment loss is calculated based on the classification probabilities of the reliable and weakly reliable samples; it aims to further strengthen the leading role of the reliable samples. It is realized by searching the feature queue for the neighbors of the reliable samples: since a reliable sample's neighbor is very likely to be reliable as well, neighbors that are not in the current batch of data can also be exploited, reinforcing the leading role of the reliable samples in the domain adaptation process.
In order to better implement the method, further, the neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss are linearly weighted to obtain the final objective function; the target-domain model parameters are updated by iterative back-propagation of this objective; the finally trained target-domain model serves as the image classification model for the unlabeled target-domain data.
To better implement the present invention, further, using the target-domain model as the image classification model for unlabeled target-domain data includes: for a target-domain image to be classified, the target-domain model extracts its classification probability vector, and the category corresponding to the maximum value in that vector is taken as the predicted category of the image to be classified.
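A minimal sketch of this inference step (illustrative; the function name is an assumption, not from the patent):

```python
import numpy as np

def predict_class(prob_vector):
    """Predicted category = index of the maximum classification probability."""
    return int(np.argmax(prob_vector))

# Example: a 4-class classification probability vector for one image.
probs = np.array([0.05, 0.70, 0.15, 0.10])
label = predict_class(probs)
```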
Based on the unsupervised field adaptive classification method, in order to better realize the invention, further, an unsupervised field adaptive classification system is provided, which comprises an imaging unit, a data storage unit, a neural network unit and a data processing unit;
the imaging unit is used for acquiring image samples in different fields;
the data storage unit is used for storing the image samples in different fields;
the neural network unit comprises a source domain model trained by the labeled source domain data and a target domain model trained by the unlabeled target domain data;
the data processing unit trains a source domain model based on the labeled source domain data, constructs a target domain model based on the source domain model, extracts features and classification probability vectors of label-free target domain data based on the target domain model, calculates neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss based on the features and the classification probability vectors, calculates a final target function based on all the losses, updates target domain model parameters based on back propagation of target function iteration, and takes the target domain model obtained based on final training as an image classification model of the label-free target domain data.
Based on the above proposed unsupervised domain-adaptive classification method, in order to better implement the present invention, further, an apparatus is proposed, which comprises a processor and a memory; the memory is used for storing a computer program;
the processor is configured to implement the unsupervised domain adaptive classification method when executing the computer program.
Based on the above-mentioned unsupervised domain adaptive classification method, in order to better implement the present invention, further, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the above-mentioned unsupervised domain adaptive classification method.
The invention has the following beneficial effects:
(1) The invention iteratively improves the image classification ability of the target-domain model by calculating the neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss, solving the misclassification caused by domain shift and class imbalance and realizing unsupervised domain-adaptive image classification.
(2) When data from a new target domain are encountered, the invention does not need to re-acquire source-domain data and retrain as prior methods do; training can be completed based on the unlabeled target-domain data alone. This suits the current practical requirements of data-transmission restriction and data-privacy protection and plays an important role in the practical application of image classification methods.
(3) By calculating the neighbor alignment loss, the invention keeps the classification probability of each sample consistent with those of its neighbors, maintaining and improving the aggregation and classification accuracy of similar samples; by calculating the regularization loss, it eliminates the influence of potential noise in the neighbor alignment loss and strengthens the confidence of each sample's classification probability; by calculating the dispersion loss, it prevents the classifier from trivially assigning all samples to fixed classes, keeps classes with few samples from being ignored entirely, and improves classification accuracy on long-tailed class distributions; and by calculating the cross-view neighbor alignment loss, it further strengthens the leading role of the reliable samples.
Drawings
FIG. 1 is a simplified flow diagram of the method of the present invention;
FIG. 2 is a simplified block diagram of the connection of the elements of the system of the present invention;
FIG. 3 is a simplified flow diagram of a source domain model of a preferred embodiment of the present invention;
FIG. 4 is a simplified flowchart of a target domain model of a preferred embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the technical solutions in the embodiments will be described clearly and completely below with reference to the drawings. It should be understood that the described embodiments are only a part of the embodiments of the present invention, not all of them, and therefore should not be considered as limiting the scope of protection. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in a specific case to those of ordinary skill in the art.
Example 1:
the embodiment provides an unsupervised field adaptive classification method, which comprises the following steps:
step S1: collecting image data of a labeled source domain and image data of an unlabeled target domain, and taking the collected image data as training data;
step S2: establishing a deep convolution network model comprising a feature extractor and a linear classifier, carrying out image classification on the source domain image data with the labels, and training according to training data to obtain a source domain model;
and step S3: establishing a target domain model according to the source domain model, and initializing the target domain model by using parameters of the source domain model;
and step S4: extracting the features of the image data of the target domain, searching the neighbor features of each image data according to the extracted features, and calculating the neighbor alignment loss by taking the classification probability of the neighbors as a supervision signal;
step S5: extracting the classification probability of the target domain image data, calculating the regularization loss and the dispersion loss according to the classification probability of the target domain image data, dividing a target domain image data set according to the classification probability, dividing high-confidence image data into a reliable set, dividing low-confidence image data into a weak reliable set, and calculating the cross-view alignment loss and the cross-view neighbor alignment loss according to the classification probabilities of high-confidence samples and low-confidence samples;
step S6: and (5) according to the neighbor alignment loss obtained in the step (S4) and the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss obtained in the step (S5), iteratively updating the target domain model, and taking the trained target domain model as an image classification model of the image data of the label-free target domain.
The working principle is as follows: in this embodiment, a deep convolutional network model is constructed and a source-domain model is trained on the labeled source-domain data; a target-domain model is constructed from the source-domain model and trained to perform domain adaptation; unsupervised learning is performed on the unlabeled target-domain data, extracting the features and classification probabilities of the target-domain data; and the image classification ability of the target-domain model is iteratively improved by calculating the neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss, solving the misclassification caused by domain shift and class imbalance and thereby realizing unsupervised domain-adaptive image classification.
Example 2:
this example describes the procedure of example 1, in addition to example 1 described above.
The extraction method for extracting the features and the classification probability of the target domain image data in the step 4 and the step 5 is as follows: establishing a feature queue F and a classification probability vector queue P with fixed capacity, randomly sampling image data in each training iteration of a target domain model, calculating new features and new classification probabilities of the sampled image data, and putting the new features and the new classification probabilities into corresponding positions of the queues according to indexes of the sampled image data.
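The two fixed-capacity queues can be sketched as index-addressed arrays (an illustrative sketch under stated assumptions, not the patent's implementation; class and variable names are hypothetical):

```python
import numpy as np

class MemoryQueues:
    """Fixed-capacity feature queue F and classification-probability queue P,
    addressed by sample index so each slot holds that sample's freshest
    feature and probability vector."""
    def __init__(self, num_samples, feat_dim, num_classes):
        self.F = np.zeros((num_samples, feat_dim))
        self.P = np.full((num_samples, num_classes), 1.0 / num_classes)

    def update(self, indices, feats, probs):
        # Write the batch's new features and probabilities into the queue
        # positions given by the indexes of the sampled batch.
        self.F[indices] = feats
        self.P[indices] = probs

q = MemoryQueues(num_samples=6, feat_dim=4, num_classes=3)
q.update(np.array([0, 2]),
         np.ones((2, 4)),
         np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```

Indexing the queues by sample id (rather than FIFO rotation) means a sample's stored feature and probability are overwritten only when that sample is re-drawn, matching the update rule described above.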
The specific operation of calculating the neighbor alignment loss in step S4 is as follows: randomly select a target-domain image data sample x_i^t; using the feature f_i^t of x_i^t, search the feature queue F for the m nearest neighbors of x_i^t under cosine similarity, with N_i^m denoting the index set of the stored neighbors; based on the classification probability vector p_i^t of the target-domain data sample x_i^t, a weighting coefficient λ, the number n_b of randomly sampled target-domain data samples, the n neighbors of each neighbor, and the probability vectors P_j and P_k in the queue P, calculate the neighbor alignment loss L_nc.
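One plausible numpy sketch of such a loss (hedged: the patent text does not reproduce the exact formula; this version simply maximizes the dot-product agreement between each sample's probability vector and those of its m cosine-nearest neighbors in the queues, and omits the weighting coefficient λ and the neighbors-of-neighbors term):

```python
import numpy as np

def l2n(v, eps=1e-12):
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def neighbor_alignment_loss(f_batch, p_batch, F_queue, P_queue, m=3):
    """Pull each sample's classification probabilities toward those of its
    m nearest neighbors (cosine similarity) in the feature queue."""
    sim = l2n(f_batch) @ l2n(F_queue).T          # cosine similarity to queue
    nbr_idx = np.argsort(-sim, axis=1)[:, :m]    # m nearest-neighbor indices
    # Negative mean dot product between p_i and each neighbor's stored
    # probabilities: minimizing it makes neighboring predictions agree.
    return -np.mean(np.sum(p_batch[:, None, :] * P_queue[nbr_idx], axis=-1))
```

With m=1 and a queue identical to the batch, each sample's nearest neighbor is itself, so for one-hot probabilities the loss reaches its minimum of -1.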
The specific operation of calculating the regularization loss in step S5 is: minimize the discrepancy between the classification probability p_i^t of a randomly selected image data sample x_i^t and the classification probability P_i stored for that sample in the queue P; averaging the minimized terms over the n_b randomly sampled image data samples x_i^t yields the regularization loss L_self.
The specific operation of calculating the dispersion loss in step S5 is: compute the average classification probability vector of the randomly selected image data samples and increase its entropy; the dispersion loss L_div is calculated from the average classification probability after the entropy increase.
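Under a common formulation (an assumption; the exact formula is not given in this text), the dispersion loss is the negative entropy of the batch-mean prediction, so minimizing it pushes the average prediction toward the uniform distribution and keeps rare classes from being ignored:

```python
import numpy as np

def dispersion_loss(p_batch, eps=1e-12):
    """Negative entropy of the mean classification probability vector."""
    p_mean = p_batch.mean(axis=0)
    return float(np.sum(p_mean * np.log(p_mean + eps)))

# A classifier that collapses everything into class 0 is penalized more
# than one whose average prediction is spread over all classes.
collapsed = np.tile([1.0, 0.0, 0.0], (4, 1))
balanced = np.tile([1.0 / 3, 1.0 / 3, 1.0 / 3], (4, 1))
```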
The specific operation of dividing the target domain image data set according to the classification probability in the step S5 is: before traversing the whole target domain image data set every time, firstly calculating a classification probability vector and a pseudo label of each image data according to a target domain model, secondly dividing the image data into sets corresponding to various categories according to the pseudo labels, then calculating the information entropy of the classification probability vector, selecting the first r% low-entropy image data in each category set as high-confidence-degree samples, taking a set formed by the high-confidence-degree samples as a reliable sample set, and taking a set formed by the rest samples as a weak-reliability sample set.
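The per-class low-entropy selection can be sketched as follows (illustrative; the ratio r and all names are assumptions, and the patent's r% threshold is a hyperparameter):

```python
import numpy as np

def split_reliable(probs, r=0.5, eps=1e-12):
    """Per pseudo-class, mark the lowest-entropy r fraction of samples as
    reliable; everything else forms the weakly reliable set."""
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    pseudo = np.argmax(probs, axis=1)            # pseudo labels
    reliable = set()
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        k = max(1, int(np.ceil(r * len(idx))))   # keep at least one sample
        reliable.update(idx[np.argsort(entropy[idx])[:k]].tolist())
    weak = set(range(len(probs))) - reliable
    return reliable, weak

probs = np.array([[0.97, 0.03],   # confidently class 0 -> low entropy
                  [0.55, 0.45],   # barely class 0      -> high entropy
                  [0.10, 0.90]])  # confidently class 1
reliable, weak = split_reliable(probs, r=0.5)
```

Selecting per pseudo-class (rather than globally) keeps every class represented in the reliable set even when classes differ in average confidence.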
The specific operation of calculating the cross-view alignment loss in step S5 is: randomly select reliable samples from the reliable sample set and apply weak enhancement and strong enhancement to them to obtain view-1 data and view-2 data; based on the classification probability vector P_i stored in the queue P for the target-domain data with index i, the classification probability vector of the view-1 data, the classification probability vector of the view-2 data, and the number of selected reliable samples, calculate the cross-view alignment loss L_cv.
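A hedged sketch of the cross-view term (the exact loss is not given in this text; here the strong-view prediction is pulled toward the probability stored in the queue for the same sample index via cross entropy, a FixMatch-style choice swapped in for illustration):

```python
import numpy as np

def cross_view_alignment_loss(p_stored, p_strong, eps=1e-12):
    """Cross entropy pulling each strong-view prediction toward the
    probability vector stored in the queue for the same sample index."""
    return float(-np.mean(np.sum(p_stored * np.log(p_strong + eps), axis=1)))

stored = np.array([[1.0, 0.0]])        # reliable sample, class 0
strong_good = np.array([[0.9, 0.1]])   # strong view still predicts class 0
strong_bad = np.array([[0.1, 0.9]])    # strong view drifted to class 1
```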
The specific operation of calculating the cross-view neighbor alignment loss in step S5 is as follows: search the queue F for the neighbors of the reliable samples, which strengthens the leading role of the reliable samples in the domain adaptation process; based on the classification probability vector of the view-1 data in the queue P, the classification probability vector of the view-2 data in the queue P, the weighting coefficient λ, and the number of selected reliable samples, calculate the cross-view neighbor alignment loss L_cvn.
The specific operation of step S6 is: linearly weight the neighbor alignment loss obtained in step S4 and the regularization loss, dispersion loss, cross-view alignment loss, and cross-view neighbor alignment loss obtained in step S5 to obtain the final objective function; iteratively back-propagate and update the parameters of the target-domain model according to this objective; the trained target-domain model serves as the image classification model for the unlabeled target-domain data.
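The linear weighting itself is trivial to sketch (the weights are hypothetical hyperparameters, not values from the patent):

```python
def total_objective(l_nc, l_self, l_div, l_cv, l_cvn,
                    weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Linearly weighted sum of the five losses used to update the
    target-domain model by back-propagation."""
    return sum(w * l for w, l in zip(weights, (l_nc, l_self, l_div, l_cv, l_cvn)))
```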
Other parts of this embodiment are the same as those of embodiment 1, and thus are not described again.
Example 3:
this embodiment will be described in detail with reference to a specific embodiment on the basis of any one of the embodiments 1 to 2, as shown in fig. 1, 3 and 4.
As shown in fig. 1, the method comprises steps 1 to 5:
step 1: acquire training data. First collect a batch of labeled source-domain image data D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s}, where x_i^s represents the i-th data sample, y_i^s represents the label of the i-th sample, n_s is the total number of source-domain data, and the label space of the source-domain data is {1, 2, ..., K}; then collect a batch of unlabeled target-domain image data.
Deep neural networks have been successful in various application areas at the cost of time-consuming and labor-intensive large-scale data annotation. However, with the explosive growth of multimedia data, manually labeling all data has become impossible. To avoid high labeling costs, unsupervised domain adaptation methods were proposed to improve a model's performance on unlabeled target-domain data by utilizing labeled source-domain data. However, conventional unsupervised domain adaptation requires simultaneous acquisition of source-domain and target-domain data, and in a real scenario the requirement to access both may not be satisfiable. The access limitation may be due to storage or data-privacy constraints, especially as the amount of stored multimedia data increases and its transmission is limited by law and by the privacy policies of data providers.
Therefore, this embodiment is dedicated to passive domain adaptation, using only target domain data for domain adaptation to complete the image classification task on the unlabeled target domain. The embodiment eliminates the storage and transmission cost of large-scale data and does not violate data privacy or legal policies. The embodiment is based on two characteristics of unlabeled target domain data: first, whether or not the target domain data are aligned with the classifier, they form clusters in the feature space; second, target domain samples with higher confidence are more reliable, and their classification confidence changes less during domain adaptation. In addition, the embodiment can handle class-imbalanced data that existing methods struggle with, namely data with large differences in sample counts between classes within the same domain, and data with large differences in sample counts for the same class across domains.
Step 2: construct a deep convolutional network model and perform the image classification task based on the labeled source domain data. The deep convolutional network consists of a feature extractor and a linear classifier; the objective function of the image classification task is to minimize cross entropy; the trained model is called the source domain model.
The feature extractor of the source domain model is denoted $F_s$ and the classifier $C_s$. The feature of a source domain sample $x^s$ is $f^s = F_s(x^s) \in \mathbb{R}^d$, and the classification probability vector of $x^s$ is $p^s = \sigma(C_s(f^s)) \in \mathbb{R}^K$, where $\sigma$ denotes the softmax normalization function and $d$ and $K$ represent the feature dimension and the number of classes, respectively. The training objective of the source domain model is to minimize the cross entropy, and therefore can be calculated by the following formula:

$$\mathcal{L}_s = -\mathbb{E}_{(x^s, y^s) \in D_s} \sum_{k=1}^{K} q_k^{ls} \log p_k^s$$
where $D_s$ represents the source domain dataset, $x^s$ and $y^s$ represent a source domain sample and its label, $q$ represents the one-hot vector of label $y^s$, $q^{ls} = (1-\alpha)q + \alpha/K$ is the vector obtained by applying label smoothing with coefficient $\alpha$ to $q$, and $q_k^{ls}$ and $p_k^s$ represent the $k$-th elements of $q^{ls}$ and $p^s$, respectively.
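As a minimal sketch of the label-smoothed cross-entropy above (a NumPy illustration with hypothetical helper names, not the patent's implementation), the smoothed target $q^{ls}$ places mass $1-\alpha$ on the true class plus a uniform $\alpha/K$ on every class:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def label_smoothing_ce(logits, labels, alpha=0.1):
    """Cross-entropy against label-smoothed one-hot targets q_ls = (1-alpha)*q + alpha/K."""
    n, k = logits.shape
    p = softmax(logits)
    q_ls = np.full((n, k), alpha / k)          # alpha/K mass spread over all classes
    q_ls[np.arange(n), labels] += 1.0 - alpha  # remaining 1-alpha on the true class
    return float(-(q_ls * np.log(p + 1e-12)).sum(axis=1).mean())
```

The loss is strictly positive even for confident correct predictions, which is the intended effect of label smoothing.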
Since, in real-world datasets, samples are not evenly distributed across classes but exhibit varying degrees of class imbalance, the above loss function inevitably biases the model toward classes with abundant data while neglecting classes with few samples. Since the preference for a class is reflected in the product of the feature norm and the norm of the classifier weight vector, we propose a simple method to solve this class imbalance problem.
The specific solution is as follows: apply L2 normalization to the features extracted by the feature extractor and to the parameter vectors of the linear classifier. The resulting classification probabilities are then not biased toward any particular class, class boundaries are divided more evenly, and the influence of class label distribution differences within the same domain, as well as between the source and target domains, is reduced. Taking the source domain data as an example, the classification probability formula is defined as follows:
$$p_k^s = \frac{\exp(\bar{w}_k^\top \bar{f}^s / \tau)}{\sum_{j=1}^{K} \exp(\bar{w}_j^\top \bar{f}^s / \tau)}$$

where $\tau$ is a scaling parameter, $w_k$ is the $k$-th parameter vector of the linear classifier, $w_{y^s}$ denotes the parameter vector corresponding to label $y^s$, $\bar{w}_k$ and $\bar{f}^s$ denote the L2-normalized versions of $w_k$ and $f^s$, and $\bar{w}_k^\top$ represents the transpose of $\bar{w}_k$.
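A small sketch of this normalized classifier (NumPy, illustrative names; the reconstruction of the formula above is an assumption where the original image was lost): with both features and weight rows L2-normalized, rescaling a feature leaves the probabilities unchanged, which is exactly how the class-norm bias is removed:

```python
import numpy as np

def cosine_probs(features, W, tau=0.05):
    """Softmax over cosine similarities between L2-normalized features
    and L2-normalized classifier weight rows, scaled by 1/tau."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = W / np.linalg.norm(W, axis=1, keepdims=True)
    logits = (f @ w.T) / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)
```

Because only directions matter, a class whose features or weight vector happen to have large norms gains no advantage over the others.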
And step 3: constructing a deep convolution network model based on the source domain model, wherein the deep convolution network model is called a target domain model; and initializing the target domain model by using the parameters of the source domain model, and training the target domain model to execute domain adaptation.
Different from existing unsupervised domain adaptive image classification methods, the invention is dedicated to passive domain adaptation, using only target domain data for domain adaptation to complete the image classification task on the unlabeled target domain. The invention eliminates the storage and transmission cost of large-scale data and does not violate data privacy or legal policies. Specifically, the newly constructed target domain model takes the pre-trained source domain model as its initialization. Similarly, the invention adopts feature L2 normalization and classifier parameter vector L2 normalization to handle class-imbalanced data that existing methods struggle with, namely data with large differences in sample counts between classes within the same domain, and data with large differences in sample counts for the same class across domains.
The design of the training target function of the target domain model is based on two characteristics of label-free target domain data: firstly, no matter whether the target domain data is aligned with the classifier or not, the target domain data can form clusters in the feature space; secondly, the target domain samples with higher confidence are more reliable, and the classification confidence of the target domain samples changes less in the field adaptation process.
And 4, step 4: performing unsupervised learning based on the non-label target domain data, and extracting the characteristics and classification probability of the target domain data; based on the characteristics of the target domain data, searching neighbor characteristics for each sample, taking the classification probability of neighbor as a supervision signal, and calculating neighbor alignment loss to keep the prediction of neighbor consistent; calculating regularization loss and dispersion loss based on the classification probability of the target domain data; dividing a target domain data set based on the classification probability of the target domain data, dividing high-confidence-degree samples into reliable sets, dividing low-confidence-degree samples into weak reliable sets, and calculating cross-view alignment loss and cross-view neighbor alignment loss based on the classification probability of the high-confidence-degree samples and the low-confidence-degree samples.
Inspired by the clustering characteristic of the target domain data described above, a straightforward way to exploit clustering in a high-dimensional space is to consider the inherent consistency between points clustered together, which means that they come from the same class with high probability. In order to obtain nearest neighbors from the whole dataset while performing mini-batch stochastic gradient optimization, the invention stores all target features $\{f_i^t\}_{i=1}^{n_t}$ and the corresponding classification probability vectors $\{p_i^t\}_{i=1}^{n_t}$ in two queues:

$$\mathcal{F} = [f_1^t, \ldots, f_{n_t}^t], \qquad \mathcal{P} = [p_1^t, \ldots, p_{n_t}^t]$$
it is worth noting that for、Andthe index of all target domain samples is the same. At the start of each iteration we update these repositories to replace the old feature and classification probability vectors with the new ones computed for the current small batch of data, without any additional computation.
Given an arbitrary target domain sample $x_i^t$, we use the cosine similarity between $f_i^t$ and the features stored in $\mathcal{F}$ to search for its $m$ nearest neighbors, denoted by the index set $N_i^m$, where the feature extractor $F_t$ of the target domain model is initialized from the source domain model $F_s$. To make better use of the manifold structure of the feature space, we also consider the supervision from the $n$ neighbors of each neighbor, whose indices form the set $E_i^n$.
In summary, the neighbor alignment loss can be formulated as follows:

$$\mathcal{L}_{nc} = -\frac{1}{n_b}\sum_{i=1}^{n_b}\left(\sum_{j \in N_i^m} P_j^\top p_i + \lambda \sum_{k \in E_i^n} P_k^\top p_i\right)$$

where $n_b$ is the mini-batch size, $p_i$ is the classification probability vector of the current mini-batch sample computed with the classifier $C_t$ of the target domain model, and $P_j$ and $P_k$ are the classification probability vectors of the neighbors stored in queue $\mathcal{P}$. $\lambda$ is a weighting coefficient, fixedly set to 0.1.
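A sketch of this neighbor alignment term (NumPy, illustrative; the exact form of the lost formula is an assumption consistent with the surrounding text): each batch prediction is pulled toward the stored predictions of its $m$ cosine-nearest neighbors and, weighted by $\lambda$, toward those of the $n$ neighbors of each neighbor:

```python
import numpy as np

def neighbor_alignment_loss(p_batch, f_batch, bank_F, bank_P, m=2, n=2, lam=0.1):
    """Dot-product agreement between batch predictions and the bank
    predictions of their neighbors (direct) and expanded neighbors."""
    Fn = bank_F / np.linalg.norm(bank_F, axis=1, keepdims=True)
    fb = f_batch / np.linalg.norm(f_batch, axis=1, keepdims=True)
    sims = fb @ Fn.T                        # cosine similarity to every bank entry
    loss = 0.0
    for i in range(len(p_batch)):
        nbrs = np.argsort(-sims[i])[:m]     # m nearest neighbors N_i^m
        loss -= (bank_P[nbrs] @ p_batch[i]).sum()
        for j in nbrs:
            ext = np.argsort(-(Fn[j] @ Fn.T))[1:n + 1]  # skip j itself: E_i^n
            loss -= lam * (bank_P[ext] @ p_batch[i]).sum()
    return loss / len(p_batch)
```

The loss is most negative when a sample's prediction agrees confidently with all of its neighbors' stored predictions.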
To alleviate the impact of noisy neighbors in $N_i^m$ and $E_i^n$, a regularization loss and a dispersion loss are calculated based on the classification probability vectors:

$$\mathcal{L}_{self} = -\frac{1}{n_b}\sum_{i=1}^{n_b} P_i^\top p_i, \qquad \mathcal{L}_{div} = \sum_{k=1}^{K} \bar{p}_k \log \bar{p}_k$$

where $\bar{p} = \frac{1}{n_b}\sum_{i=1}^{n_b} p_i$ is the mean of the classification probability vectors over the mini-batch, and $\bar{p}_k$ represents the $k$-th element of $\bar{p}$. Since the vector $P_i$ stored in queue $\mathcal{P}$ is numerically equal to the classification probability vector $p_i^t$ of sample $x_i^t$, the regularization loss $\mathcal{L}_{self}$ reinforces the certainty of the classification prediction, pushing it toward a one-hot vector, while the dispersion loss $\mathcal{L}_{div}$ prevents all samples from being classified into a few specific categories, so that the target domain model treats all categories more fairly.
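The two regularizers above can be sketched as follows (NumPy, illustrative names; the reconstructed formulas are assumptions where the original images were lost): the self term is a dot product with each sample's own stored prediction, and the dispersion term is the negative entropy of the batch-mean prediction:

```python
import numpy as np

def self_and_dispersion_loss(p_batch, P_bank_batch, eps=1e-12):
    """L_self sharpens each prediction toward its stored copy; L_div is the
    negative entropy of the mean prediction, so minimizing it spreads
    predictions across all classes."""
    l_self = -float(np.mean(np.sum(p_batch * P_bank_batch, axis=1)))
    p_mean = p_batch.mean(axis=0)
    l_div = float(np.sum(p_mean * np.log(p_mean + eps)))
    return l_self, l_div
```

$\mathcal{L}_{self}$ reaches its minimum of $-1$ when every prediction is one-hot and matches its stored copy; $\mathcal{L}_{div}$ reaches its minimum of $-\log K$ when the mean prediction is uniform over the $K$ classes.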
As mentioned above, one characteristic of the unlabeled target domain data is that unlabeled target samples with higher classification confidence are more reliable, and their classification confidence changes less during domain adaptation. Based on this characteristic, the present invention exploits the value of reliable samples by letting them play a guiding role during domain adaptation. Specifically, before each training cycle begins, our method uses the model trained in the previous stage (at initialization, the source model) to estimate the predictions of the unlabeled target data. The entire dataset is then adaptively divided into a reliable set and a weakly reliable set, where the reliable set contains the lowest-entropy r% (0 < r < 100) samples in each category, and the weakly reliable set contains the remaining samples. This partitioning is crucial for estimating the true label distribution in the target domain.
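The per-class, lowest-entropy-r% partition can be sketched as follows (NumPy, hypothetical helper name `split_reliable`, not the patent's code):

```python
import numpy as np

def split_reliable(probs, r=50.0, eps=1e-12):
    """Per pseudo-class, keep the lowest-entropy r% of samples as the
    reliable set; the rest form the weakly reliable set."""
    pseudo = probs.argmax(axis=1)                          # pseudo labels
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    reliable = []
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        k = max(1, int(round(len(idx) * r / 100.0)))
        reliable.extend(idx[np.argsort(entropy[idx])[:k]])  # lowest entropy first
    reliable = np.sort(np.array(reliable))
    weak = np.setdiff1d(np.arange(len(probs)), reliable)
    return reliable, weak
```

Selecting per class rather than globally is what keeps the reliable set from collapsing onto the easiest classes, which matters under class imbalance.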
The guiding role of reliable samples is realized by sampling a batch of reliable samples from the reliable sample set and applying two different random data augmentations to them. One augmentation mode, called weak enhancement, comprises random image flipping, translation and cropping; the other, called strong enhancement, comprises random image flipping, translation, cropping, brightness changes, partial occlusion, Gaussian blur, and the like. The data obtained from a sample after weak and strong enhancement are called view 1 and view 2, respectively. View 1 and view 2 are naturally related: they should have the same category, and their classification probability vectors should be consistent. Based on this idea, the cross-view alignment loss is calculated as follows:
$$\mathcal{L}_{cv} = -\frac{1}{n_r}\sum_{i=1}^{n_r} P_i^\top \tilde{p}_i$$

where $p_i$ is defined as before, $P_i$ is the classification probability vector of view 1 stored in queue $\mathcal{P}$, $\tilde{p}_i$ represents the classification probability vector of view 2, and $n_r$ represents the number of reliable samples in the current batch. Although the formula is similar in form to the regularization loss, the two play different key roles: a strongly enhanced sample is not necessarily near its weakly enhanced counterpart but may be scattered anywhere in the distribution of its class, so the cross-view alignment loss takes the distribution of the entire class into account globally and is unaffected by local noise. Minimizing the cross-view alignment loss lets reliable samples play a leading role, gathering the scattered samples around reliable samples of the same class and improving the classification accuracy of the weakly reliable samples.
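A minimal sketch of this term (NumPy, illustrative; the reconstructed formula above is an assumption where the original image was lost): it is a dot-product agreement between the stored weak-view predictions and the strong-view predictions of the same reliable samples:

```python
import numpy as np

def cross_view_alignment_loss(P_view1, p_view2):
    """Agreement between the stored weak-view (view 1) predictions and the
    strong-view (view 2) predictions; minimized when both views predict the
    same class confidently."""
    return -float(np.mean(np.sum(P_view1 * p_view2, axis=1)))
```

The loss reaches its minimum of $-1$ when both views give the same one-hot prediction and $0$ when they disagree completely.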
In addition, the invention further strengthens the guiding role of reliable samples by designing the cross-view neighbor alignment loss. Neighbors of the reliable samples are searched in queue $\mathcal{F}$; since a neighbor of a reliable sample is itself reliable with very high probability, neighbors that are not in the current batch can also be exploited, reinforcing the leadership of reliable samples during domain adaptation. The concrete formula is:

$$\mathcal{L}_{cvn} = -\frac{\lambda}{n_r}\sum_{i=1}^{n_r}\sum_{j \in N_i^m} P_j^\top \tilde{p}_i$$
where all symbols are as defined above.
And 5: calculating a final objective function based on the neighbor alignment loss, the regularization loss, the dispersion loss, the cross-view alignment loss, and the cross-view neighbor alignment loss:
The target domain model is updated iteratively based on the objective function, and the finally trained target domain model serves as the image classification model for the unlabeled target domain data.
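The linear weighting of the five losses can be sketched as follows (illustrative; the unit default weights are placeholders, since the patent does not fix the coefficients here):

```python
def total_objective(l_nc, l_self, l_div, l_cv, l_cvn,
                    weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Final objective: linear combination of the neighbor alignment,
    regularization, dispersion, cross-view alignment, and cross-view
    neighbor alignment losses. The weights are illustrative placeholders."""
    w = weights
    return w[0] * l_nc + w[1] * l_self + w[2] * l_div + w[3] * l_cv + w[4] * l_cvn
```

In training, this scalar would be back-propagated each iteration to update the target domain model's parameters.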
Other parts of this embodiment are the same as any of embodiments 1-2, and thus are not described again.
Example 4:
This embodiment was experimentally verified on the basis of any of embodiments 1 to 3. In this embodiment, six mainstream datasets in the domain adaptation field are used as training and testing datasets: three mainstream datasets for the passive domain adaptation task, Office, Office-Home and VisDA-C, and three mainstream datasets for the class-imbalanced passive domain adaptation task, Office-Home (RSUT), VisDA-C (RSUT), and DomainNet. Office is a small-scale dataset comprising three domains, Amazon (A), Webcam (W) and DSLR (D), with 4652 pictures in 31 classes in total. Office-Home is a medium-scale dataset comprising four domains, Artistic (Ar), Product (Pr), Clipart (Cl) and Real World (Rw), with 15500 pictures in 65 classes in total. VisDA-C is a challenging large-scale dataset whose source domain (synthetic images) and target domain (real images) contain 152k and 55k images of 12 classes, respectively.
For class-imbalanced passive domain adaptation tasks, VisDA-C (RSUT) is a class-imbalanced version of VisDA-C. The severity of class imbalance in VisDA-C (RSUT) is controlled by the imbalance factor $\rho = N_{max}/N_{min}$, where $N_{max}$ and $N_{min}$ represent the number of samples of the class with the most samples and of the class with the fewest samples, respectively. The imbalance factor $\rho$ may be 10, 50 or 100. Office-Home (RSUT) is an imbalanced version of Office-Home, where the Artistic domain is excluded because its images are too few to create an imbalanced subset. DomainNet contains four domains, Real (R), Clipart (C), Painting (P) and Sketch (S), and 40 classes. The data distributions of the three datasets VisDA-C (RSUT), Office-Home (RSUT) and DomainNet exhibit both intra-domain class imbalance and inter-domain label shift.
For fair comparison, for the passive domain adaptation task, Office and Office-Home use the conventional classification accuracy as the metric, while VisDA-C uses the per-class average classification accuracy; for the class-imbalanced passive domain adaptation task, the per-class average classification accuracy is used as the metric for all three datasets.
Further, the system claimed in this embodiment is denoted ICPR, and the compared domain adaptation methods are all widely used ones. As shown in Tables 1 to 8, ICPR exhibits consistent and significant improvements over all prior methods. Specifically, ICPR outperforms A2Net (73.1% vs. 72.8%) on the medium-scale Office-Home. On the large-scale VisDA-C, ICPR shows a significant improvement over the compared methods, up to a +3.0% absolute gain over NRC and a +1.6% absolute gain over SHOT++. From the comparison results on VisDA-C (RSUT) and DomainNet, the superiority of ICPR persists even when the training data contain severe intra-domain class imbalance and inter-domain label shift. For example, ICPR achieves +2.87% and +1.96% average classification accuracy improvements over ISFDA on VisDA-C (RSUT) and DomainNet, respectively.
The comparison on both the passive domain adaptation and class-imbalanced passive domain adaptation tasks demonstrates the effectiveness of ICPR, which is attributed to the various consistency alignment strategies and the reliable sample mining strategy proposed by the present invention.
Table 1: comparison result 1 of classification accuracy on Office-Home data set of task adaptation in passive field
Table 2: comparison result of classification accuracy on Office-Home data set of task adaptation in passive field 2
TABLE 3 comparison of results on a VisDA-C dataset for a Passive Domain Adaptation task 1
TABLE 4 comparison results on VisDA-C dataset for Passive Domain adaptive tasks 2
TABLE 5 comparison of results on a VisDA-C (RSUT) dataset for a class imbalance passive domain adaptation task 1
TABLE 6 comparison of results on a VisDA-C (RSUT) dataset for a class imbalance passive domain adaptation task 2
TABLE 7 comparison results on task-like DomainNet datasets for unbalanced-like passive domains 1
TABLE 8 comparison results on class-unbalanced passive domain task adapted DomainNet datasets 2
Other parts of this embodiment are the same as any of embodiments 1 to 3, and thus are not described again.
Example 5:
on the basis of any one of the foregoing embodiments 1 to 4, the present embodiment proposes an unsupervised domain adaptive classification system as shown in fig. 2.
The unsupervised domain adaptive classification system comprises: the imaging unit is used for acquiring image samples in different fields; the data storage unit is used for storing the image samples in the different fields; a neural network unit comprising a source domain model trained with labeled source domain data and a target domain model trained with unlabeled target domain data; the data processing unit is used for training a source domain model based on the labeled source domain data, constructing a target domain model based on the source domain model, extracting features and classification probability vectors of label-free target domain data based on the target domain model, calculating neighbor alignment loss, regular loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss based on the features and the classification probability vectors, calculating a final target function based on all the losses, updating target domain model parameters based on back propagation of target function iteration, and taking the target domain model obtained through final training as an image classification model of the label-free target domain data.
Further, the data processing unit calculates the neighbor alignment loss based on the features of the target domain data, calculates the regularization loss and the dispersion loss based on the classification probabilities of the target domain data, calculates the cross-view alignment loss and the cross-view neighbor alignment loss based on the classification probabilities of the high-confidence and low-confidence samples, linearly weights the neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss, and calculates the final objective function used to update the target domain model.
The above provides a detailed description of an unsupervised domain adaptive classification system provided by the embodiment of the present invention. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Those of skill would further appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the examples have been described in a functional general in the foregoing description for the purpose of illustrating the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Other parts of this embodiment are the same as any of embodiments 1 to 4, and thus are not described again.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and any simple modifications and equivalent variations of the above embodiment according to the technical spirit of the present invention are within the scope of the present invention.
Claims (13)
1. An unsupervised field adaptive classification method is characterized by comprising the following steps:
step S1: collecting image data with a label source domain and image data without a label target domain, and taking the collected image data as training data;
step S2: establishing a deep convolution network model comprising a feature extractor and a linear classifier, carrying out image classification on the labeled source domain image data, and training according to training data to obtain a source domain model;
and step S3: establishing a target domain model according to the source domain model, and initializing the target domain model by using the parameters of the source domain model;
and step S4: extracting the features of the image data of the target domain, searching the neighbor features of each image data according to the extracted features, and calculating the neighbor alignment loss by taking the classification probability of the neighbor features as a supervision signal;
step S5: extracting the classification probability of the target domain image data, calculating the regularization loss and the dispersion loss according to the classification probability of the target domain image data, dividing a target domain image data set according to the classification probability, dividing high-confidence image data into a reliable set, dividing low-confidence image data into a weak reliable set, and calculating the cross-view alignment loss and the cross-view neighbor alignment loss according to the classification probabilities of high-confidence samples and low-confidence samples;
step S6: and (5) according to the neighbor alignment loss obtained in the step (S4) and the regular loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss obtained in the step (S5), iteratively updating the target domain model, and taking the trained target domain model as an image classification model of the image data of the label-free target domain.
2. The unsupervised domain adaptive classification method of claim 1, characterized in that the extraction manner of the features and classification probabilities for extracting the target domain image data is: establishing a feature queue F and a classification probability vector queue P with fixed capacity, randomly sampling image data in each training iteration of a target domain model, calculating new features and new classification probabilities of the sampled image data, and putting the new features and the new classification probabilities into corresponding positions of the queues according to indexes of the sampled image data.
3. The unsupervised domain adaptive classification method according to claim 2, wherein the specific operation of calculating the neighbor alignment loss in the step S4 is: randomly selecting an image data sample x_i^t in the target domain; based on the feature f_i^t of the image data sample x_i^t, searching the feature queue F for the m nearest cosine neighbors of the data sample x_i^t, with the index set N_i^m storing the neighbors; and calculating the neighbor alignment loss L_nc from the classification probability vector p_i^t of the target domain data sample x_i^t, a weighting coefficient λ, the number n_b of randomly sampled target domain data samples, the n neighbors of each neighbor, and P_j and P_k in the queue P.
4. The unsupervised domain adaptive classification method according to claim 3, wherein the specific operation of calculating the regularization loss in the step S5 is: minimizing the discrepancy between the classification probability p_i^t of a randomly selected image data sample x_i^t and the classification probability P_i of p_i^t in the queue P, and calculating the regularization loss L_self according to the number n_b of randomly sampled image data samples and the minimized target domain data samples x_i^t.
5. The unsupervised domain adaptive classification method according to claim 4, wherein the specific operation of calculating the dispersion loss in the step S5 is: computing the average classification probability vector of randomly selected image data samples x_i^t, increasing the entropy of the average classification probability vector, and calculating the dispersion loss L_div according to the average classification probability after the entropy increase.
6. The unsupervised domain adaptive classification method of claim 1, wherein the operation of dividing the target domain image data set according to the classification probability in step S5 is to: before traversing the whole target domain image data set every time, firstly calculating a classification probability vector and a pseudo label of each image data according to a target domain model, secondly dividing the image data into sets corresponding to various categories according to the pseudo labels, then calculating the information entropy of the classification probability vector, selecting the first r% low-entropy image data in each category set as high-confidence-degree samples, taking a set formed by the high-confidence-degree samples as a reliable sample set, and taking a set formed by the rest samples as a weak-reliability sample set.
7. The unsupervised domain adaptive classification method according to claim 1, wherein the specific operation of calculating the cross-view alignment loss in the step S5 is: randomly selecting reliable samples from the reliable sample set, applying weak enhancement and strong enhancement to the reliable samples to obtain view 1 data and view 2 data, obtaining the classification probability vector p_i corresponding to the target domain data with index i, and calculating the cross-view alignment loss L_cv from the classification probability vector of the view 1 data in the queue P, the classification probability vector of view 2, and the number of selected reliable samples.
8. The unsupervised domain adaptive classification method according to claim 7, wherein the specific operation of calculating the cross-view neighbor alignment loss in the step S5 is as follows: searching the queue F for neighbors of the reliable samples to strengthen the leadership of the reliable samples in the domain adaptation process, and calculating the cross-view neighbor alignment loss L_cvn according to the classification probability vector of the view 1 data in the queue P, the classification probability vector of view 2, the weighting coefficient λ, and the number of selected reliable samples.
9. The unsupervised domain adaptive classification method according to claim 1, wherein the step S6 specifically operates as: linearly weighting the neighbor alignment loss obtained in the step S4 and the regularization loss, the dispersion loss, the cross-view alignment loss and the cross-view neighbor alignment loss obtained in the step S5 to obtain a final objective function; iteratively updating the target domain model parameters by back propagation according to the final objective function; and taking the trained target domain model as the image classification model for the unlabeled target domain data.
10. The unsupervised domain adaptive classification method according to claim 1, wherein the features extracted by the feature extractor in step S2 are subjected to L2 regularization, and the parameter vectors of the linear classifier are subjected to L2 regularization.
11. An unsupervised field adaptive classification system is characterized by comprising an imaging unit, a data storage unit, a neural network unit and a data processing unit;
the imaging unit is used for acquiring image samples in different fields;
the data storage unit is used for storing the image samples of the different fields;
the neural network unit comprises a source domain model trained by the labeled source domain data and a target domain model trained by the unlabeled target domain data;
the data processing unit trains a source domain model based on the labeled source domain data, constructs a target domain model based on the source domain model, extracts features and classification probability vectors of label-free target domain data based on the target domain model, calculates neighbor alignment loss, regularization loss, dispersion loss, cross-view alignment loss and cross-view neighbor alignment loss based on the features and the classification probability vectors, calculates a final target function based on all losses, iteratively and reversely propagates and updates target domain model parameters based on the target function, and takes the target domain model obtained based on final training as an image classification model of the label-free target domain data.
12. An apparatus comprising a processor, a memory; the memory for storing a computer program;
the processor, configured to implement the unsupervised domain-adapted classification method according to any of claims 1-10 when executing the computer program.
13. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the unsupervised domain-adapted classification method of any of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211528066.9A CN115546567B (en) | 2022-12-01 | 2022-12-01 | Unsupervised domain adaptive classification method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211528066.9A CN115546567B (en) | 2022-12-01 | 2022-12-01 | Unsupervised domain adaptive classification method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115546567A true CN115546567A (en) | 2022-12-30 |
CN115546567B CN115546567B (en) | 2023-04-28 |
Family
ID=84722064
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211528066.9A Active CN115546567B (en) | 2022-12-01 | 2022-12-01 | Unsupervised domain adaptive classification method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115546567B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109858557A (en) * | 2019-02-13 | 2019-06-07 | 安徽大学 | A kind of new hyperspectral image data semisupervised classification algorithm |
CN110717526A (en) * | 2019-09-23 | 2020-01-21 | 华南理工大学 | Unsupervised transfer learning method based on graph convolution network |
CN110837850A (en) * | 2019-10-23 | 2020-02-25 | 浙江大学 | Unsupervised domain adaptation method based on counterstudy loss function |
CN113553906A (en) * | 2021-06-16 | 2021-10-26 | 之江实验室 | Method for discriminating unsupervised cross-domain pedestrian re-identification based on class center domain alignment |
CN113806527A (en) * | 2020-06-16 | 2021-12-17 | 百度(美国)有限责任公司 | Cross-language unsupervised classification with multi-view migration learning |
CN114332568A (en) * | 2022-03-16 | 2022-04-12 | 中国科学技术大学 | Training method, system, equipment and storage medium of domain adaptive image classification network |
CN114863175A (en) * | 2022-05-10 | 2022-08-05 | 南京信息工程大学 | Unsupervised multi-source partial domain adaptive image classification method |
Non-Patent Citations (5)
Title |
---|
JING WANG et al.: "Unsupervised Domain Adaptation Learning Algorithm for RGB-D Stairway Recognition", Instrumentation *
FENG Wenwen et al.: "Face fatigue detection based on an improved deep convolutional neural network", Science Technology and Engineering *
WU Zirui, YANG Zhimeng, PU Xiaorong, XU Jie, CAO Sheng, REN Yazhou: "A feature-generation-oriented unsupervised domain adaptation algorithm", Journal of University of Electronic Science and Technology of China *
XIA Jingfan et al.: "SAR image classification algorithm based on region filtering and deep belief networks", Journal of Hefei University of Technology (Natural Science) *
LI Jingjing, MENG Lichao, ZHANG Ke, LU Ke, ZAO Hengtao: "A survey of domain adaptation research", Computer Engineering *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kim et al. | Attract, perturb, and explore: Learning a feature alignment network for semi-supervised domain adaptation | |
Liu et al. | Incdet: In defense of elastic weight consolidation for incremental object detection | |
Jin et al. | Empowering graph representation learning with test-time graph transformation | |
Chang et al. | Provable benefits of overparameterization in model compression: From double descent to pruning neural networks | |
CN113326731B (en) | Cross-domain pedestrian re-identification method based on momentum network guidance | |
CN110019889A (en) | Training characteristics extract model and calculate the method and relevant apparatus of picture and query word relative coefficient | |
CN111047054A (en) | Two-stage countermeasure knowledge migration-based countermeasure sample defense method | |
CN112580728B (en) | Dynamic link prediction model robustness enhancement method based on reinforcement learning | |
Shi et al. | Multi-objective neural architecture search via predictive network performance optimization | |
CN114708479B (en) | Self-adaptive defense method based on graph structure and characteristics | |
Yu et al. | Deep metric learning with dynamic margin hard sampling loss for face verification | |
CN114444605B (en) | Unsupervised domain adaptation method based on double unbalanced scene | |
CN116451111A (en) | Robust cross-domain self-adaptive classification method based on denoising contrast learning | |
Wang et al. | Refining pseudo labels for unsupervised domain adaptive re-identification | |
Nai et al. | Learning channel-aware correlation filters for robust object tracking | |
CN115293235A (en) | Method for establishing risk identification model and corresponding device | |
CN116051924B (en) | Divide-and-conquer defense method for image countermeasure sample | |
Zheng et al. | Edge-labeling based modified gated graph network for few-shot learning | |
Xu et al. | Semi-supervised self-growing generative adversarial networks for image recognition | |
CN115546567A (en) | 2022-12-30 | Unsupervised domain adaptive classification method, system, equipment and storage medium | |
Zheng et al. | Query attack via opposite-direction feature: Towards robust image retrieval | |
Hu et al. | Data-free dense depth distillation | |
Sivananthan | Manifold regularization based on nyström type subsampling | |
Sujatha et al. | Grey wolf optimiser-based feature selection for feature-level multi-focus image fusion | |
CN110704575B (en) | Dynamic self-adaptive binary hierarchical vocabulary tree image retrieval method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |